An improved Hadoop load rebalancer
https://doi.org/10.14419/ijet.v7i2.27.11775
Received date: April 20, 2018
Accepted date: May 28, 2018
Published date: August 6, 2018
Keywords: HDFS, Load Balancer, Rebalance, Scheduler, Spark, YARN
Abstract
Hadoop has gained an important place in the market as a result of the rapid growth of data. Load rebalancing in Hadoop is a major concern because of the unpredictable nature of tasks, the addition of new nodes to a cluster, and differences in node computing capacity. An efficient load rebalancer can improve performance and reduce computation time. The terms load rebalancer and scheduler are used interchangeably in many cases. This paper explores how load balancers and schedulers work in native Hadoop, and includes insights from prior works that identify and address problems with schedulers and rebalancers. The proposed Improved Hadoop Load Rebalancer moves a task to a node that holds a replica, is faster, and is topologically closer, which reduces network congestion and the execution time of Hadoop.
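The selection strategy described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` fields, the `pick_target` helper, and the choice to rank by topological distance first and by speed second are all assumptions made for illustration. The distance function follows the usual Hadoop convention of counting hops from each node up to the closest common ancestor in the topology path (0 for the same node, 2 for the same rack, 4 across racks).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    name: str       # hypothetical node identifier
    location: str   # hypothetical topology path, e.g. "/rack1/n1"
    speed: float    # hypothetical relative computing capacity (higher is faster)


def topology_distance(a: str, b: str) -> int:
    """Sum of hops from each path up to their closest common ancestor."""
    pa = a.strip("/").split("/")
    pb = b.strip("/").split("/")
    common = 0
    for x, y in zip(pa, pb):
        if x != y:
            break
        common += 1
    return (len(pa) - common) + (len(pb) - common)


def pick_target(source: Node, replica_holders: List[Node]) -> Optional[Node]:
    """Pick a replica-holding node to receive a task moved off `source`.

    Assumed ranking: closest in the topology first, then fastest.
    Returns None when no other node holds a replica.
    """
    candidates = [n for n in replica_holders if n.name != source.name]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda n: (topology_distance(source.location, n.location), -n.speed),
    )


if __name__ == "__main__":
    overloaded = Node("n1", "/rack1/n1", 1.0)
    holders = [
        Node("n2", "/rack1/n2", 1.0),
        Node("n3", "/rack1/n3", 2.0),
        Node("n4", "/rack2/n4", 5.0),
    ]
    print(pick_target(overloaded, holders).name)
```

With these example values, the two same-rack nodes tie on distance and the faster one is chosen. A real rebalancer would also need to weigh current load on each candidate, which this sketch leaves out.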
How to Cite
J, G., Bhaskar N, U., & Reddy P, C. (2018). An improved hadoop load rebalancer. International Journal of Engineering and Technology, 7(2.27), 109-112. https://doi.org/10.14419/ijet.v7i2.27.11775
