Implementation of cost effective hierarchical Hadoop cluster–a case study for education

  • Authors

    • N S. Kalyan Chakravarthy
    • N Sudhakar
    • E Srinivasa Reddy
    2018-04-20
    https://doi.org/10.14419/ijet.v7i2.21.12174
  • Keywords: Hadoop, cluster, map reduce processing, Ed-Media.
  • Abstract: The goal is to equip the younger generation of the province with computing skills and to give them access to a wide variety of modern educational resources, such as multimedia educational content, in government schools and colleges, setting a strong foundation at an early stage of their education. Educational media help empower educational institutions by changing how Information Communication Technology (ICT) is used. To put this into practice, several challenges need to be addressed. It requires a scalable technological architecture and algorithms to form a cluster that shares resources such as CPU and memory effectively among multiple nodes, together with tools to monitor, assess and evaluate data under a hierarchical Hadoop cluster. Periodic reports will be generated from analysis of text, audio and video information to assist students, teachers and the government. This calls for a software framework that can process big data stored in hierarchical nodes. Because the architecture of Hadoop, an open-source software framework, does not support processing data stored in hierarchical nodes, this case study proposes a hierarchical Hadoop cluster to change the way ICT is used. The proposed work helps monitor and report ICT usage and also acts as a help desk to address the issues of educational institutions. It thereby establishes a novel communication medium by generating analysis-based reports on text, audio and video information.

     

     

  • How to Cite

    S. Kalyan Chakravarthy, N., Sudhakar, N., & Srinivasa Reddy, E. (2018). Implementation of cost effective hierarchical Hadoop cluster–a case study for education. International Journal of Engineering & Technology, 7(2.21), 210-216. https://doi.org/10.14419/ijet.v7i2.21.12174