Performance Evaluation of Hadoop in Cloud for Big Data

Mohammed Fakherldin; Ibrahim Aaker Targio Hashem; Abdullah Alzuabi; Faiz Alotaibi

doi:10.14419/ijet.v7i4.15.21363


2018-10-07

Keywords: Cloud computing, Hadoop, MapReduce.
Recent trends in big data show that the volume of data continues to grow at an exponential rate. Over the past few years, this growth has prompted many researchers to explore new research directions across multiple areas of big data. Hadoop is one of the most popular platforms for big data: the Hadoop Distributed File System (HDFS) stores the data, and Hadoop MapReduce processes it. Cloud computing, meanwhile, is considered an excellent candidate for storing and processing big data. However, processing big data across multiple nodes is a challenging task, and the problem becomes more complex when virtualized clusters in a cloud environment execute a large number of tasks. This paper reviews and analyzes the impact of using a physical cluster versus a cloud cluster to process a large amount of data, comparing the two in terms of execution time and cost. The results indicate that using cloud virtual machines made better use of the host computer's resources.
How to Cite
Fakherldin, M., Aaker Targio Hashem, I., Alzuabi, A., & Alotaibi, F. (2018). Performance Evaluation of Hadoop in Cloud for Big Data. International Journal of Engineering & Technology, 7(4.15), 16-18. https://doi.org/10.14419/ijet.v7i4.15.21363