Metamorphosis of data (small to big) and the comparative study of techniques (HADOOP, HIVE and PIG) to handle big data

  • Authors

    • Rapinder Kaur Chandigarh University
    • Vaishali Chauhan Chandigarh University
    • Urvashi Mittal Chandigarh University
    2018-05-03
    https://doi.org/10.14419/ijet.v7i2.27.11206
  • Big Data, Hadoop, Hive, MapReduce, Pig.
  • An immense amount of data is generated every day across the world from a wide variety of sources and fields, which creates problems for users. Because of this rapid growth, analysing big data with traditional data processing techniques has become a critical challenge. Structured data is not the only concern: unstructured and semi-structured data add further difficulties in handling this voluminous data. Hidden in this bulk of data is highly valuable information that can benefit individuals, groups or organizations and support more sophisticated and valuable decisions. To deal with this, many new tools and techniques have been devised. These tools can analyse the large volumes of data being generated at unprecedented speed. This paper presents a comparative study of some data analytics techniques that can resolve big data analytics issues by examining the data in a more precise manner. The comparison of Hadoop, Hive and Pig is illustrated, covering the working of these techniques.

  • How to Cite

    Kaur, R., Chauhan, V., & Mittal, U. (2018). Metamorphosis of data (small to big) and the comparative study of techniques (HADOOP, HIVE and PIG) to handle big data. International Journal of Engineering & Technology, 7(2.27), 1-6. https://doi.org/10.14419/ijet.v7i2.27.11206