Hybrid Approach Using Fuzzy Logic and MapReduce to Achieve Meaningful Used Big Data

  • Authors

    • Ikhlas Almukahel, Isra University
    • Wael Alzyadat, Isra University
    • Mohamad Alfayomi, Isra University
    2019-07-22
    https://doi.org/10.14419/ijet.v7i4.28772
  • Big Data, MapReduce, Meaningful, Predictive, Fuzzy Logic Controller.
  • Big data poses challenges on several fronts, commonly characterized by the four Vs: volume, velocity, variety, and value. Preprocessing and analysis are essential for extracting quality information that yields accurate values and supports correct decision making. The quality-data taxonomy points to two basic requirements: data must be meaningful and predictive. Accordingly, a hybrid approach combining fuzzy logic and MapReduce is used to produce a new version of MapReduce that consists of four layers. The first layer performs data collection. The second layer preprocesses the data: semi-structured data is cleaned and passed to the map function to extract relationships. The third layer applies a fuzzy controller together with classification to generate rules. The fourth and final layer carries out data reduction and classification to achieve a meaningful and predictive outcome. The results show the efficiency of the approach, with sensitivity = 80%, specificity = 86%, and F-measure = 2.5, validated using the TREC conference website. The hybrid approach addresses the 4Vs to achieve meaningful data, which helps doctors make the right decisions.
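The layered flow described in the abstract can be illustrated with a small, purely hypothetical Python sketch. It is not the authors' implementation: the fuzzy sets (`low`, `medium`, `high`), the triangular membership function, the assumption that readings are normalized to [0, 1], and all function names are illustrative choices made here. The sketch cleans raw records, maps each value to its best-matching fuzzy label, and reduces the labels to a majority label per key.

```python
from collections import defaultdict


def triangular(x, a, b, c):
    """Triangular membership: 0 outside (a, c), rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets over a reading assumed to be normalized to [0, 1].
FUZZY_SETS = {
    "low": (-0.5, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.5),
}


def preprocess(records):
    """Preprocessing layer: drop records whose value is missing or non-numeric."""
    for key, value in records:
        try:
            yield key, float(value)
        except (TypeError, ValueError):
            continue


def map_fuzzy(key, value):
    """Fuzzy layer: emit (key, label) for the set with the highest membership."""
    label = max(FUZZY_SETS, key=lambda name: triangular(value, *FUZZY_SETS[name]))
    return key, label


def reduce_labels(pairs):
    """Reduction layer: keep the most frequent label for each key."""
    counts = defaultdict(lambda: defaultdict(int))
    for key, label in pairs:
        counts[key][label] += 1
    return {key: max(c, key=c.get) for key, c in counts.items()}


def run(records):
    """Chain the layers: clean, map to fuzzy labels, reduce per key."""
    return reduce_labels(map_fuzzy(k, v) for k, v in preprocess(records))
```

Calling `run` on a list of `(key, raw_value)` pairs returns one label per key; keys whose values are all unusable are dropped during preprocessing. In a real MapReduce deployment the map and reduce steps would run in parallel across partitions, which this single-process sketch does not attempt to show.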

     

     

  • How to Cite

    Almukahel, I., Alzyadat, W., & Alfayomi, M. (2019). Hybrid Approach Using Fuzzy Logic and MapReduce to Achieve Meaningful Used Big Data. International Journal of Engineering & Technology, 7(4), 6997-7001. https://doi.org/10.14419/ijet.v7i4.28772