Effective classification of diabetes using big data analytics

 
 
 
  • Abstract


    Diabetes Mellitus (DM) is a non-communicable disease that affects a large number of people in India. According to a recent survey, diabetes mellitus ranks fourth in the world, with India alone accounting for around 50 million cases. Diabetes mellitus is classified as Type 1 or Type 2. The disease may persist for decades and lead to chronic complications such as foot ulceration, neuropathy, retinopathy and nephropathy. Hospitals produce huge amounts of patient data, which are stored in databases in structured or unstructured form. This data must be analyzed with automated tools to extract knowledge that can be used to classify a patient's diabetic data and provide appropriate treatment at an early stage, thereby helping to improve the standard of health care in India. Existing systems for analyzing diabetes data are slow, inaccurate, and unable to handle large amounts of data. To overcome these drawbacks, this paper proposes an automated method that handles large amounts of diabetes data and classifies it as Type 1 or Type 2. The proposed method uses a Hadoop environment coupled with the MapReduce technique to handle large amounts of data. A Support Vector Machine (SVM) algorithm is used to classify each record as Type 1, Type 2, or Normal. The experiment is carried out on data ranging from 100 MB to 2 GB. Once the data has been classified, similar data can be retrieved from the hospital database. Based on this result, effective treatment can be provided to the patient.
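
    The combination of a map/reduce data flow with SVM classification described in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: the abstract does not specify the features, the SVM kernel, or the Hadoop job layout, so everything below is an assumption made for illustration. The two numeric features and the cluster centres are synthetic, the function names (`train_binary_svm`, `map_partition`, `reduce_counts`, and so on) are hypothetical, the SVM is a linear one-vs-rest model fitted with Pegasos-style stochastic subgradient descent, and the "splits" are plain Python lists standing in for HDFS input splits.

```python
import random
from collections import Counter
from functools import reduce

LABELS = ("Normal", "Type 1", "Type 2")


def train_binary_svm(X, y, lam=0.01, epochs=100, seed=0):
    """Linear soft-margin SVM for labels in {-1, +1}, fitted by
    Pegasos-style stochastic subgradient descent on the hinge loss."""
    rng = random.Random(seed)
    w, b, t = [0.0] * len(X[0]), 0.0, 0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink: L2 regularizer
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def train_one_vs_rest(X, labels):
    """Train one binary SVM per class (that class against all others)."""
    return {c: train_binary_svm(X, [1 if l == c else -1 for l in labels])
            for c in LABELS}


def classify(models, x):
    """Assign the class whose SVM gives the largest decision score."""
    def score(c):
        w, b = models[c]
        return sum(wj * xj for wj, xj in zip(w, x)) + b
    return max(LABELS, key=score)


def map_partition(models, partition):
    """Map step: classify every record of one split and count labels locally."""
    return Counter(classify(models, x) for x in partition)


def reduce_counts(a, b):
    """Reduce step: merge the label counts of two splits."""
    return a + b


def make_toy_data(n_per_class=20, seed=1):
    """Synthetic 2-feature records; the centres are arbitrary, not clinical."""
    rng = random.Random(seed)
    centers = {"Normal": (0.0, 0.0), "Type 1": (4.0, 0.0), "Type 2": (2.0, 4.0)}
    X, labels = [], []
    for c, (cx, cy) in centers.items():
        for _ in range(n_per_class):
            X.append([cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5)])
            labels.append(c)
    return X, labels


if __name__ == "__main__":
    X, labels = make_toy_data()
    models = train_one_vs_rest(X, labels)
    splits = [X[i::3] for i in range(3)]  # stand-in for three input splits
    mapped = (map_partition(models, s) for s in splits)
    totals = reduce(reduce_counts, mapped, Counter())
    print(dict(totals))
```

    In an actual Hadoop deployment the map and reduce steps would run as separate tasks over distributed input splits (for example as a Java MapReduce job or through Hadoop Streaming), and the classifier would be trained on real, validated patient attributes. The sketch only shows how per-split classification and a count-merging reducer fit together.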



  • Keywords


    Diabetes Mellitus; Support Vector Machine; Hadoop; MapReduce.



 

Article ID: 14453
 
DOI: 10.14419/ijet.v7i4.14453




Copyright © 2012-2015 Science Publishing Corporation Inc. All rights reserved.