Comprehensive study on ensemble classification for medical applications

  • Authors

    • Rosaida Rosly
    • Mokhairi Makhtar
    • Mohd Khalid Awang
    • Mohd Isa Awang
    • Mohd Nordin Abdul Rahman
    • Hairulnizam Mahdin
    2018-04-06
    https://doi.org/10.14419/ijet.v7i2.14.12822
  • Classification, Single Classification, Ensemble Methods, Medical Application.
  • This paper aims to provide a comprehensive review of classification techniques and their alternative approaches in data mining. Classification is a data mining technique that assigns categories to a collection of data to aid more accurate prediction and analysis. It is one of several methods intended to make the analysis of very large datasets effective. The goal of classification is to accurately predict the target class for each case in the data. One classification approach is the ensemble method. In recent years, the use of ensemble methods in medical applications has been increasing. Beyond medicine, ensemble methods can also help researchers solve modern problems in fields such as machine learning, data mining, and other related areas.
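
As a rough illustration of the ensemble idea summarized in the abstract, the sketch below combines several base classifiers by majority vote. It is not taken from the paper: the three threshold rules, the feature names (`glucose`, `bmi`, `age`), and the cut-off values are hypothetical stand-ins for trained classifiers and carry no clinical meaning.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted most often by the base classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical base classifiers: simple threshold rules standing in for
# trained models. The thresholds are illustrative assumptions only.
def rule_glucose(case):
    return "positive" if case["glucose"] > 125 else "negative"


def rule_bmi(case):
    return "positive" if case["bmi"] > 30 else "negative"


def rule_age(case):
    return "positive" if case["age"] > 50 else "negative"


def ensemble_predict(case, classifiers):
    """Predict the target class of one case by majority vote."""
    return majority_vote([clf(case) for clf in classifiers])


classifiers = [rule_glucose, rule_bmi, rule_age]

# Two of the three rules vote "positive" for this made-up case.
case_a = {"glucose": 140, "bmi": 27, "age": 61}
# Only one rule votes "positive" for this one.
case_b = {"glucose": 100, "bmi": 32, "age": 40}

print(ensemble_predict(case_a, classifiers))
print(ensemble_predict(case_b, classifiers))
```

With an odd number of base classifiers and two classes, a vote can never tie. Other configurations would need an explicit tie-breaking rule, and weighted voting or stacking are common alternatives to the plain majority vote used here.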

     

     

  • How to Cite

    Rosly, R., Makhtar, M., Khalid Awang, M., Isa Awang, M., Nordin Abdul Rahman, M., & Mahdin, H. (2018). Comprehensive study on ensemble classification for medical applications. International Journal of Engineering & Technology, 7(2.14), 186-190. https://doi.org/10.14419/ijet.v7i2.14.12822