Evaluating the Performance of Supervised Classification Models: Decision Tree and Naïve Bayes Using KNIME

Syed Muzamil Basha; Dharmendra Singh Rajput; Ravi Kumar Poluru; S. Bharath Bhushan; Shaik Abdul Khalandar Basha

doi:10.14419/ijet.v7i4.5.20079

Received date: September 22, 2018

Accepted date: September 22, 2018

Published date: September 22, 2018

Keywords:

Classification Accuracy, Decision Tree, Error Rate, F-measure, KNIME Analytics Platform, Naïve Bayes, Precision, Recall.

Abstract

The classification task is to predict the value of a target variable from the values of the input variables. If the target is provided as part of the dataset, classification is a supervised task. It is important to analyze the performance of supervised classification models before using them for classification. In this research, we propose a novel way to evaluate the performance of supervised classification models such as Decision Tree and Naïve Bayes using the KNIME Analytics Platform. Experiments are conducted on a multivariate dataset of 58,000 instances and 9 columns, intended specifically for classification and collected from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets/statlog+(shuttle)). The performance of both models is compared in terms of Classification Accuracy (CA) and Error Rate, and both models are then validated using the metrics precision, recall, and F-measure. We found that the Decision Tree achieves a CA of 99.465%, whereas Naïve Bayes attains a CA of 90.358%. The F-measure of the Decision Tree is 0.984, whereas that of Naïve Bayes is 0.7045.
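The metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the KNIME workflow used in the study: the function name `evaluate`, the toy labels, and the choice of macro averaging over classes are assumptions made for illustration, since the abstract does not state how the per-class values were combined.

```python
from collections import Counter


def evaluate(y_true, y_pred):
    """Compute CA, error rate, and macro-averaged precision, recall, F-measure.

    Macro averaging (unweighted mean over classes) is an assumption here;
    the abstract does not say which averaging scheme was used.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for actual, predicted in zip(y_true, y_pred):
        if actual == predicted:
            tp[actual] += 1      # correct prediction for this class
        else:
            fp[predicted] += 1   # predicted class gains a false positive
            fn[actual] += 1      # actual class gains a false negative

    def ratio(num, den):
        # Guard against classes that were never predicted or never occur.
        return num / den if den else 0.0

    precision = {c: ratio(tp[c], tp[c] + fp[c]) for c in labels}
    recall = {c: ratio(tp[c], tp[c] + fn[c]) for c in labels}
    f_measure = {
        c: ratio(2 * precision[c] * recall[c], precision[c] + recall[c])
        for c in labels
    }

    accuracy = sum(tp.values()) / len(y_true)
    n = len(labels)
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": sum(precision.values()) / n,
        "recall": sum(recall.values()) / n,
        "f_measure": sum(f_measure.values()) / n,
    }


# Hypothetical toy labels (not taken from the Shuttle dataset).
y_true = [1, 1, 1, 1, 4, 4, 4, 5, 5, 5]
y_pred = [1, 1, 1, 4, 4, 4, 5, 5, 5, 5]
metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions the error rate is the complement of CA, so a CA of 99.465% corresponds to an error rate of 0.535%, and a CA of 90.358% to an error rate of 9.642%.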

