A New Diversity Technique for Imbalance Learning Ensembles

Hartono; Opim Salim Sitompul; Erna Budhiarti Nababan; Tulus; Dahlan Abdullah; Ansari Saleh Ahmar

doi:10.14419/ijet.v7i2.11251

About this article

DOI: https://doi.org/10.14419/ijet.v7i2.11251

Received: 07-04-2018

Accepted: 07-04-2018

Published: 08-04-2018

Keywords:

Class Imbalance, Classifier Ensembles, Data Diversity, Hybrid Approach Redefinition

Abstract

Data mining and machine learning techniques for classification generally assume a balanced class distribution. In practice, however, datasets often contain one class represented by a large number of instances while other classes have far fewer. This is known as the class imbalance problem. Classifier ensembles are often used to overcome class imbalance, and data diversity is one of the cornerstones of ensembles: an ideal ensemble should consist of accurate individual classifiers whose errors, when they occur, fall on different objects or instances. This research presents an overview and an experimental study of the Hybrid Approach Redefinition (HAR) method for handling class imbalance while also aiming for better data diversity. The experiments use 6 datasets with different imbalance ratios, and HAR is compared with SMOTEBoost, a re-weighting method often used to handle class imbalance. The study shows that data diversity is related to performance in imbalance learning ensembles and that the proposed method can obtain better data diversity.
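The notions of imbalance ratio and ensemble diversity used in the abstract can be illustrated with a minimal Python sketch. The abstract does not state which diversity measure the study uses, so the two pairwise measures below (Yule's Q-statistic and the disagreement measure) are an assumption chosen because they are standard in the ensemble literature. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def _contingency(y_true, pred_a, pred_b):
    """Count instances by whether classifiers A and B are correct on them."""
    n11 = n00 = n10 = n01 = 0
    for t, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == t), (b == t)
        if a_ok and b_ok:
            n11 += 1      # both correct
        elif not a_ok and not b_ok:
            n00 += 1      # both wrong
        elif a_ok:
            n10 += 1      # only A correct
        else:
            n01 += 1      # only B correct
    return n11, n00, n10, n01


def q_statistic(y_true, pred_a, pred_b):
    """Yule's Q in [-1, 1]; lower values mean errors fall on different instances."""
    n11, n00, n10, n01 = _contingency(y_true, pred_a, pred_b)
    den = n11 * n00 + n01 * n10
    # Q is undefined when the denominator is zero; returning NaN is a choice made here.
    return (n11 * n00 - n01 * n10) / den if den else float("nan")


def disagreement(y_true, pred_a, pred_b):
    """Fraction of instances on which exactly one of the two classifiers is correct."""
    n11, n00, n10, n01 = _contingency(y_true, pred_a, pred_b)
    return (n10 + n01) / (n11 + n00 + n10 + n01)


# Toy data (hypothetical): six majority-class and two minority-class instances.
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
pred_a = [0, 0, 0, 0, 0, 1, 1, 0]   # A is wrong on instances 5 and 7
pred_b = [0, 0, 0, 0, 1, 0, 0, 1]   # B is wrong on instances 4 and 6

print(imbalance_ratio(y_true))               # 3.0
print(q_statistic(y_true, pred_a, pred_b))   # -1.0
print(disagreement(y_true, pred_a, pred_b))  # 0.5
```

In this toy case the two classifiers are equally accurate but never err on the same instance, which is the situation the abstract describes as ideal; comparing a classifier with itself gives Q = 1.0 and a disagreement of 0.0.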

How to Cite

Hartono, Sitompul, O. S., Nababan, E. B., Tulus, Abdullah, D., & Ahmar, A. S. (2018). A New Diversity Technique for Imbalance Learning Ensembles. International Journal of Engineering and Technology, 7(2), 478-483. https://doi.org/10.14419/ijet.v7i2.11251