Predicting Customer Churn in Telecom Sector based on Penalization Techniques and Ensemble Machine Learning

Asia Mahdi Naser; Eman al-shamery

doi:10.14419/ijet.v7i4.19.27977

Authors

Asia Mahdi Naser
Eman al-shamery

Received date: February 26, 2019

Accepted date: February 26, 2019

Published date: November 27, 2018

DOI:

https://doi.org/10.14419/ijet.v7i4.19.27977

Keywords:

Customer Churn Prediction, Random Forests, Ensemble Machine Learning, Weighted accuracy and diversity, Telecommunication Industry, Boosting, Penalization Method, Regularization Techniques.

Abstract

A Customer Churn Prediction (CCP) model aims to detect customers with a high propensity to leave. This research targets the data of a large-scale telecommunication company and aims to identify potential churners. Missing values are handled with the Predictive Mean Matching (PMM) algorithm rather than by removing features or observations with a high proportion of missing data.
The first ensemble machine learning classifier combines an ensemble learner based on the Generalized Linear Model (GLM) with the predictions of a tree-based model, a Random Forest classifier, so that the combination can be investigated and compared. The proposed CCP model uses the Weighted Accuracy and Diversity (WAD) algorithm to find the optimal weights for this ensemble classifier.
The second ensemble learner, based on the generalized linear model, combines penalized methods (Ridge, Lasso, and Elastic Net) with Logistic Regression on the binomial family. Values generated randomly in [0, 1] serve as the weights for this classifier. The weights are selected on the principle that higher weights are assigned to better-performing classifiers, to ensure the highest accuracy of the churn prediction model. 10-fold Cross-Validation (CV), repeated five times, is used to search efficiently and automatically for the optimal value of the lambda (λ) parameter of the penalization methods.
The two ensemble classifiers are incorporated into a customer churn prediction model that handles a large-scale dataset, time-dependent feature labels, and an imbalanced data distribution in the telecommunication industry.
Experimental results show an increase in predictive performance. In addition, the results show that ensemble learning brings a significant improvement over the individual base learners in terms of performance indicators such as Area Under the Curve (AUC), sensitivity, specificity, accuracy, and Mean Squared Error (MSE); of these, accuracy is the best candidate for churn prediction tasks.
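
The weighting principle described for the second ensemble learner can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the accuracy values, and the churn probabilities below are hypothetical, and normalizing the weights to sum to 1 is an added assumption. The sketch draws one random weight in [0, 1] per base classifier, gives the larger weights to the more accurate classifiers, and averages the predicted churn probabilities.

```python
import random


def assign_weights(accuracies, seed=0):
    """Draw one random weight in [0, 1] per classifier and give the larger
    weights to the more accurate classifiers (hypothetical helper).
    Normalizing the weights to sum to 1 is an assumption of this sketch."""
    rng = random.Random(seed)
    draws = sorted((rng.random() for _ in accuracies), reverse=True)
    # Classifier indices ordered from most to least accurate.
    order = sorted(range(len(accuracies)),
                   key=lambda i: accuracies[i], reverse=True)
    weights = [0.0] * len(accuracies)
    for rank, idx in enumerate(order):
        weights[idx] = draws[rank]
    total = sum(weights)
    return [w / total for w in weights]


def ensemble_probability(probs, weights):
    """Weighted average of the base classifiers' churn probabilities
    for a single customer."""
    return sum(p * w for p, w in zip(probs, weights))


# Hypothetical validation accuracies for the ridge, lasso, and elastic net models.
accuracies = [0.81, 0.84, 0.86]
weights = assign_weights(accuracies)

# Hypothetical churn probabilities from the three models for one customer.
p = ensemble_probability([0.30, 0.55, 0.70], weights)
label = int(p >= 0.5)  # 1 = predicted churner, 0 = predicted non-churner
```

Because the weights are normalized, the combined probability always stays between the lowest and highest base-model probabilities, and the most accurate model has the largest influence on the final label.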

