Comparative Analysis of Machine Learning Techniques to Identify Churn for Telecom Data


  • M Malleswari
  • R.J Manira
  • Praveen Kumar
  • Murugan





Churn prediction, Machine learning, Scala, Apache Spark, Big Data.


Big data analytics has become central to large-scale data processing, and combining it with machine learning holds considerable promise for prediction. Churn prediction is one application area of big data analytics; its main benefit, especially in telecom, is helping to prevent customer attrition. Churn is a day-to-day occurrence involving millions, so a way to reduce attrition can yield substantial savings. This paper compares three machine learning techniques, the Decision Tree, Random Forest, and Gradient Boosted Tree algorithms, using Apache Spark. Apache Spark is a data processing engine for big data that provides in-memory processing, which increases processing speed. Scala is a programming language that combines object-oriented and functional programming, which makes it a powerful language. The analysis is carried out by extracting features from the data set and training the models; it is implemented on Apache Spark, with modelling done using the ML library in Scala. The measured accuracy was 86% for the Decision Tree model, 87% for the Random Forest model, and 85% for the Gradient Boosted Tree model.
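
The comparison described above could be set up roughly as follows. This is a minimal sketch, not the authors' code: it assumes Spark's DataFrame-based `org.apache.spark.ml` API, and the feature column names (`dayMinutes`, `intlMinutes`, `serviceCalls`), the synthetic data generator, the churn rule, and the hyperparameters are all hypothetical, since the abstract does not give the data set or settings.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.{DecisionTreeClassifier, GBTClassifier, RandomForestClassifier}
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.{DataFrame, SparkSession}

object ChurnComparison {
  // Hypothetical numeric feature columns; the paper's real features are not listed in the abstract.
  val featureCols: Array[String] = Array("dayMinutes", "intlMinutes", "serviceCalls")

  /** Builds a small synthetic data set (illustration only); label 1.0 means the customer churned. */
  def syntheticData(spark: SparkSession, n: Int = 300): DataFrame = {
    import spark.implicits._
    val rng = new scala.util.Random(7)
    val rows = Seq.fill(n) {
      val dayMinutes = 100 + rng.nextDouble() * 250
      val intlMinutes = rng.nextDouble() * 20
      val serviceCalls = rng.nextInt(8).toDouble
      // Made-up churn rule so that the classifiers have a pattern to learn.
      val label = if (dayMinutes > 260 || serviceCalls >= 5) 1.0 else 0.0
      (dayMinutes, intlMinutes, serviceCalls, label)
    }
    rows.toDF("dayMinutes", "intlMinutes", "serviceCalls", "label")
  }

  /** Trains each classifier on `train` and returns its accuracy on `test`. */
  def compare(train: DataFrame, test: DataFrame): Map[String, Double] = {
    // Feature extraction step: pack the numeric columns into one vector column.
    val assembler = new VectorAssembler()
      .setInputCols(featureCols)
      .setOutputCol("features")

    // Hyperparameters below are arbitrary illustrative choices.
    val classifiers: Seq[(String, PipelineStage)] = Seq(
      "DecisionTree" -> new DecisionTreeClassifier().setLabelCol("label").setFeaturesCol("features"),
      "RandomForest" -> new RandomForestClassifier().setLabelCol("label").setFeaturesCol("features").setNumTrees(20),
      "GradientBoostedTree" -> new GBTClassifier().setLabelCol("label").setFeaturesCol("features").setMaxIter(10)
    )

    val evaluator = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")

    classifiers.map { case (name, classifier) =>
      val fitted = new Pipeline().setStages(Array[PipelineStage](assembler, classifier)).fit(train)
      name -> evaluator.evaluate(fitted.transform(test))
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("churn-comparison").master("local[*]").getOrCreate()
    val Array(train, test) = syntheticData(spark).randomSplit(Array(0.7, 0.3), seed = 42L)
    compare(train, test).foreach { case (name, acc) => println(f"$name%-20s accuracy = $acc%.3f") }
    spark.stop()
  }
}
```

Accuracies obtained on this synthetic data say nothing about the figures reported in the paper; the sketch only illustrates how the three tree-based models can share one feature pipeline and one evaluator so that the comparison is like for like.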





How to Cite

Malleswari, M., Manira, R., Kumar, P., & Murugan. (2018). Comparative Analysis of Machine Learning Techniques to Identify Churn for Telecom Data. International Journal of Engineering & Technology, 7(3.34), 291–295.
Received 2018-09-07
Accepted 2018-09-07
Published 2018-09-01