Comparative Analysis of Machine Learning Techniques to Identify Churn for Telecom Data
Keywords: Churn prediction, machine learning, Scala, Apache Spark, big data.
Big data analytics has become a central approach to large-scale data processing, and machine learning combined with big data holds great promise for prediction. Churn prediction is one sub-domain of big data analytics. Its main advantage, especially in telecom, is that it helps prevent customer attrition. Churn is a day-to-day affair involving millions, so a solution that prevents customer attrition can save a great deal. This paper compares three machine learning techniques, the Decision Tree, Random Forest, and Gradient Boosted Tree algorithms, using Apache Spark. Apache Spark is a big data processing engine that provides in-memory processing, which raises processing speed. The analysis extracts features from the data set and trains each model on them. Scala is a programming language that combines object-oriented and functional programming, which makes it a powerful language. The analysis is implemented on Apache Spark, and modelling is done using Scala ML. The accuracy of the Decision Tree model was 86%, that of the Random Forest model 87%, and that of the Gradient Boosted Tree model 85%.
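The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it assumes Spark's DataFrame-based ML API (org.apache.spark.ml) on the classpath, and the column names (dayMinutes, serviceCalls, intlPlan), the synthetic labelling rule, and the hyperparameters are all hypothetical stand-ins for a real telecom data set. The accuracies it prints come from toy data and are not expected to match the 86%, 87%, and 85% reported above.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.classification.{DecisionTreeClassifier, GBTClassifier, RandomForestClassifier}
import org.apache.spark.ml.evaluation.MulticlassClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

object ChurnComparison {

  // Trains the three tree-based classifiers on the same split and
  // returns the test-set accuracy of each, keyed by model name.
  def compare(spark: SparkSession): Map[String, Double] = {
    import spark.implicits._

    // Hypothetical synthetic data: the columns and the churn rule are
    // assumptions made only so the sketch runs without an external file.
    val rng = new scala.util.Random(42)
    val rows = Seq.fill(300) {
      val dayMinutes = 100 + rng.nextDouble() * 250
      val serviceCalls = rng.nextInt(8).toDouble
      val intlPlan = if (rng.nextBoolean()) 1.0 else 0.0
      val churned =
        if (serviceCalls >= 4 || (dayMinutes > 280 && intlPlan == 1.0)) 1.0 else 0.0
      (dayMinutes, serviceCalls, intlPlan, churned)
    }
    val data = rows.toDF("dayMinutes", "serviceCalls", "intlPlan", "label")

    // Feature extraction step: pack the numeric columns into one vector column.
    val assembler = new VectorAssembler()
      .setInputCols(Array("dayMinutes", "serviceCalls", "intlPlan"))
      .setOutputCol("features")

    // The three techniques named in the abstract (illustrative settings).
    val classifiers: Seq[(String, PipelineStage)] = Seq(
      "DecisionTree" -> new DecisionTreeClassifier()
        .setLabelCol("label").setFeaturesCol("features"),
      "RandomForest" -> new RandomForestClassifier()
        .setLabelCol("label").setFeaturesCol("features")
        .setNumTrees(20).setSeed(42L),
      "GradientBoostedTree" -> new GBTClassifier()
        .setLabelCol("label").setFeaturesCol("features")
        .setMaxIter(10).setSeed(42L)
    )

    val Array(train, test) = data.randomSplit(Array(0.7, 0.3), seed = 42L)

    val evaluator = new MulticlassClassificationEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("accuracy")

    classifiers.map { case (name, classifier) =>
      val pipeline = new Pipeline()
        .setStages(Array[PipelineStage](assembler, classifier))
      val model = pipeline.fit(train)
      name -> evaluator.evaluate(model.transform(test))
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("ChurnComparison")
      .master("local[*]")
      .getOrCreate()
    try {
      compare(spark).foreach { case (name, accuracy) =>
        println(f"$name%-20s accuracy = $accuracy%.3f")
      }
    } finally {
      spark.stop()
    }
  }
}
```

All three models share one feature vector and one train/test split, so differences in accuracy reflect the algorithms and not the data each one saw. For a real data set, the synthetic rows would be replaced by a loaded DataFrame, and categorical columns would likely need indexing before assembly.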