A Hybrid Data Mining Method for Customer Churn Prediction


  • E. Jamalian Faculty of Technology and Information, Department of Information Technology, University of Qom, Qom, Iran
  • R. Foukerdi Faculty of Management, Department of Industrial Management, University of Qom, Qom, Iran


Attracting new customers costs much more than retaining existing ones, owing to increasing competition and business saturation. Customer retention is therefore one of the leading factors in companies’ marketing. Customer retention requires churn management, and effective churn management requires an accurate model for churn prediction. A variety of techniques have been used for churn prediction, such as logistic regression, neural networks, genetic algorithms, and decision trees. In this article, a hybrid method is presented that predicts customer churn more accurately, using data fusion and feature extraction techniques. After data preparation and feature selection, two algorithms, LOLIMOT and C5.0, were trained with feature sets of different sizes and applied to test data. The outputs of the individual classifiers were then combined by weighted voting. Applying this method to real data from a telecommunication company demonstrated its effectiveness.
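The weighted-voting step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `weighted_vote`, the use of churn probabilities as classifier outputs, the weights (here imagined as validation accuracies), and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import List, Sequence


def weighted_vote(
    probs: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float = 0.5,
) -> List[int]:
    """Combine churn probabilities from several classifiers.

    probs[k][i] is the churn probability that classifier k assigns to
    customer i. weights[k] is the (hypothetical) weight of classifier k,
    for example its accuracy on a validation set.
    Returns 1 (churn) or 0 (no churn) for each customer.
    """
    if len(probs) != len(weights):
        raise ValueError("one weight is required per classifier")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_customers = len(probs[0])
    labels = []
    for i in range(n_customers):
        # Weighted average of the classifiers' scores for customer i.
        score = sum(w * p[i] for w, p in zip(weights, probs)) / total
        labels.append(1 if score >= threshold else 0)
    return labels


# Illustrative values only: two classifiers, three customers.
classifier_a = [0.9, 0.2, 0.6]  # stands in for one model's outputs
classifier_b = [0.7, 0.4, 0.3]  # stands in for the other model's outputs
print(weighted_vote([classifier_a, classifier_b], weights=[0.8, 0.6]))
```

For the third customer the two classifiers disagree, and the label depends on how the weights balance their scores.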


customer churn, data mining, hybrid method, LOLIMOT, C5.0, weighted voting






How to Cite

E. Jamalian and R. Foukerdi, “A Hybrid Data Mining Method for Customer Churn Prediction”, Eng. Technol. Appl. Sci. Res., vol. 8, no. 3, pp. 2991–2997, Jun. 2018.

