Customer Churn Prediction through Attribute Selection Analysis and Support Vector Machine

Jia Yi Vivian Quek https://orcid.org/0009-0005-2688-1007
Ying Han Pang https://orcid.org/0000-0002-3781-6623
Zheng You Lim
Shih Yin Ooi https://orcid.org/0000-0002-3024-1011
Wee How Khoh

Keywords

Churn Prediction, Attribute Selection, Machine Learning, Filter Methods, Support Vector Machine

Abstract

Accurate customer churn prediction alerts businesses to customers who are likely to churn, so that proactive retention measures can be taken. Predicting churn is difficult, especially as customer databases grow in size. Attribute selection therefore plays a vital role in machine learning: it reduces a complex attribute set to the essential variables. This paper proposes a customer churn prediction model based on attribute selection analysis and a Support Vector Machine. By identifying the most significant attributes of customer data, the proposed model improves churn prediction performance while reducing feature dimensionality. First, exploratory data analysis and data preprocessing are performed to understand the data and improve its quality. Next, two filter-based attribute selection techniques, Chi-squared and Analysis of Variance (ANOVA), are applied to the preprocessed data to select relevant features. The selected features are then fed into a Support Vector Machine for classification. A real-world telecom database is used for model assessment. The empirical results demonstrate that ANOVA outperforms the Chi-squared filter in attribute selection. Moreover, with only about 50% of the features, ANOVA-based feature selection outperforms the model trained on the full feature set.
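
The pipeline outlined in the abstract (preprocessing, filter-based attribute selection, SVM classification) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' implementation: the synthetic data, the min-max scaling step, the RBF kernel, the 70/30 split, and the helper name build_model are all assumptions made for the example, and the scores it prints say nothing about the results reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, chi2, f_classif
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for a churn table (assumption: NOT the paper's telecom data).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def build_model(score_func, k):
    """Hypothetical helper: scale to [0, 1], keep the k best features, fit an SVM."""
    return make_pipeline(
        MinMaxScaler(),                # chi2 requires non-negative inputs
        SelectKBest(score_func, k=k),  # filter-based attribute selection
        SVC(kernel="rbf"),             # kernel choice is an assumption
    )


k = X.shape[1] // 2  # keep roughly half of the features

results = {}
for name, score_func in [("chi2", chi2), ("anova", f_classif)]:
    model = build_model(score_func, k).fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)  # test-set accuracy

# Baseline: the same classifier trained on the full feature set.
baseline = make_pipeline(MinMaxScaler(), SVC(kernel="rbf")).fit(X_train, y_train)
results["all_features"] = baseline.score(X_test, y_test)

print(results)
```

Here f_classif computes the ANOVA F-statistic per feature and chi2 the Chi-squared statistic; both are filter methods because features are scored independently of the classifier. Placing the selector inside the pipeline means it is fitted on the training split only, which avoids leaking test information into the feature ranking.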

