Prediction and Clustering of Bank Customer Churn Based on XGBoost and K-means
DOI:
https://doi.org/10.54691/bcpbm.v23i.1373
Keywords:
Bank Customer Churn, XGBoost, K-means
Abstract
Due to fierce competition among commercial banks, customers are becoming increasingly important to banks, and customer churn has become a major problem that banks must face. In this paper, the XGBoost algorithm is applied to a Kaggle data set of customers of a US bank to predict customer churn, with grid search used to find the best hyperparameters. The K-means algorithm is then adopted to further segment the churned customers. For churn prediction, XGBoost achieves an accuracy of 0.84, a precision of 0.83, a recall of 0.84 and an F1 score of 0.84 on the test set. The features with the highest importance scores under this algorithm are customers' estimated salary, credit score and balance. For the segmentation of churned customers, K-means divides them into 5 groups. These five groups have different values to banks, so this paper puts forward recovery suggestions corresponding to the characteristics of each group.
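As an illustration of the four evaluation metrics reported above, the following minimal sketch shows how accuracy, precision, recall and F1 score can be computed from binary churn labels. It is not the authors' code: the helper names (`Confusion`, `confusion_from_labels`, `classification_metrics`) and the toy labels are hypothetical, and the abstract does not state whether the reported scores are for the churn class alone or averaged over classes, so the sketch assumes the plain binary case with churn as the positive class.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Binary confusion counts, with churn (label 1) as the positive class."""
    tp: int  # churners predicted as churners
    fp: int  # non-churners predicted as churners
    fn: int  # churners predicted as non-churners
    tn: int  # non-churners predicted as non-churners


def confusion_from_labels(y_true, y_pred, positive=1):
    """Count the four confusion cells from true and predicted labels."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return Confusion(tp, fp, fn, tn)


def classification_metrics(c):
    """Return accuracy, precision, recall and F1; empty denominators give 0.0."""
    total = c.tp + c.fp + c.fn + c.tn
    accuracy = (c.tp + c.tn) / total if total else 0.0
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (assumption, not data from the paper).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(confusion_from_labels(y_true, y_pred))
```

With these toy labels, `metrics` holds the four scores for the churn class. On a real test set the labels would come from the held-out split and the fitted classifier's predictions.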






