Comparative analysis of machine learning models for bond default forecasting based on financial data of Chinese listed companies

Peng Liu; Yuanhang Li

doi:10.54691/bcpbm.v34i.3153

Keywords:

Bond Default Forecasting, Machine Learning, Chinese Listed Companies Financial Data.

Abstract

At a time when bond defaults have become frequent and market confidence is being undermined, the question of how to predict bond defaults accurately merits investigation. This paper applies popular machine learning methods to bond default prediction. The sample comprises financial data of 23 listed companies in China that defaulted on credit bonds from 2013 to 2022 and 230 sets of financial data of listed companies that did not default on credit bonds during the same period. Logistic regression, random forest, support vector machine (SVM), and K-Nearest Neighbors (KNN) models are used to estimate the probability of bond default by listed companies, and their predictive performance is compared. The financial data are found to predict bond defaults of listed companies well, and the SVM model performs best.
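A comparison of this kind can be sketched as follows. This is only an illustrative sketch, not the authors' pipeline: it assumes scikit-learn, uses a synthetic data set of 253 samples with a roughly 10:1 class ratio as a stand-in for the financial indicators, and the feature count, hyperparameters, 5-fold cross-validation, and ROC AUC metric are all assumptions rather than details taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the financial indicators (NOT the paper's data):
# 253 samples, roughly 230 non-default (class 0) and 23 default (class 1).
X, y = make_classification(
    n_samples=253,
    n_features=10,      # assumed number of financial ratios
    n_informative=5,
    weights=[230 / 253, 23 / 253],
    flip_y=0.0,
    random_state=0,
)

# The four model families named in the abstract, with assumed settings.
# Scale-sensitive models are wrapped in a pipeline with standardization.
models = {
    "logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Stratified folds keep some defaulters in every test split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Mean ROC AUC per model; AUC is less misleading than accuracy
# when one class is about ten times larger than the other.
auc_scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="roc_auc").mean()
    for name, model in models.items()
}

for name, auc in sorted(auc_scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Because the data here are synthetic, the resulting ranking says nothing about which model performs best on real bond default data; the sketch only shows how such a comparison can be set up on an imbalanced sample.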
