A Two-Stage ARIMA Model via Machine Learning and its Application in Stock Price Prediction

Fenglin Zhang; Long Chen; Jiwen Yu

doi:10.54691/bcpbm.v26i.1989

Keywords:

ARIMA; Machine Learning; Combined prediction; Stock Price Prediction.

Abstract

Stock price prediction is a long-standing problem in finance and quantitative investment. Because stock price time series typically contain both linear and nonlinear components, traditional ARIMA models, which are linear, are limited in modeling such data. This paper therefore uses intraday transaction data from the stock market as auxiliary information and proposes an improved ARIMA stock price prediction model based on machine learning methods. The model works in two stages: an ARIMA model first predicts the linear component of the series, and machine learning algorithms (random forest (RF), XGBoost, LSTM) then predict the nonlinear information left in the ARIMA residuals. The empirical results show that, compared with the traditional ARIMA model, the proposed model improves prediction accuracy and is robust in stock price prediction. Because the framework is flexible, its second stage can use whichever machine learning method gives the best prediction accuracy in a given application scenario. In addition, model averaging can be applied within the two-stage framework to improve accuracy further, and mixed or high-frequency data can be mined further.
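
The two-stage idea can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the series `y` and the auxiliary feature `aux` are synthetic, stage 1 is an ARIMA(1,1,0) model without a constant fitted by least squares, and stage 2 is a depth-1 regression tree (`fit_stump`) standing in for RF, XGBoost, or LSTM. The feature `aux[t]` is assumed to be observable before `y[t]` is forecast.

```python
import random

def fit_arima_110(y):
    """Stage 1: ARIMA(1,1,0) without a constant, fitted by least squares.

    phi minimises sum((d[t] - phi * d[t-1])**2) over the first differences d.
    """
    d = [b - a for a, b in zip(y, y[1:])]
    num = sum(prev * cur for prev, cur in zip(d, d[1:]))
    den = sum(prev * prev for prev in d[:-1])
    return num / den if den else 0.0

def arima_forecast(y, phi, t):
    """One-step-ahead forecast of y[t] from y[t-1] and y[t-2]."""
    return y[t - 1] + phi * (y[t - 1] - y[t - 2])

def fit_stump(x, r):
    """Stage 2: depth-1 regression tree mapping one feature x to residual r."""
    mean_all = sum(r) / len(r)
    best = (float("inf"), None, mean_all, mean_all)
    for thr in sorted(set(x)):
        left = [ri for xi, ri in zip(x, r) if xi <= thr]
        right = [ri for xi, ri in zip(x, r) if xi > thr]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
        if sse < best[0]:
            best = (sse, thr, ml, mr)
    return best[1:]  # (threshold, left mean, right mean)

def stump_predict(stump, xi):
    thr, ml, mr = stump
    return ml if thr is None or xi <= thr else mr

# Synthetic data (assumption): a linear AR(1) part in the differences plus a
# nonlinear, sign-dependent effect of the auxiliary feature.
random.seed(0)
n, n_train = 400, 300
aux = [random.uniform(-1.0, 1.0) for _ in range(n)]
y, d_prev = [100.0], 0.0
for t in range(1, n):
    step = 0.5 * d_prev + (0.5 if aux[t] > 0 else -0.5) + random.gauss(0.0, 0.05)
    y.append(y[-1] + step)
    d_prev = step

# Stage 1: fit the linear model on the training window only.
phi = fit_arima_110(y[:n_train])

# Stage 2: learn the ARIMA residuals from the auxiliary feature.
train_idx = range(2, n_train)
resid = [y[t] - arima_forecast(y, phi, t) for t in train_idx]
stump = fit_stump([aux[t] for t in train_idx], resid)

# Out-of-sample one-step forecasts: ARIMA alone vs. ARIMA + residual model.
test_idx = range(n_train, n)
arima_err = [y[t] - arima_forecast(y, phi, t) for t in test_idx]
hybrid_err = [e - stump_predict(stump, aux[t]) for e, t in zip(arima_err, test_idx)]
arima_mse = sum(e * e for e in arima_err) / len(arima_err)
hybrid_mse = sum(e * e for e in hybrid_err) / len(hybrid_err)
print(f"phi={phi:.3f}  ARIMA MSE={arima_mse:.4f}  two-stage MSE={hybrid_mse:.4f}")
```

The final forecast is the ARIMA forecast plus the predicted residual, so replacing `fit_stump` with a stronger learner changes only stage 2. The gain shown here comes from how the synthetic data were constructed and says nothing about real stock prices.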

