Research on the House Price Forecast Based on machine learning algorithm
DOI:
https://doi.org/10.54691/bcpbm.v32i.2881
Keywords:
House Price; Forecast; Machine Learning; Regression; Hyperparameter Optimization.
Abstract
House prices fluctuate every year because of factors such as location, area, and facilities. Housing price prediction is a significant topic in real estate, and it helps buyers make strategic decisions about house purchases. Although there are many studies on house price forecasting, current research does not comprehensively compare and analyze the popular house price prediction approaches. Constructing a model begins with pre-processing the data to fill null values or remove outliers; categorical attributes can be converted into the required attributes by one-hot encoding. This paper used five algorithms, namely decision tree, random forest regression, Adaptive Boosting (AdaBoost), Gradient Boosting Decision Tree (GBDT), and extreme gradient boosting (XGBoost), to predict house prices and compared them according to the root mean squared error. GBDT and XGBoost produced more accurate predictions than the other algorithms. The paper also identified which features most affect the price of a house. In real-world applications, machine learning based housing price prediction models are used by banks and financial institutions to improve house price assessment, risk analysis, and lending decisions.
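The pre-processing and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy data, the median fill strategy, the helper names (`fill_missing`, `one_hot`, `rmse`) and the two placeholder prediction lists are all assumptions made for illustration, and no actual tree or boosting model is trained here.

```python
import math
from statistics import median


def fill_missing(values):
    """Replace None entries with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def one_hot(categories):
    """Turn category labels into 0/1 indicator rows; columns follow sorted label order."""
    labels = sorted(set(categories))
    rows = [[1 if c == label else 0 for label in labels] for c in categories]
    return rows, labels


def rmse(y_true, y_pred):
    """Root mean squared error between actual and predicted prices."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy data: one numeric feature with a null value, one categorical feature.
areas = [120.0, None, 80.0, 100.0]
locations = ["north", "south", "north", "east"]
prices = [300.0, 250.0, 200.0, 260.0]

areas_filled = fill_missing(areas)
encoded, labels = one_hot(locations)

# Placeholder predictions standing in for the outputs of two fitted regressors.
predictions = {
    "model_a": [310.0, 240.0, 190.0, 250.0],
    "model_b": [330.0, 220.0, 230.0, 290.0],
}

# Compare the candidate models by RMSE; lower is better.
scores = {name: rmse(prices, preds) for name, preds in predictions.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

In practice the placeholder predictions would come from fitted regressors such as those listed in the abstract, and the RMSE would be computed on held-out data rather than on the training prices.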