Enhancing Housing Price Prediction Accuracy through Hybrid POI-XGBoost Models: A Case Study of Nanjing

Authors

  • Lei Sun

DOI:

https://doi.org/10.54691/v6hkjm84

Keywords:

Real Estate Price Prediction; XGBoost; POI (Point of Interest); Web Search Data; Machine Learning; Nanjing.

Abstract

[Purpose] This study aims to improve the precision of second-hand housing price prediction by overcoming the limitations of traditional statistical data, such as poor timeliness and questionable authenticity. [Method] Using Python web crawlers, the paper collects housing data from Lianjia and derives neighborhood characteristics from Baidu POI (Point of Interest) data. Three models are constructed and compared: Vector Autoregression (VAR), Random Forest, and the proposed POI-XGBoost model. [Findings] Empirical results on the Nanjing data show that the XGBoost model enhanced with POI features achieves the best predictive performance (a reported score of 0.952), significantly outperforming the traditional linear and Random Forest models. [Originality] This research provides a novel methodological framework for integrating multi-source geospatial data with gradient boosting algorithms to capture the non-linear impact of neighborhood amenities on housing values.
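The abstract does not spell out how POI data become model inputs. One common approach, shown here only as a minimal sketch, is to count the POIs of each category within a fixed radius of every listing and append those counts to the listing's own attributes. The field names (`lat`, `lon`, `area_m2`, `category`), the 1 km radius, the three categories, and the sample coordinates are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def poi_counts(listing, pois, radius_m=1000.0):
    """Count POIs per category that lie within radius_m of one listing."""
    counts = Counter()
    for poi in pois:
        d = haversine_m(listing["lat"], listing["lon"], poi["lat"], poi["lon"])
        if d <= radius_m:
            counts[poi["category"]] += 1
    return dict(counts)


def build_feature_rows(listings, pois, categories, radius_m=1000.0):
    """Return one numeric row per listing: [area_m2, count(cat_1), count(cat_2), ...]."""
    rows = []
    for listing in listings:
        counts = poi_counts(listing, pois, radius_m)
        rows.append([listing["area_m2"]] + [counts.get(c, 0) for c in categories])
    return rows


# Hypothetical sample data (illustrative coordinates, not from the study).
CATEGORIES = ["subway", "school", "hospital"]
LISTINGS = [{"lat": 32.060, "lon": 118.800, "area_m2": 89.0}]
POIS = [
    {"lat": 32.061, "lon": 118.801, "category": "subway"},    # roughly 150 m away
    {"lat": 32.065, "lon": 118.800, "category": "school"},    # roughly 560 m away
    {"lat": 32.100, "lon": 118.800, "category": "hospital"},  # several km away
]
FEATURES = build_feature_rows(LISTINGS, POIS, CATEGORIES)
```

Rows built this way, paired with observed prices, could then be passed to a gradient boosting regressor such as `xgboost.XGBRegressor` via `fit(X, y)`. The radius and the category list would normally be tuned to the city and the data at hand.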

Published

2026-03-31

Section

Articles

How to Cite

Sun, Lei. 2026. “Enhancing Housing Price Prediction Accuracy through Hybrid POI-XGBoost Models: A Case Study of Nanjing”. Scientific Journal of Economics and Management Research 8 (3): 117-20. https://doi.org/10.54691/v6hkjm84.