Comparing traditional and machine learning techniques in apartments mass appraisal in Fortaleza, Brazil
Published 2025-02-14
Keywords
- semi-automatic assessment methods
- mass appraisal techniques
- machine learning
Copyright (c) 2024 Augusto, Fabián, MARCO AURELIO STUMPF GONZALEZ, Éverton

This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
Mass appraisal has significant applications in urban planning, real estate appraisal, and property taxation. Because such large-scale models are difficult to analyze, they are often developed using semi-automatic assessment methods and machine learning techniques. This article compares appraisal models built with traditional statistics and with machine learning. It also examines whether incorporating spatial information allows each method to capture the spatial dependence typical of the real estate market, and thereby reduce the spatial autocorrelation observed in the residuals. The study compared nine machine learning methods with traditional statistical approaches using a dataset of over 43,000 apartments in Fortaleza, Brazil. The machine learning algorithms produced similar results. XGBoost yielded the lowest residual spatial autocorrelation. MRA, M5P, and MARS were the easiest to interpret, although these techniques also had the greatest residual spatial autocorrelation. The choice of method therefore involves a trade-off that depends on whether the aim is to improve accuracy or to provide a clear explanation for property taxation.
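To make the notion of residual spatial autocorrelation concrete, the following is a minimal sketch of one common way to measure it: global Moran's I computed over model residuals with row-standardized k-nearest-neighbour weights. The abstract does not state which statistic or weights scheme the study used, so the choice of Moran's I, the value of `k`, the function names `knn_weights` and `morans_i`, and the synthetic coordinates and residuals are all assumptions made for illustration, not the authors' procedure or data.

```python
import numpy as np


def knn_weights(coords, k=4):
    """Row-standardized k-nearest-neighbour spatial weights (dense matrix).

    A dense n x n matrix is fine for a small illustration; a dataset of
    tens of thousands of properties would need a sparse representation.
    """
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    # Pairwise Euclidean distances between all locations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a location is not its own neighbour
    nearest = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros((n, n))
    rows = np.repeat(np.arange(n), k)
    w[rows, nearest.ravel()] = 1.0 / k  # each row sums to 1
    return w


def morans_i(values, w):
    """Global Moran's I of `values` under the spatial weights matrix `w`."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    n = len(z)
    s0 = w.sum()
    return (n / s0) * (z @ w @ z) / (z @ z)


# Synthetic illustration (hypothetical data, not the Fortaleza dataset).
rng = np.random.default_rng(0)
coords = rng.uniform(0, 10, size=(200, 2))

# Residuals that still carry a smooth spatial pattern, as when a model
# fails to capture location effects.
clustered_residuals = np.sin(coords[:, 0]) + 0.1 * rng.normal(size=200)

# Residuals with no spatial structure, as when location is well modelled.
random_residuals = rng.normal(size=200)

w = knn_weights(coords, k=4)
i_clustered = morans_i(clustered_residuals, w)
i_random = morans_i(random_residuals, w)

print(f"Moran's I, spatially patterned residuals: {i_clustered:.3f}")
print(f"Moran's I, unstructured residuals:        {i_random:.3f}")
```

A value near zero suggests the residuals are spatially unstructured, while a clearly positive value indicates that nearby properties share similar prediction errors. Under this reading, a method that "minimizes spatial autocorrelation" is one whose residuals give a Moran's I closer to zero.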