Vol. 85 (2024)
Original Articles - Urban, Land, Environmental Appraisal and Economics

Comparing traditional and machine learning techniques in apartments mass appraisal in Fortaleza, Brazil

Antônio Augusto Ferreira de Oliveira
Municipal treasury auditor of Fortaleza, Brazil
Fabián Reyes-Bueno
Facultad de Ciencias Exactas y Naturales, Universidad Técnica Particular de Loja, Loja, Ecuador
Marco Aurelio Stumpf Gonzalez
Polytechnic School, Universidade do Vale do Rio dos Sinos, São Leopoldo, Brazil
Éverton da Silva
Geosciences Department, Universidade Federal de Santa Catarina, Florianópolis, Brazil

Published 2025-02-14

Keywords

  • semi-automatic assessment methods
  • mass appraisal techniques
  • machine learning

Abstract

Mass appraisal has significant applications in urban planning, real estate appraisal, and property taxation. Because models built on very large datasets are difficult to analyze, they are often developed using semi-automatic assessment methods and machine learning techniques. This article compares appraisal modeling methods based on statistics and on machine learning. It also examines whether incorporating spatial information allows a given method to capture the spatial dependency typical of the real estate market, and thereby to reduce the spatial autocorrelation observed in the residuals. The study compared nine machine learning methods with traditional statistical approaches using a dataset of over 43,000 apartments in Fortaleza, Brazil. The machine learning algorithms produced similar results. XGBoost yielded the lowest residual spatial autocorrelation. MRA, M5P, and MARS were the easiest to interpret, although they also had the greatest residual spatial autocorrelation. The choice of method therefore involves a trade-off between improving accuracy and providing a clear explanation for property taxation.
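
As an illustration of what "spatial autocorrelation in the residuals" means in practice, the following minimal sketch computes a global Moran's I statistic for a set of model residuals. The abstract does not state which statistic or weights matrix the study used, so Moran's I, the row-standardised k-nearest-neighbour weights, the function name `morans_i`, and the parameter `k` are all assumptions made for this example, and the data are synthetic.

```python
import numpy as np


def morans_i(values, coords, k=4):
    """Global Moran's I using row-standardised k-nearest-neighbour weights.

    values: 1-D array of residuals (hypothetical input, one per property).
    coords: (n, 2) array of point coordinates in a projected system.
    k:      number of neighbours per observation (an assumed choice).
    """
    values = np.asarray(values, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = len(values)

    # Pairwise Euclidean distances between all observations.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # an observation is not its own neighbour

    # Each row gives weight 1/k to its k nearest neighbours, 0 elsewhere.
    nearest = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0 / k

    # Moran's I = (n / S0) * (z' W z) / (z' z), with z the centred values.
    z = values - values.mean()
    return (n / w.sum()) * (z @ w @ z) / (z @ z)


if __name__ == "__main__":
    # Synthetic residuals on a 10 x 10 grid (not data from the study).
    rows, cols = np.meshgrid(np.arange(10), np.arange(10), indexing="ij")
    grid = np.column_stack([rows.ravel(), cols.ravel()])
    smooth_residuals = rows.ravel().astype(float)  # residuals drift across space
    print("Moran's I, spatially structured residuals:", morans_i(smooth_residuals, grid))
```

Values near zero suggest residuals with little spatial pattern, while clearly positive values indicate that neighbouring properties tend to be over- or under-valued together, which is the situation a mass appraisal model with good spatial information aims to avoid. This dense-matrix version is only practical for small samples; a dataset of tens of thousands of records would need a sparse weights matrix.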
