Interpretable Machine Learning for the German residential rental market – shedding light into model mechanics
Published 2025-08-07
Keywords
- interpretable machine learning
- SHAP
- real estate
Copyright (c) 2024 Severin Bachmann

This work is licensed under a Creative Commons Attribution 4.0 International License.
Abstract
We compare the drivers of rental price predictions across machine learning models and give insights into the models' strengths and weaknesses. The study investigates linear regression (LR), decision tree (DT) and XGBoost (XGB) algorithms and employs SHAP values to measure feature importance. The research is unique in applying interpretable machine learning (IML) methods to a large dataset of over 2.4 million observations from the German rental market and in applying comparative statistics to aggregate SHAP values. The main results are the superiority of XGB and the finding that LR shows higher SHAP values overall, which explains its lower predictive efficacy. DT models capture intricate interactions among variables with fewer features, while XGB accommodates more variables, reflecting its higher complexity and thus its superior performance. The top ten features of the DT and XGB models overlap substantially, indicating robust concordance. Specific features are identified that distinguish the models, suggesting that a more complex model, such as XGB, handles dummy variables more adeptly.
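The aggregate SHAP comparison summarized above can be illustrated with a minimal sketch. It is not the study's pipeline: the toy data, the coefficients, the second model's importances and all function names are hypothetical. It assumes independent features, in which case the SHAP value of feature j for observation i in a linear model is coef_j * (x_ij - mean_j). Importance is aggregated as the mean absolute SHAP value per feature, and two models are compared by the overlap of their top-k features.

```python
from statistics import fmean


def linear_shap_values(coefs, rows):
    """SHAP values of a linear model, assuming independent features:
    phi_ij = coef_j * (x_ij - mean_j)."""
    means = [fmean(col) for col in zip(*rows)]
    return [
        [c * (x - m) for c, x, m in zip(coefs, row, means)]
        for row in rows
    ]


def mean_abs_shap(shap_rows):
    """Aggregate importance: mean absolute SHAP value per feature."""
    return [fmean(abs(v) for v in col) for col in zip(*shap_rows)]


def top_k_overlap(importance_a, importance_b, names, k):
    """Feature names that appear in the top k of both importance rankings."""
    def top(importance):
        ranked = sorted(zip(importance, names), reverse=True)
        return {name for _, name in ranked[:k]}
    return top(importance_a) & top(importance_b)


# Hypothetical toy data: living area (m^2), number of rooms, balcony dummy.
names = ["area", "rooms", "balcony"]
rows = [
    [50.0, 2.0, 0.0],
    [70.0, 3.0, 1.0],
    [90.0, 3.0, 1.0],
    [110.0, 4.0, 0.0],
]
coefs = [8.0, 20.0, 30.0]  # made-up linear rent effects per unit

shap_rows = linear_shap_values(coefs, rows)
lr_importance = mean_abs_shap(shap_rows)

# Made-up aggregate importances standing in for a second (e.g. tree) model.
other_importance = [120.0, 40.0, 5.0]
shared_top2 = top_k_overlap(lr_importance, other_importance, names, k=2)
print(lr_importance, shared_top2)
```

In this toy setting each feature's SHAP values sum to zero across observations, so the mean absolute value is what carries the importance signal. For tree-based models such as DT or XGB the per-observation SHAP values would come from a dedicated tree explainer rather than from the closed-form expression used here.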