Predictive Modelling and Factor Analysis of Public Transport Delays in Smart City Using Interpretable Machine Learning
Authors: Yurii Matseliukh, Vasyl Lytvyn, Zhengbing Hu, Myroslava Bublyk
Journal: International Journal of Information Technology and Computer Science
Article in issue: Vol. 17, No. 6, 2025.
Free access
Delay prediction in urban public transport systems is a critical task for improving operational efficiency and service reliability. While numerous predictive models exist, understanding the relative importance of contributing factors remains a challenge, with traditional approaches often overestimating the impact of stochastic weather conditions. This study proposes an approach that combines predictive modelling and factor analysis based on interpretable machine learning. An eXtreme Gradient Boosting model was developed using a large dataset of operational and meteorological data from a city with approximately one million inhabitants. The model demonstrated high predictive accuracy, explaining 72% of the variance in delays (Coefficient of Determination R²=0.72). Analysis of the model’s feature importance revealed that operational cycles (seasonal, weekly, daily) and spatial context (routes, stops) are the dominant predictors, collectively accounting for over 52% of the model’s total feature importance. Contrary to common assumptions, weather conditions were identified as a powerful secondary, rather than primary, factor. While their cumulative feature importance was substantial (contributing nearly 45%), the model revealed their impact to be highly contextual: the negative effects of adverse weather were significantly amplified during predictable peak operational hours but were minimal otherwise. This research demonstrates how Explainable Artificial Intelligence methods can transform complex predictive models into practical tools, providing a data-driven basis for shifting from reactive management to proactive, evidence-based planning.
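The two headline quantities in the abstract, the coefficient of determination and the share of total feature importance held by each factor group, can be illustrated with a minimal sketch. It is not the authors' code: the feature names, the group assignments, and all numeric values below are hypothetical, and the per-feature importances are assumed to have already been exported from a trained model.

```python
from collections import defaultdict


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def grouped_importance(importances, groups):
    """Sum per-feature importances by group, as shares of the total."""
    total = sum(importances.values())
    shares = defaultdict(float)
    for feature, value in importances.items():
        shares[groups[feature]] += value / total
    return dict(shares)


# Hypothetical per-feature importances (illustrative only, not from the paper).
importances = {
    "hour_of_day": 30.0,
    "day_of_week": 10.0,
    "route_id": 20.0,
    "stop_id": 15.0,
    "temperature": 15.0,
    "precipitation": 10.0,
}

# Hypothetical mapping of each feature to a factor group.
groups = {
    "hour_of_day": "operational_cycles",
    "day_of_week": "operational_cycles",
    "route_id": "spatial_context",
    "stop_id": "spatial_context",
    "temperature": "weather",
    "precipitation": "weather",
}

shares = grouped_importance(importances, groups)
for group, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{group}: {share:.1%}")
```

Normalising by the total keeps the group shares comparable regardless of the scale of the raw importance values, which is how statements such as "collectively accounting for over 52%" can be read.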
Keywords: Public Transport, Interpretable Machine Learning, XGBoost, SHAP, Smart City, Delay Factor Analysis
Short address: https://sciup.org/15020086
IDR: 15020086 | DOI: 10.5815/ijitcs.2025.06.01