Predictive Modelling and Factor Analysis of Public Transport Delays in Smart City Using Interpretable Machine Learning
Authors: Yurii Matseliukh, Vasyl Lytvyn, Zhengbing Hu, Myroslava Bublyk
Journal: International Journal of Information Technology and Computer Science (IJITCS)
Issue: Vol. 17, No. 6, 2025
Free access
Delay prediction in urban public transport systems is a critical task for improving operational efficiency and service reliability. While numerous predictive models exist, understanding the relative importance of contributing factors remains a challenge, with traditional approaches often overestimating the impact of stochastic weather conditions. This study proposes an approach that combines predictive modelling and factor analysis based on interpretable machine learning. An eXtreme Gradient Boosting model was developed using a large dataset of operational and meteorological data from a city with approximately one million inhabitants. The model demonstrated high predictive accuracy, explaining 72% of the variance in delays (Coefficient of Determination R²=0.72). Analysis of the model’s feature importance revealed that operational cycles (seasonal, weekly, daily) and spatial context (routes, stops) are the dominant predictors, collectively accounting for over 52% of the model’s total feature importance. Contrary to common assumptions, weather conditions were identified as a powerful secondary, rather than primary, factor. While their cumulative feature importance was substantial (contributing nearly 45%), the model revealed their impact to be highly contextual: the negative effects of adverse weather were significantly amplified during predictable peak operational hours but were minimal otherwise. This research demonstrates how Explainable Artificial Intelligence methods can transform complex predictive models into practical tools, providing a data-driven basis for shifting from reactive management to proactive, evidence-based planning.
Keywords: Public Transport, Interpretable Machine Learning, XGBoost, SHAP, Smart City, Delay Factor Analysis
Short URL: https://sciup.org/15020086
IDR: 15020086 | DOI: 10.5815/ijitcs.2025.06.01
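
The abstract reports feature importance per factor group (operational cycles, spatial context, weather) rather than per feature. The sketch below shows one way such group shares could be computed from per-feature scores. It is an illustration under stated assumptions, not the authors' procedure: the feature names, the group mapping, and the raw scores are all hypothetical, and the abstract does not specify which importance measure the study used.

```python
from collections import defaultdict

# Hypothetical mapping from feature name to factor group (not taken from the paper).
FEATURE_GROUPS = {
    "month": "operational_cycles",
    "day_of_week": "operational_cycles",
    "hour": "operational_cycles",
    "route_id": "spatial_context",
    "stop_id": "spatial_context",
    "temperature": "weather",
    "precipitation": "weather",
    "wind_speed": "weather",
}


def group_importance_shares(importances, groups):
    """Sum per-feature importance scores by group and normalise to shares of the total.

    Features missing from `groups` are collected under the group "other".
    """
    total = sum(importances.values())
    if total <= 0:
        raise ValueError("importances must sum to a positive value")
    shares = defaultdict(float)
    for feature, score in importances.items():
        shares[groups.get(feature, "other")] += score / total
    return dict(shares)


# Made-up raw scores for illustration only; they are not results from the study.
raw_scores = {
    "month": 3.0,
    "day_of_week": 2.0,
    "hour": 5.0,
    "route_id": 6.0,
    "stop_id": 4.0,
    "temperature": 2.0,
    "precipitation": 6.0,
    "wind_speed": 2.0,
}

if __name__ == "__main__":
    for group, share in sorted(group_importance_shares(raw_scores, FEATURE_GROUPS).items()):
        print(f"{group}: {share:.1%}")
```

The `importances` argument can be any dictionary of non-negative per-feature scores, for example the one returned by XGBoost's `Booster.get_score(importance_type="gain")` or mean absolute SHAP values per feature. Because the shares are normalised, they always sum to 1, which makes group totals directly comparable.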