Regional Statistics Conference 2026

Regression-Based Control Charts for Detecting Railway Delay Propagation

Conference

Regional Statistics Conference 2026

Format: CPS Abstract - Malta 2026

Keywords: anomaly, dynamic-linear-regression, robust inference, statistical quality control

Session: CPS 16 Industry

Friday 5 June 11 a.m. - noon (Europe/Malta)

Abstract

Efficient transportation is crucial for social and economic development, and railways provide a sustainable means of transporting both passengers and goods. One of the main operational challenges in railway systems is maintaining timetable punctuality while meeting planned demand. High punctuality is essential for making rail transport competitive and attractive. In practice, however, railway delays rarely remain isolated: a small disruption at one station can propagate to subsequent stations along the line.

In Sweden, the Transport Administration (Trafikverket) aims for 95% of all scheduled trains to have a delay of less than five minutes. Despite this goal, several factors have negatively affected performance in recent years. In particular, the introduction of a new traffic planning system with the 2023 timetable, together with rolling stock shortages and a lack of drivers, caused additional disruptions. As a result, passenger train punctuality in 2023 dropped to just below 88%, while freight train punctuality reached only slightly above 71%.

Train delays in Sweden have often been assumed to be uniformly distributed across the railway network. In reality, delays exhibit skewed distributions, heavy tails, and heterogeneous behavior. In addition, a previous study of a Swedish railway showed that the statistical properties of train arrivals change across stations and travel directions, which further suggests that delays are not uniformly distributed.

In this work, we study how train delays propagate along railway lines and how delays at earlier stations can be used to monitor and anticipate disruptions at downstream stations. Our approach is based on regression models where delays at previous stations are used as explanatory variables. Furthermore, control charts are constructed using the residuals of these models to monitor deviations from expected behavior.
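The residual construction described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes an ordinary least-squares fit with an intercept, and the function name `fit_delay_regression`, the three-station layout, and the synthetic delays are all hypothetical.

```python
import numpy as np

def fit_delay_regression(upstream, downstream):
    """Regress downstream delays on upstream delays (OLS with intercept).

    upstream:   array of shape (n,) or (n, p), delays at earlier stations
    downstream: array of shape (n,), delays at the monitored station
    Returns the coefficients (intercept first) and the residuals.
    """
    upstream = np.asarray(upstream, dtype=float)
    downstream = np.asarray(downstream, dtype=float)
    if upstream.ndim == 1:
        upstream = upstream[:, None]
    design = np.column_stack([np.ones(len(downstream)), upstream])
    beta, *_ = np.linalg.lstsq(design, downstream, rcond=None)
    residuals = downstream - design @ beta
    return beta, residuals

# Synthetic delays in minutes (illustrative only, not real railway data).
rng = np.random.default_rng(0)
n = 200
d1 = rng.exponential(2.0, n)                        # delay at station 1
d2 = 0.7 * d1 + rng.normal(0.0, 0.5, n)             # delay at station 2
d3 = 0.2 * d1 + 0.6 * d2 + rng.normal(0.0, 0.5, n)  # delay at station 3

beta, resid = fit_delay_regression(np.column_stack([d1, d2]), d3)
```

The residuals `resid` are the quantities a control chart would then monitor: they measure how far each observed downstream delay lies from the delay expected given the upstream stations.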

In this study, two different types of control charts are considered. Shewhart individual charts are used to detect large and sudden deviations in train delays, and exponentially weighted moving average (EWMA) charts are employed to identify possible small but persistent changes. These two charts complement each other by targeting different types of disruptions.
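To make the complementary roles of the two charts concrete, the sketch below implements both on a generic residual series. It rests on assumptions not stated in the abstract: three-sigma Shewhart limits, a moving-range sigma estimate (divided by the constant 1.128) when no sigma is supplied, an EWMA smoothing weight of 0.2 with time-varying limits of width 3, and hypothetical function names and test series.

```python
import numpy as np

def shewhart_individuals(x, center=None, sigma=None, k=3.0):
    """Shewhart individuals chart: flag points outside center +/- k*sigma."""
    x = np.asarray(x, dtype=float)
    if center is None:
        center = x.mean()
    if sigma is None:
        # Average moving range of span 2, scaled by d2 = 1.128.
        sigma = np.abs(np.diff(x)).mean() / 1.128
    lcl, ucl = center - k * sigma, center + k * sigma
    return lcl, ucl, (x < lcl) | (x > ucl)

def ewma_chart(x, center, sigma, lam=0.2, width=3.0):
    """EWMA chart with exact time-varying control limits."""
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    prev = center  # the EWMA statistic starts at the in-control mean
    for i, value in enumerate(x):
        prev = lam * value + (1.0 - lam) * prev
        z[i] = prev
    t = np.arange(1, len(x) + 1)
    half = width * sigma * np.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * t)))
    return z, center - half, center + half, np.abs(z - center) > half

# Hypothetical residual series with in-control mean 0 and sigma 1.
spike = np.array([0.0] * 10 + [5.0] + [0.0] * 5)  # one large, sudden deviation
drift = np.full(30, 1.5)                          # small but persistent shift

_, _, spike_flags = shewhart_individuals(spike, center=0.0, sigma=1.0)
_, _, drift_flags_shewhart = shewhart_individuals(drift, center=0.0, sigma=1.0)
_, _, _, drift_flags_ewma = ewma_chart(drift, center=0.0, sigma=1.0)
```

In this toy setting the Shewhart chart flags the isolated spike but never reacts to the 1.5-sigma shift, while the EWMA chart accumulates the shift and signals after a few observations.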

To account for the presence of outliers, we compare standard linear regression models with a robust regression model. This comparison lets us assess how sensitive the monitoring is to both extreme and minor delays, and it allows the models to be estimated from data that contain outliers. This matters in practice because the amount of available data is limited, and obtaining clean data for model training is even more challenging.
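The effect of outliers on the two fits can be illustrated with a small sketch. The abstract does not say which robust estimator is used, so the Huber M-estimator below, fitted by iteratively reweighted least squares with a MAD-based scale, is an assumption chosen for illustration. The function names, the tuning constant 1.345, and the contaminated synthetic data are likewise hypothetical.

```python
import numpy as np

def ols_fit(x, y):
    """Ordinary least squares with intercept; returns [intercept, slopes...]."""
    design = np.column_stack([np.ones(len(y)), x])
    return np.linalg.lstsq(design, y, rcond=None)[0]

def huber_fit(x, y, c=1.345, max_iter=100, tol=1e-8):
    """Huber M-estimator via iteratively reweighted least squares (IRLS)."""
    design = np.column_stack([np.ones(len(y)), x])
    beta = np.linalg.lstsq(design, y, rcond=None)[0]  # start from OLS
    for _ in range(max_iter):
        r = y - design @ beta
        # Robust scale: normalized median absolute deviation of residuals.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = np.abs(r) / scale
        # Huber weights: 1 inside the threshold, c/u beyond it.
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(design * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Synthetic example: true slope 0.8, with five contaminated observations.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 0.8 * x + rng.normal(0.0, 1.0, 50)
y[-5:] += 30.0  # extreme delays acting as outliers

ols_beta = ols_fit(x, y)
huber_beta = huber_fit(x, y)
```

Because the Huber weights shrink the influence of the contaminated points, the robust slope stays closer to the generating value than the least-squares slope, which is pulled toward the outliers. Residuals from the robust fit therefore leave extreme delays more visible on a control chart.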

Preliminary results indicate that delays at a station are mainly correlated with delays at nearby stations, suggesting spatial dependence along the line. The control charts highlight observations that depart from the expected pattern of behavior. Such deviations are easily identifiable in the charts and can be used to support operational responses when unusual delay patterns occur.

To the best of our knowledge, the use of Shewhart and EWMA control charts for monitoring railway delays remains an open topic in the literature. The proposed framework enables the detection of both abrupt disruptions and more gradual changes in delay behavior over time.