Public transport planning is a complex process with a significant impact on the overall quality of the service. One of the key steps in this process is the design of timetables that take into account the available transportation resources, laws and regulations, and the specific constraints of a given city or region, while satisfying demand and ensuring a smooth experience for passengers. A major weakness of public transit, as perceived by passengers, is the time lost during transfers. This time can be significant: for example, it represents an average of 23% of travel time for multi-modal trips in the United Kingdom. Studies have shown that this transfer waiting time, which is inherent to the system, is poorly perceived by passengers.
In the operations research community, much work has been undertaken in the domain of schedule synchronization. The goal of this work is to coordinate the timetables of the different lines in order to minimize the waiting time at connections. Most operations research approaches focus on theoretical timetables and give the same importance to all connections at all times of the day. Observing the real usage of the system, however, makes it clear that some connections are more important than others in terms of the frequency and volume of passengers. This is why transportation authorities often incorporate their expert knowledge of the system, as well as rules of thumb drawn from experience, to design public transportation timetables. However, with the constant growth and sophistication of public transport, these approaches have reached their limit.
A natural alternative is to use the data that is generated by the transportation system to precisely quantify and model transfer waiting times, while accounting for the stochasticity of the system.
Among the first data-driven approaches to schedule optimization, some methods have used queries to an online trip planner to approximate the real usage of the system and build a two-stage stochastic linear program (LP). The idea behind this approach is to compute shifts of the schedules that minimize the expected waiting time across a number of scenarios. A recent extension of this approach uses real transit data and takes into account the fact that passengers optimize their transfers according to the schedules. This leads to a two-stage stochastic mixed-integer linear program (MILP).
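To make the idea of choosing schedule shifts against a set of scenarios concrete, the following is a minimal sketch on a single connection. It is not the LP or MILP formulation of the cited approaches: it replaces the solver with plain enumeration over a few candidate shifts, and the names (`waiting_time`, `expected_wait`, `best_shift`), the periodic-departure assumption, and the toy data are all hypothetical and introduced only for illustration.

```python
from typing import Iterable, Sequence, Tuple

# A scenario is (probability, arrival time of the feeder line in minutes).
Scenario = Tuple[float, int]


def waiting_time(arrival: int, departure: int, headway: int) -> int:
    """Wait until the first departure at or after `arrival`.

    Assumes (for illustration) that the connecting line departs
    periodically at departure + k * headway for every integer k.
    """
    return (departure - arrival) % headway


def expected_wait(shift: int, scenarios: Sequence[Scenario],
                  base_departure: int, headway: int) -> float:
    """Probability-weighted transfer wait when the connecting line
    is shifted by `shift` minutes."""
    return sum(p * waiting_time(a, base_departure + shift, headway)
               for p, a in scenarios)


def best_shift(scenarios: Sequence[Scenario], base_departure: int,
               headway: int, candidate_shifts: Iterable[int]) -> int:
    """Enumerate candidate shifts and keep the one with the lowest
    expected wait (a stand-in for solving the stochastic program)."""
    return min(candidate_shifts,
               key=lambda x: expected_wait(x, scenarios,
                                           base_departure, headway))


# Toy data (assumed): three arrival scenarios for the feeder line.
scenarios = [(0.5, 10), (0.3, 12), (0.2, 15)]
best = best_shift(scenarios, base_departure=9, headway=20,
                  candidate_shifts=range(0, 10))
print(best, expected_wait(best, scenarios, 9, 20))
```

On this toy data the selected shift is 6 minutes, which aligns the departure with the latest arrival scenario. Enumeration only works here because there is one decision variable; with many lines, connections, and scenarios the search space grows combinatorially, which is why the approaches described above resort to mathematical programming.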
While these approaches may provide finer and more realistic models, they inevitably lead to very large-scale problems. In one approach, only a sub-network was solved using an open-source MILP solver. Tackling an entire city or region with different transportation modes and a substantial number of scenarios is beyond the reach of standard optimization solvers.