Sequence modelling using deep learning approaches for spatiotemporal public transport data.

Conference: Bournemouth University, Faculty of Science and Technology

Abstract:

Encouraging the use of public transport is essential to combat congestion and pollution in urban environments. To achieve this, the reliability of public transport arrival time predictions should be improved, as passengers often request this. More reliable predictions will make urban bus networks more convenient for passengers and will thus play a crucial role in shifting traffic to public transport. Ultimately, this will alleviate pollution and congestion and reduce the substantial cost to society associated with the use of private cars. Here, the overarching objective was to investigate novel prediction methods and improve predictions for urban bus networks, with a focus on short-horizon predictions.

Estimated arrival time (ETA) predictions are unreliable due to the lack of good-quality historical data, while ‘live’ positions in mobile apps suffer from delays in data transmission. An assessment of the effect of different data quality regimes on the next-step prediction accuracy of Recurrent Neural Networks (RNNs) showed that, without data cleaning, model predictions can give false confidence if mean errors are used, highlighting the importance of a holistic assessment of the results. It was demonstrated that noisy data is a problem, and simple but effective approaches to address it are discussed. It became apparent that RNNs are exceptionally good at predicting stationary positions at either end of a journey. The maximum model improvement of the Sharpe ratio compared to noisy data was 4.71%. This provides insight into the value of addressing data quality issues in urban transport data to enable better predictions and improve the passenger experience.

Furthermore, different target representations were compared by encoding targets as unconstrained geographical coordinates, progress along a known trajectory, or ETA at the next two stops. The target representation was shown to affect prediction accuracy: constraining the prediction space reduced the prediction error from 244.8 to 142.3 m for the Long Short-Term Memory (LSTM) network. The error was reduced further when an ETA was predicted: estimating a distance from the ETA error gave errors of 4.5 and 14.5 m for the next two stops on the route.
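As an illustration of the "progress along a known trajectory" representation, the following is a minimal sketch that projects a position onto a route polyline and returns the distance travelled along it. It assumes planar coordinates in metres (a hypothetical local projection), and the function names are illustrative rather than taken from the thesis. The abstract does not state how an ETA error was converted into a distance; the helper below simply multiplies it by an assumed travel speed.

```python
import math


def progress_along_route(point, route):
    """Project a planar point onto a route polyline.

    `route` is a list of (x, y) vertices in metres (assumed local projection).
    Returns (distance_along_route_m, offset_from_route_m).
    """
    px, py = point
    best_offset_sq = float("inf")
    best_progress = 0.0
    travelled = 0.0  # length of all segments before the current one
    for (ax, ay), (bx, by) in zip(route, route[1:]):
        dx, dy = bx - ax, by - ay
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0.0:
            continue  # skip duplicated vertices
        seg_len = math.sqrt(seg_len_sq)
        # Fraction along the segment of the closest point, clamped to [0, 1].
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
        cx, cy = ax + t * dx, ay + t * dy
        offset_sq = (px - cx) ** 2 + (py - cy) ** 2
        if offset_sq < best_offset_sq:
            best_offset_sq = offset_sq
            best_progress = travelled + t * seg_len
        travelled += seg_len
    return best_progress, math.sqrt(best_offset_sq)


def eta_error_to_distance(eta_error_s, speed_m_per_s):
    """Assumed conversion: time error multiplied by a nominal speed."""
    return abs(eta_error_s) * speed_m_per_s
```

Predicting a single progress value instead of two free coordinates restricts predictions to positions on the route, which is one way to read "constraining the prediction space".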

Due to the observed lack of data quality, a method was developed for synthesising data, using a reference curve approach derived from very limited real-world data without a reliable ground truth. This approach allows the controlled introduction of artefacts and noise to simulate their impact on prediction accuracy. To illustrate these impacts, an RNN next-step prediction was used to compare different scenarios in two different UK cities. Two model architectures were compared: a Gated Recurrent Unit (GRU) and an LSTM model. Hybrid data were generated by mixing real-world and synthetic data. Compared to inference with a model trained purely on synthetic data, the error was reduced from 53.5 to 47.4 m for the LSTM and from 53.4 to 44.0 m for the GRU. The results show that realistic data synthesis is possible, allowing controlled testing of predictive algorithms.
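The idea of synthesising journeys from a reference curve with controlled noise and artefacts can be sketched as follows. This is not the method from the thesis: the reference curve here is a hypothetical smoothstep function rather than one derived from real-world data, and the parameter names (`noise_sd_m`, `drop_prob`) are assumptions chosen for illustration.

```python
import random


def reference_curve(t, duration_s, route_length_m):
    """Hypothetical reference curve: progress along the route (m) at time t (s).

    Uses a smoothstep shape so the vehicle starts and ends nearly stationary.
    """
    frac = min(max(t / duration_s, 0.0), 1.0)
    return route_length_m * frac * frac * (3.0 - 2.0 * frac)


def synthesise_journey(duration_s=600, step_s=10, route_length_m=3000.0,
                       noise_sd_m=0.0, drop_prob=0.0, seed=0):
    """Sample the reference curve and add controlled artefacts.

    Returns a list of (time_s, clean_progress_m, observed_progress_m).
    `noise_sd_m` adds Gaussian position noise; `drop_prob` removes samples
    to mimic failed data transmissions.
    """
    rng = random.Random(seed)  # seeded so scenarios are reproducible
    samples = []
    for t in range(0, duration_s + 1, step_s):
        if rng.random() < drop_prob:
            continue  # simulated missing observation
        clean = reference_curve(t, duration_s, route_length_m)
        observed = clean + rng.gauss(0.0, noise_sd_m)
        samples.append((t, clean, observed))
    return samples
```

Because the clean value is kept alongside each observed value, the error introduced by each artefact is known exactly, which is what real-world data without a reliable ground truth cannot offer.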

Urban traffic networks are interconnected systems that respond in complex ways to any disturbance. As urban buses operate in such networks and are influenced by traffic within this system, estimated arrival time (ETA) predictions can be challenging and are often inaccurate. To enable the use of network-wide data, a novel model architecture was developed. This attention-based predictor incorporated the states of other vehicles in the network, using gated recurrent units (GRUs) for each individual bus line to encode the vehicles' positions into their current state. By muting specific parts of the imputed information, their impact on prediction accuracy was estimated on a subset of the available data. The results showed that a network-based predictor outperforms models based on a single vehicle or all vehicles of a single line.

However, a model limited to vehicles of the same line ahead of the target was the best performing model, suggesting that the incorporation of additional data can have a negative impact on the prediction accuracy if it does not add any useful information. This could be caused by poor data quality, but also by a lack of interaction between the included lines and the target line. The technical aspects of this architecture are challenging and resulted in a very inefficient training procedure. It can be expected that if a more efficient training regime is developed or the model is trained for a longer time, usable predictive accuracy can be achieved.
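The muting experiment described above can be illustrated with a small sketch of attention over per-vehicle state vectors. The abstract does not specify the attention variant used, so scaled dot-product attention is assumed here as a stand-in; the vectors stand for already-encoded vehicle states (for example GRU outputs), and all names are hypothetical.

```python
import math


def attend(query, keys, values, muted=None):
    """Scaled dot-product attention with optional muting of inputs.

    `query` is the target vehicle's state; `keys` and `values` hold one
    vector per other vehicle. `muted[i] = True` removes vehicle i from
    the attention, so its effect on the output can be measured.
    Returns (weights, context_vector).
    """
    if muted is None:
        muted = [False] * len(keys)
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    active = [s for s, m in zip(scores, muted) if not m]
    if not active:
        raise ValueError("at least one vehicle must remain unmuted")
    top = max(active)  # subtract the maximum for numerical stability
    exps = [0.0 if m else math.exp(s - top) for s, m in zip(scores, muted)]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, context
```

Comparing the prediction error with and without a group of vehicles muted (for instance, all vehicles on other lines) gives an estimate of how much that group contributes, in the spirit of the comparison reported above.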

https://eprints.bournemouth.ac.uk/37673/

Source: Manual