4 juin 2025
Travel behaviour modellers have an increasingly diverse set of models at their disposal, ranging from traditional econometric structures to models from mathematical psychology and data-driven approaches from machine learning. A key question arises as to how well these different models perform in prediction, especially when considering trips of different characteristics from those used in estimation, i.e. out-of-distribution prediction, and whether better predictions can be obtained by combining insights from the different models. Across two case studies, we show that while data-driven approaches excel in predicting mode choice for trips within the distance bands used in estimation, beyond that range, the picture is fuzzy. To leverage the relative advantages of the different model families and capitalise on the notion that multiple `weak' models can result in more robust models, we put forward the use of a model averaging approach that allocates weights to different model families as a function of the \emph{distance} between the characteristics of the trip for which predictions are made, and those used in model estimation. Overall, we see that the model averaging approach gives larger weight to models with stronger behavioural or econometric underpinnings the more we move outside the interval of trip distances covered in estimation. Across both case studies, we show that our model averaging approach obtains improved performance both on the estimation and validation data, and crucially also when predicting mode choices for trips of distances outside the range used in estimation.