Towards replicable mode choice models for transport simulations in France
Sebastian Hörl
21 June 2023
ISTDM 2023
Introduction
- Reproducibility
- Low in transport modelling / simulation, especially with agent-based models
- Can increase acceptance, uptake and more widespread use of these models
- Increasingly available open data sources make reproducibility possible, but the processes are not standardized and rarely accessible as open source
- Our goal: A pipeline from raw data to a calibrated large-scale agent-based transport simulation that is nearly 100% replicable and yields reproducible results
Context
- Existing synthetic demand process
- Open-source pipeline
- Based on open data in France
- Reference implementation for Paris and Île-de-France region
- Adaptations for various places in France by multiple stakeholders
- Compatible with MATSim simulation
- Problem
- Choice model used in the simulation is not reproducible, hard-coded
- Based on a model for Zurich that has been recalibrated
- Challenge
- Use available data to estimate a new choice model for Île-de-France
- Generalizable to other use cases
- Use as much open data as possible
- The synthesis pipeline already processes, cleans and harmonizes many household travel survey (HTS) data sets in France
General process
- Formalize the data processing and estimation process
- Focus on open data sets vs. proprietary APIs
General process: Cleaning
- Harmonize French HTS data into the same format
- Enquête Globale de Transport (Île-de-France)
- Available upon request
- Paris / Île-de-France
- 2010/2011, new version coming up
- Enquête Nationale Transport et Déplacements (France)
- Open data
- All France
- 2008/2009
- Various semi-standardized surveys designed by CEREMA
- Enquête Déplacement Grand Territoire
- Some open data (Nantes, Lille, ...)
- Others available upon request
- Yielding connected households, persons, trips and legs
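A minimal sketch of what "connected households, persons, trips and legs" could look like as a data structure. All class names, field names and the `main_mode` rule are illustrative assumptions, not the schema used by the actual pipeline.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Leg:
    mode: str            # e.g. "walk", "bus", "metro", "car"
    distance_km: float


@dataclass
class Trip:
    purpose: str
    departure_time_s: int
    arrival_time_s: int
    legs: List[Leg] = field(default_factory=list)

    def main_mode(self) -> str:
        # Assumed rule: the leg covering the longest distance defines the trip mode
        return max(self.legs, key=lambda leg: leg.distance_km).mode


@dataclass
class Person:
    person_id: int
    age: int
    has_pt_subscription: bool
    trips: List[Trip] = field(default_factory=list)


@dataclass
class Household:
    household_id: int
    number_of_cars: int
    persons: List[Person] = field(default_factory=list)


# One household with one person who makes one trip consisting of two legs
trip = Trip("work", 8 * 3600, 8 * 3600 + 1800, [Leg("walk", 0.4), Leg("metro", 6.2)])
household = Household(1, 1, [Person(10, 34, True, [trip])])
```

Nesting the records this way keeps the link from each leg back to its household, so household and person attributes remain available when trip-level choices are modelled.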
General process: Spatialization
- Most HTS do not provide detailed trip origin and destination information (exception EGT)
- Possible to impute likely locations based on
- Euclidean distance between origins and destinations along the trip chain
- Identifiers of origin and destination zones
- Shapes of origin and destination zones
- Balac, M., Hörl, S., Schmid, B., 2022. Discrete choice modeling with anonymized data. Transportation.
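The imputation idea can be illustrated with a strongly simplified sketch for a single trip: draw candidate points inside the origin and destination zones and keep the pair whose Euclidean distance best matches the reported one. This is not the method of the cited paper. Zones are approximated by bounding boxes here, and all names and numbers are hypothetical.

```python
import math
import random


def sample_in_box(box, rng):
    # box = (xmin, ymin, xmax, ymax); stand-in for sampling inside a zone shape
    xmin, ymin, xmax, ymax = box
    return (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))


def impute_locations(origin_box, destination_box, reported_distance, rng, attempts=1000):
    # Keep the candidate pair whose Euclidean distance is closest to the reported one
    best_pair, best_error = None, math.inf
    for _ in range(attempts):
        origin = sample_in_box(origin_box, rng)
        destination = sample_in_box(destination_box, rng)
        error = abs(math.dist(origin, destination) - reported_distance)
        if error < best_error:
            best_pair, best_error = (origin, destination), error
    return best_pair, best_error


rng = random.Random(0)
origin_box = (0.0, 0.0, 1000.0, 1000.0)          # hypothetical origin zone, metres
destination_box = (3000.0, 0.0, 4000.0, 1000.0)  # hypothetical destination zone, metres
(origin, destination), error = impute_locations(origin_box, destination_box, 3200.0, rng)
```

For a full trip chain, the same matching would have to hold jointly for consecutive trips, since the destination of one trip is the origin of the next.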
General process: Road routing
- Plenty of APIs available (HERE, Bing, Google, TomTom, ...)
- Goal: Use open data and make process very easy to use
- Based on an OpenStreetMap dump (for instance, from Geofabrik)
- Based on osmnx library in Python
- Problem: Only speed limits are known
General process: Road routing
- We use open information from the TomTom Traffic Index to inflate OSM travel times to realistic ones
Source: TomTom Traffic Index Paris
- Uniform adjustment of the factors based on travel times in the HTS
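A rough sketch of the two steps above, under stated assumptions: free-flow travel times (from speed limits) are inflated by an hourly congestion level, and one uniform scaling factor on those levels is chosen so that routed times match the travel times reported in the HTS. The congestion values, the observations and the grid search are illustrative, not the values or the procedure from the talk.

```python
# Hypothetical congestion levels by hour: share of extra travel time over free flow
CONGESTION_BY_HOUR = {8: 0.60, 12: 0.30, 18: 0.65, 23: 0.05}


def congested_travel_time(freeflow_time_s, hour, scale=1.0):
    # Inflate the free-flow time; `scale` adjusts all congestion levels uniformly
    congestion = CONGESTION_BY_HOUR.get(hour, 0.0)
    return freeflow_time_s * (1.0 + scale * congestion)


def calibrate_scale(observations, candidates):
    # observations: (free-flow time [s], departure hour, reported HTS time [s])
    def total_error(scale):
        return sum(
            abs(congested_travel_time(freeflow, hour, scale) - reported)
            for freeflow, hour, reported in observations
        )

    # Simple grid search over candidate scaling factors
    return min(candidates, key=total_error)


observations = [(600.0, 8, 1032.0), (900.0, 12, 1224.0), (300.0, 18, 534.0)]
candidates = [i / 10 for i in range(0, 31)]
best_scale = calibrate_scale(observations, candidates)
```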
General process: Transit routing
- Same idea: Avoid the use of APIs, allow for local processing
- Based on GTFS data (usually available in France), sometimes from different periods
- Routing of the trips using the RAPTOR algorithm (standalone implementation in MATSim)
- Problem: How to choose the routing parameters?
General process: Transit routing
- For calibration, we only look at transit trips in the HTS
- We adjust the routing parameters:
- Utility of transfer
- Utility of travel time per mode (bus, tram ...)
- Using CMA-ES blackbox optimization
- Objective:
- Distribution of transfers (0, 1, 2, 3+)
- Mode share of transit modes
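The objective could be sketched as follows: for a given set of routing parameters, the HTS transit trips are routed, and the resulting transfer distribution and transit mode shares are compared with the observed ones. The sketch only shows the comparison. The L1 distance, the equal weighting of both terms and all names are assumptions. A black-box optimizer such as CMA-ES would call this function once per candidate parameter vector.

```python
from collections import Counter


def transfer_distribution(transfer_counts):
    # Share of trips with 0, 1, 2 and 3+ transfers
    bins = Counter(min(count, 3) for count in transfer_counts)
    total = len(transfer_counts)
    return [bins.get(k, 0) / total for k in range(4)]


def mode_shares(modes, categories):
    # Share of trips per transit mode, in the order given by `categories`
    counts = Counter(modes)
    total = len(modes)
    return [counts.get(category, 0) / total for category in categories]


def objective(routed_transfers, routed_modes, observed_transfers, observed_modes, categories):
    # Sum of absolute differences between routed and observed distributions
    transfer_error = sum(
        abs(r - o)
        for r, o in zip(transfer_distribution(routed_transfers),
                        transfer_distribution(observed_transfers))
    )
    mode_error = sum(
        abs(r - o)
        for r, o in zip(mode_shares(routed_modes, categories),
                        mode_shares(observed_modes, categories))
    )
    return transfer_error + mode_error


categories = ["bus", "rail", "metro"]
observed_transfers = [0, 0, 1, 1, 2, 4]
observed_modes = ["bus", "rail", "rail", "metro", "metro", "metro"]
routed_transfers = [0, 1, 1, 1, 2, 3]
routed_modes = ["bus", "bus", "rail", "metro", "metro", "metro"]
value = objective(routed_transfers, routed_modes, observed_transfers, observed_modes, categories)
```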
General process: Transit routing
- Optimization using CMA-ES
General process: Transit routing
- Fit of the distributions
- Baseline: Minimize travel time (-1 u/h and -1 u/transfer)
General process: Additional components
- Parking pressure based on open data
- Registered vehicles in zone divided by accessible road network
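As a small illustration, the indicator can be computed per zone as a simple ratio. Measuring the accessible road network by its length in kilometres is an assumption, and the zone values are made up.

```python
def parking_pressure(registered_vehicles, road_length_km):
    # Registered vehicles per kilometre of accessible road network (assumed unit)
    if road_length_km <= 0:
        raise ValueError("road_length_km must be positive")
    return registered_vehicles / road_length_km


# Hypothetical zones: (registered vehicles, accessible road length in km)
zones = {"A": (12000, 40.0), "B": (3000, 60.0)}
pressure = {zone: parking_pressure(vehicles, length) for zone, (vehicles, length) in zones.items()}
```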
General process: Cost structure
- Need to make hypotheses on the costs (for 2010)
- Car: 20 ct/km
- Parking: 3 EUR/h (Paris 2010, based on duration of following activity)
- Public transport: Per ticket or per duration
- Special case for Île-de-France / Paris
- For free if person has public transit subscription (person attribute)
- 1.80 EUR for a trip within Paris or using only bus or metro
- Otherwise, regression model for regional tickets (Abdelkader DIB, IFPen)
Distances: OP = Origin > Paris; DP = Destination > Paris; D = Direct
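The cost hypotheses above can be written down as a few functions. The rates come from the slide. The regional ticket price is passed in as a callable, because the regression model itself is not given here; the linear stand-in used in the example is purely hypothetical.

```python
def car_cost(distance_km):
    # 20 ct/km
    return 0.20 * distance_km


def parking_cost(following_activity_duration_h):
    # 3 EUR/h (Paris 2010 hypothesis), charged for the duration of the following activity
    return 3.0 * following_activity_duration_h


def transit_cost(has_subscription, within_paris, only_bus_or_metro,
                 regional_price_model, distance_km):
    # Free with a public transit subscription (person attribute)
    if has_subscription:
        return 0.0
    # Flat ticket for trips within Paris or using only bus or metro
    if within_paris or only_bus_or_metro:
        return 1.80
    # Otherwise fall back to a model for regional tickets
    return regional_price_model(distance_km)


def placeholder_regional_model(distance_km):
    # Hypothetical stand-in, NOT the regression model mentioned on the slide
    return 1.80 + 0.10 * distance_km


cost_subscriber = transit_cost(True, False, False, placeholder_regional_model, 25.0)
cost_inner = transit_cost(False, True, False, placeholder_regional_model, 4.0)
cost_regional = transit_cost(False, False, False, placeholder_regional_model, 25.0)
```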
Model structure
Model estimation
- Using Biogeme's Python interface
- 18 parameters
- R2 = 0.53
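For readers unfamiliar with the goodness-of-fit measures of discrete choice models, here is a plain-Python sketch of multinomial logit probabilities and McFadden's rho-squared. The slide does not say which R2 definition is reported, so reading it as a rho-squared relative to an equal-shares null model is an assumption. The utilities and choices are made up, and no Biogeme calls are used.

```python
import math


def mnl_probabilities(utilities):
    # Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j), shifted for numerical stability
    max_utility = max(utilities.values())
    exp_utilities = {mode: math.exp(u - max_utility) for mode, u in utilities.items()}
    total = sum(exp_utilities.values())
    return {mode: e / total for mode, e in exp_utilities.items()}


def log_likelihood(observations):
    # observations: list of (utilities per mode, chosen mode)
    return sum(math.log(mnl_probabilities(utilities)[chosen])
               for utilities, chosen in observations)


def rho_squared(ll_model, ll_null):
    # McFadden's rho-squared: 1 - LL(model) / LL(null)
    return 1.0 - ll_model / ll_null


observations = [
    ({"car": 1.0, "pt": 0.0, "walk": -1.0}, "car"),
    ({"car": -0.5, "pt": 0.5, "walk": -1.0}, "pt"),
]
# Null model: all utilities zero, i.e. equal choice probabilities
null_observations = [({mode: 0.0 for mode in u}, chosen) for u, chosen in observations]

rho2 = rho_squared(log_likelihood(observations), log_likelihood(null_observations))
```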
Simulation
- Model has been implemented in the simulation, so we have first results
- Currently, calibrating network parameters
Conclusion and outlook
- First prototype of the pipeline works for Île-de-France
- Currently packaging up the code and preparing a paper on the baseline case
- Model improvements
- Improve simplified travel time estimation
- Integrate walking and bicycle routing (but few data available in 2010)
- Include weather information and, more generally, enrich the model formulation
- Porting to other areas in France
- Path 1: Compare models for different areas, hopefully they are similar
- Path 2: Estimate a joint model for France with (ideally non-significant) regional dummies
Questions?
ISTDM 2023, Ispra, June 2023