Towards replicable mode choice models for transport simulations in France 

Sebastian Hörl

21 June 2023

ISTDM 2023

Introduction

  • Reproducibility
    • Low in transport modelling / simulation, especially with agent-based models
    • Can increase acceptance, uptake and more widespread use of these models
       
  • Increasingly available open data sources make reproducibility possible, but processes are either not standardized or not easily accessible as open source
     
  • Our goal: a pipeline from raw data to a calibrated large-scale agent-based transport simulation that is close to 100% replicable and yields reproducible results.

Context

  • Existing synthetic demand process
    • Open-source pipeline
    • Based on open data in France
    • Reference implementation for Paris and Île-de-France region
       
  • Adaptations for various places in France by multiple stakeholders
     
  • Compatible with MATSim simulation

Context

  • Problem
    • Choice model used in the simulation is hard-coded and not reproducible
    • Based on a model for Zurich that has been recalibrated
       
  • Challenge
    • Use available data to estimate a new choice model for Île-de-France
    • Generalizable to other use cases
    • Use as much open data as possible
       
  • Synthesis pipeline already processes, cleans, harmonizes many HTS data sets in France

General process

  • Formalize the data processing and estimation process
  • Focus on open data sets vs. proprietary APIs

General process: Cleaning

  • Harmonize French HTS data into the same format
    • Enquête Globale de Transport (Île-de-France)
      • Available upon request
      • Paris / Île-de-France
      • 2010/2011, new version coming up
    • Enquête Nationale Transport et Déplacements (France)
      • Open data
      • All France
      • 2008/2009
    • Various semi-standardized surveys designed by CEREMA
      • Enquête Déplacement Grand Territoire
      • Some open data (Nantes, Lille, ...)
      • Others available upon request
         
  • Yielding connected households, persons, trips and legs
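
The harmonized output could be represented roughly as follows. This is a minimal sketch: all class and field names are hypothetical and are not taken from the actual pipeline; they only illustrate the nesting of households, persons, trips and legs.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Leg:
    mode: str                  # e.g. "walk", "bus", "car" (hypothetical coding)
    travel_time_s: float


@dataclass
class Trip:
    trip_id: int
    origin_zone: str           # zone identifier as reported in the survey
    destination_zone: str
    euclidean_distance_m: float
    main_mode: str
    legs: List[Leg] = field(default_factory=list)


@dataclass
class Person:
    person_id: int
    has_pt_subscription: bool
    trips: List[Trip] = field(default_factory=list)


@dataclass
class Household:
    household_id: int
    persons: List[Person] = field(default_factory=list)


def count_trips(households: List[Household]) -> int:
    """Count all trips across all persons of all households."""
    return sum(len(p.trips) for h in households for p in h.persons)
```

Each survey (EGT, ENTD, CEREMA surveys) would then need its own reader that fills the same structure, so that all later steps are survey-independent.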

General process: Spatialization

  • Most HTS do not provide detailed trip origin and destination information (exception EGT)
     
  • Possible to impute likely locations based on
    • Euclidean distance between origins and destinations along the trip chain
    • Identifiers of origin and destination zones
    • Shapes of origin and destination zones
       
  • Balac, M., Hörl, S., Schmid, B., 2022. Discrete choice modeling with anonymized data. Transportation.
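
To illustrate the idea, here is a minimal sketch of imputing coordinates for a single trip by random sampling. It is not the algorithm of the cited paper: zones are simplified to bounding boxes instead of real zone shapes, only one trip is handled instead of a whole trip chain, and all function names are hypothetical.

```python
import math
import random


def sample_point(box, rng):
    """Draw a uniform point from a box given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))


def impute_locations(origin_box, destination_box, reported_distance, rng, attempts=10000):
    """Return the sampled (origin, destination) pair whose Euclidean
    distance is closest to the distance reported in the survey,
    together with the remaining absolute error."""
    best_pair = None
    best_error = math.inf
    for _ in range(attempts):
        origin = sample_point(origin_box, rng)
        destination = sample_point(destination_box, rng)
        error = abs(math.dist(origin, destination) - reported_distance)
        if error < best_error:
            best_pair, best_error = (origin, destination), error
    return best_pair, best_error
```

With real zone polygons, the uniform draw from a box would be replaced by a draw from the zone shape, and consecutive trips would share their intermediate location.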

General process: Road routing

  • Plenty of APIs available (HERE, Bing, Google, TomTom, ...)
     
  • Goal: Use open data and make process very easy to use
     
  • Based on an OpenStreetMap dump (for instance, from Geofabrik)
     
  • Based on osmnx library in Python
     
  • Problem: Only speed limits are known
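
The consequence of knowing only speed limits can be sketched as follows. The slides use osmnx; this stand-in uses only the standard library and a hand-written edge list (a hypothetical format) to show that routing on speed limits yields free-flow travel times.

```python
import heapq


def free_flow_time(length_m, speed_limit_kmh):
    """Travel time in seconds when driving exactly at the speed limit."""
    return length_m / (speed_limit_kmh / 3.6)


def shortest_travel_time(edges, source, target):
    """Dijkstra over directed edges given as
    (from_node, to_node, length_m, speed_limit_kmh).
    Returns the free-flow travel time in seconds, or None if unreachable."""
    graph = {}
    for u, v, length_m, speed in edges:
        graph.setdefault(u, []).append((v, free_flow_time(length_m, speed)))

    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        time, node = heapq.heappop(queue)
        if node == target:
            return time
        if time > best.get(node, float("inf")):
            continue  # outdated queue entry
        for neighbour, edge_time in graph.get(node, []):
            candidate = time + edge_time
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None
```

These times ignore congestion entirely, which is why they need to be inflated before they can be compared with reported travel times.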

General process: Road routing

  • We use open information from the TomTom Traffic Index to inflate OSM travel times to realistic ones

Source: TomTom Traffic Index Paris

  • Uniform adjustment of the factors based on travel times in the HTS
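
A minimal sketch of these two steps, under stated assumptions: congestion levels are given per departure hour as a fraction of extra travel time (the values in the usage below are made up, not taken from the TomTom Traffic Index), and the "uniform adjustment" is read as one multiplier for all inflated times, fitted by least squares against the travel times reported in the HTS. The slides do not specify the exact form.

```python
def inflate(free_flow_s, departure_hour, congestion_by_hour):
    """Inflate a free-flow time; a level of 0.5 means 50% extra travel time."""
    return free_flow_s * (1.0 + congestion_by_hour[departure_hour])


def fit_uniform_scale(trips, congestion_by_hour):
    """Least-squares multiplier s minimising sum (reported - s * inflated)^2.
    trips: iterable of (free_flow_s, departure_hour, reported_s)."""
    numerator = 0.0
    denominator = 0.0
    for free_flow_s, hour, reported_s in trips:
        predicted = inflate(free_flow_s, hour, congestion_by_hour)
        numerator += reported_s * predicted
        denominator += predicted * predicted
    return numerator / denominator
```

Usage with illustrative numbers:

```python
levels = {8: 0.5, 12: 0.25}                       # hypothetical congestion levels
trips = [(100.0, 8, 180.0), (200.0, 12, 300.0)]   # (free flow, hour, reported)
scale = fit_uniform_scale(trips, levels)
```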

General process: Transit routing

  • Same idea: Avoid the use of APIs, allow for local processing
     
  • Based on GTFS data (usually available in France), sometimes from different periods
     
  • Routing of the trips using the RAPTOR algorithm (standalone implementation in MATSim)
     
  • Problem: How to choose the routing parameters?

General process: Transit routing

  • For calibration, we only look at transit trips in the HTS
     
  • We adjust the routing parameters:
    • Utility of transfer
    • Utility of travel time per mode (bus, tram ...)
       
  • Using CMA-ES blackbox optimization
     
  • Objective:
    • Distribution of transfers (0, 1, 2, 3+)
    • Mode share of transit modes
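
The calibration objective can be sketched as below. Assumptions are made loudly here: a list of candidate routes per trip stands in for the RAPTOR router, the "mode" of a route is taken to be the mode with the longest in-vehicle time, and the error is a plain sum of squared share differences. None of this is confirmed by the slides, and all names are hypothetical.

```python
def route_utility(route, beta_transfer, beta_time):
    """Utility of a route: transfer penalty plus mode-specific time penalties.
    route: {"transfers": int, "hours_by_mode": {mode: hours}}"""
    utility = beta_transfer * route["transfers"]
    for mode, hours in route["hours_by_mode"].items():
        utility += beta_time[mode] * hours
    return utility


def calibration_objective(params, modes, trips, target_transfers, target_modes):
    """Squared error between simulated and observed shares.
    params: [beta_transfer, beta_time for each mode in `modes`]
    trips: list of candidate-route lists (stand-in for the router)
    target_transfers: observed shares for 0, 1, 2, 3+ transfers
    target_modes: observed share per transit mode"""
    beta_transfer = params[0]
    beta_time = dict(zip(modes, params[1:]))

    transfer_counts = [0, 0, 0, 0]
    mode_counts = {mode: 0 for mode in modes}
    for candidates in trips:
        chosen = max(candidates, key=lambda r: route_utility(r, beta_transfer, beta_time))
        transfer_counts[min(chosen["transfers"], 3)] += 1
        hours = chosen["hours_by_mode"]
        mode_counts[max(hours, key=hours.get)] += 1

    n = len(trips)
    error = sum((transfer_counts[k] / n - target_transfers[k]) ** 2 for k in range(4))
    error += sum((mode_counts[m] / n - target_modes[m]) ** 2 for m in modes)
    return error
```

A CMA-ES implementation (for instance the `cma` Python package) would then search over `params` to minimise this function, with the HTS transit trips providing the targets.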

General process: Transit routing

  • Optimization using CMA-ES

General process: Transit routing

  • Fit of the distributions
    • Baseline: Minimize travel time (-1 u/h and -1 u/transfer)

General process: Additional components

  • Parking pressure based on open data
    • Registered vehicles in zone divided by accessible road network
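
As a small sketch of this indicator, assuming the "accessible road network" is measured as road length in kilometres per zone (the slides do not give the unit) and with hypothetical names throughout:

```python
def parking_pressure(registered_vehicles, road_length_km):
    """Registered vehicles per kilometre of accessible road in a zone."""
    if road_length_km <= 0:
        raise ValueError("road length must be positive")
    return registered_vehicles / road_length_km


def parking_pressure_by_zone(vehicles_by_zone, road_km_by_zone):
    """Compute the indicator for every zone present in both inputs."""
    return {
        zone: parking_pressure(vehicles_by_zone[zone], road_km_by_zone[zone])
        for zone in vehicles_by_zone
        if zone in road_km_by_zone
    }
```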

General process: Cost structure

  • Need to make hypotheses on the costs (for 2010)
    • Car: 20 ct/km
    • Parking: 3 EUR/h (Paris 2010, based on duration of following activity)
    • Public transport: Per ticket or per duration
       
  • Special case for Île-de-France / Paris
    • For free if person has public transit subscription (person attribute)
    • 1.80 EUR for trip within Paris or only bus or metro
    • Otherwise, regression model for regional tickets (Abdelkader DIB, IFPen)

Distances: OP = origin to Paris; DP = destination to Paris; D = direct
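
The cost hypotheses above can be written down as a short sketch. The fixed values come from the slide; the regression model for regional tickets is not given there, so it is passed in as a caller-supplied function (`regional_ticket_price` is a hypothetical placeholder). Any restriction of the parking fee to particular zones is not modelled here.

```python
def car_cost(distance_km):
    """Car cost in EUR at 20 ct/km."""
    return 0.20 * distance_km


def parking_cost(following_activity_h):
    """Parking cost in EUR at 3 EUR/h, based on the following activity's duration."""
    return 3.0 * following_activity_h


def transit_cost(has_subscription, within_paris, only_bus_or_metro, regional_ticket_price):
    """Public transport cost in EUR for Île-de-France.
    regional_ticket_price: zero-argument callable standing in for the
    regression model, which is not specified in the slides."""
    if has_subscription:
        return 0.0
    if within_paris or only_bus_or_metro:
        return 1.80
    return regional_ticket_price()
```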

Model structure

Model estimation

  • Using Biogeme's Python interface
     
  • 18 parameters
  • R2 = 0.53
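
What an estimation tool such as Biogeme maximises can be sketched in plain Python, assuming a multinomial logit structure (the model structure is only shown graphically in the slides, so this is an assumption). The utilities in the usage example are made up and are not the estimated model.

```python
import math


def choice_probabilities(utilities):
    """Multinomial logit probabilities from a {mode: utility} mapping."""
    max_u = max(utilities.values())  # subtract the maximum for numerical stability
    exp_u = {mode: math.exp(u - max_u) for mode, u in utilities.items()}
    total = sum(exp_u.values())
    return {mode: value / total for mode, value in exp_u.items()}


def log_likelihood(observations):
    """Sum of log-probabilities of the chosen modes.
    observations: iterable of ({mode: utility}, chosen_mode)."""
    return sum(
        math.log(choice_probabilities(utilities)[chosen])
        for utilities, chosen in observations
    )
```

Usage with illustrative utilities:

```python
probabilities = choice_probabilities({"car": -1.2, "pt": -0.8, "walk": -2.5})
```

Estimation then amounts to finding the parameters inside the utilities that maximise `log_likelihood` over all observed trips.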

Simulation

  • Model has been implemented in the simulation, so we have first results
  • Currently, calibrating network parameters

Conclusion and outlook

  • First prototype of the pipeline works for Île-de-France
  • Currently packaging up the code and preparing a paper on the baseline case

  • Model improvements
    • Improve simplified travel time estimation
    • Integrate walking and bicycle routing (but few data available in 2010)
    • Include weather information and, more generally, enrich the model formulation
       
  • Porting to other areas in France
    • Path 1: Compare models for different areas, hopefully they are similar
    • Path 2: Estimate a joint model for France with (ideally non-significant) regional dummies

Questions?

ISTDM 2023, Ispra, June 2023