Synthetic population workshop

Introduction

Sebastian Hörl

8 November 2024

IRT SystemX

Today's workshop

  • Population synthesis using the eqasim toolkit is gaining interest
    • Transport (obviously)
    • Energy
    • Epidemics
    • ...
       
  • Various institutions and developers / researchers working with it
    • Which components can be contributed?
    • How can we make that easier?
    • What is everybody's roadmap?
    • What are opportunities for collaboration?
       
  • (Today's focus is not the Java / transport simulation part!)

Users & Implementations

Contributions

Repository

Commits

Repository

Recent CHANGELOG

Contributing

  1. Make a fork & clone it
git clone git@github.com:user/ile-de-france.git
  2. Make a branch and switch to it
git checkout -b my_changes
  3. Integrate your changes & commit & push
git add .
git commit -m "feat: new great feature"
git push origin my_changes
  4. Send a Pull Request (PR) on GitHub
  5. Continuous testing will check the code
  6. Maintainers will review the changes before merging them in

Agenda

Roadmap

Short-term

  • Update to new INSEE data sets (last update for 2019)
     
  • Update website to give better information about the project

http://www.eqasim.org

Mid-term

  • Currently, there is considerable confusion between the two major components of eqasim:
    • Demand synthesis pipeline
    • Standardized MATSim with DCM
     
  • We propose
    • Demand synthesis pipeline: transform into a specific project with focus on population synthesis
    • Standardized MATSim with DCM: stays eqasim as a simulation package
    • Contribute as much as possible back to matsim-libs
  • In the mid-term, a restructuring of the project is planned
     
  • Currently, we have multiple individual cases (+ Munich, see later) with a lot of code duplication, and the code is not very modular
     
  • Step 1: Establish a common library of demand synthesis algorithms and wrappers around existing code and repositories (via dependencies)
     
  • Step 2: Replace the custom pipeline with established tools such as Snakemake
rule step1:
    params:
        text="some text"
    output:
        "step1_output.txt"
    script:
        "step1.py"

rule step2:
    input:
        "step1_output.txt"
    output:
        "step2_output.txt"
    script:
        "step2.py"
snakemake --snakefile pipeline.smk --cores 1 step2_output.txt

  • However, Snakemake does not provide the required flexibility

Mid-term

Example: Brussels / Madrid

 

Pro

  • Shows nicely how we can distribute "recipes" of synthesis pipelines that can be reused
     

Contra

  • File-based system is quite constraining
  • Not easy to introduce parameters later on
  • Not easy to perform parametric runs

 

Custom rewrite and improvement of synpp seems like the better alternative

Mid-term

Implementation

  • Drafting and testing beginning of 2025
  • IRT SystemX / ETH Zurich


Projections by department

  • Previously, we had implemented functionality to use INSEE projection data
  • For each year, marginals of sex and age, France-wide
     
  • Implementation: Perform an Iterative Proportional Updating (IPU) on the households / persons of the census data (2019) to obtain weights that fit the selected age / sex marginals and totals
     
  • Works well country-wide (this is how we validated)
  • BUT: Questions arise regarding the local weighting per department: Are the resulting weights coherent?
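
The reweighting idea above can be illustrated with a minimal sketch of Iterative Proportional Updating on household weights. This is not the pipeline's actual implementation: the function name `ipu`, the incidence matrix layout, and all numbers are assumptions made up for illustration.

```python
import numpy as np

def ipu(incidence, targets, iterations=100, tolerance=1e-6):
    """Iterative Proportional Updating on household weights (illustrative sketch).

    incidence[h, c] counts the persons of household h that fall into category c,
    targets[c] is the marginal that the weighted sum of column c should reach.
    """
    incidence = np.asarray(incidence, dtype=float)
    targets = np.asarray(targets, dtype=float)
    weights = np.ones(incidence.shape[0])

    for _ in range(iterations):
        for c in range(incidence.shape[1]):
            current = weights @ incidence[:, c]
            if current > 0.0:
                # Scale only the households that contribute to this category
                mask = incidence[:, c] > 0
                weights[mask] *= targets[c] / current
        # Stop once all marginals are reproduced closely enough
        error = np.max(np.abs(weights @ incidence - targets) / targets)
        if error < tolerance:
            break
    return weights

# Made-up toy data: three household types, columns = [women, men]
incidence = np.array([
    [1, 0],  # single woman
    [0, 1],  # single man
    [1, 1],  # couple
])
targets = np.array([60.0, 50.0])  # hypothetical projected marginals

weights = ipu(incidence, targets)
fitted = weights @ incidence  # weighted person counts per category
```

Because the weights are attached to households, all persons of a household are scaled together, which keeps the household structure of the census intact while the person marginals are matched.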

Projections by department

  • Since 2024, INSEE has published population projections by age, sex, and department, which are better suited to our use case
     
  • Until 2070 for various scenarios (Central, Low growth, High migration, ...)

Projections by department

  • We have updated the pipeline to make use of the new data.
     
  • Same concept: Perform an IPU, but we can now do it per-department, no need to reweight the whole census.
     
  • Gives correct per-department totals
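
A rough sketch of the per-department idea is given here. For brevity it rakes person weights directly instead of running the full household-level IPU, which is a simplification. The column names (`department`, `sex`, `age`, `weight`), the function names, and all numbers are hypothetical.

```python
import pandas as pd

def rake(persons, marginals, iterations=50):
    """Rake person weights to marginals such as {"sex": {...}, "age": {...}}."""
    weights = persons["weight"].astype(float)
    for _ in range(iterations):
        for attribute, targets in marginals.items():
            # Current weighted total per category of this attribute
            current = weights.groupby(persons[attribute]).sum()
            # Scaling factor per person, looked up by the person's category
            factors = persons[attribute].map(pd.Series(targets) / current)
            weights = weights * factors
    return weights

def rake_by_department(persons, projections):
    """Fit each department independently against its own projection."""
    result = persons.copy()
    for department, marginals in projections.items():
        selection = persons[persons["department"] == department]
        result.loc[selection.index, "weight"] = rake(selection, marginals)
    return result

# Made-up toy census: one person per sex / age combination in two departments
persons = pd.DataFrame({
    "department": ["75"] * 4 + ["92"] * 4,
    "sex": ["f", "f", "m", "m"] * 2,
    "age": ["young", "old"] * 4,
    "weight": 1.0,
})

# Made-up projected marginals per department
projections = {
    "75": {"sex": {"f": 60.0, "m": 40.0}, "age": {"young": 70.0, "old": 30.0}},
    "92": {"sex": {"f": 30.0, "m": 20.0}, "age": {"young": 25.0, "old": 25.0}},
}

projected = rake_by_department(persons, projections)
totals = projected.groupby("department")["weight"].sum()
```

Since every department is fitted against its own marginals, the per-department totals follow directly from the projection data and no country-wide reweighting is needed.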

Projections by department

[Figure comparing the series RP2019, INSEE, and Synthetic]

Projections by department

  • PR is ready, doing some final tests

Munich implementation

  • Developed in the framework of the MINGA project at TU Munich
  • Much less data available than in France, often only highly aggregated

 

Idea

  • Start with the Île-de-France pipeline
     
  • Make use of the aliasing functionality of synpp to replace those data sets / pipeline parts that are not available in Germany
     
  • Similar approach followed for Cairo (see last year)
  • But this time, the resulting population is also open and replicable!

Munich implementation

Reminder: aliasing functionality

  • synpp allows you to replace existing stages by new ones. Whenever the original one is requested, the overriding one is provided.
     
  • This way you can override certain functionalities along the pipeline.
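
The aliasing concept can be sketched in a few lines of plain Python. This is a conceptual illustration only and does not use or reproduce the synpp API: the functions `resolve` and `run`, the stage names, and the data are all hypothetical.

```python
def resolve(name, aliases):
    """Follow alias entries until reaching a stage that is not overridden."""
    seen = set()
    while name in aliases:
        if name in seen:
            raise ValueError(f"Circular alias for stage {name}")
        seen.add(name)
        name = aliases[name]
    return name

def run(name, stages, aliases, cache=None):
    """Run a stage and its requirements, substituting aliased stages."""
    cache = {} if cache is None else cache
    actual = resolve(name, aliases)
    if actual not in cache:
        requires, execute = stages[actual]
        # Downstream stages keep asking for the original names
        inputs = {r: run(r, stages, aliases, cache) for r in requires}
        cache[actual] = execute(inputs)
    return cache[actual]

# Hypothetical stages: name -> (required stages, function producing the result)
stages = {
    "data.census.raw": ([], lambda inputs: "French RP records"),
    "munich.census.synthetic": ([], lambda inputs: "records built from German marginals"),
    "synthesis.population": (
        ["data.census.raw"],
        lambda inputs: "population from " + inputs["data.census.raw"],
    ),
}

# Whenever the original census stage is requested, the Munich one is provided
aliases = {"data.census.raw": "munich.census.synthetic"}

original = run("synthesis.population", stages, {})
overridden = run("synthesis.population", stages, aliases)
```

The downstream stage `synthesis.population` is identical in both runs; only the alias table decides which upstream stage feeds it.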

Munich implementation

Munich pipeline

  • No detailed census like the RP available in Germany
      > Used marginal data per age x sex x municipality to inject a "fake" RP
      > Iterative Proportional Fitting to fuse various marginal data sets
     
  • No detailed commuter matrix like MOBPRO available in Germany
      > Used marginals per "Bezirk" of employed people
      > Used a Euclidean-distance-based Gravity model to inject a "fake" MOBPRO
     
  • No detailed HTS accessible to us
      > Used activity chains from the ENTD (assuming that the structure of chains is similar)
     
  • No SIRENE / BPE for activity locations available
      > Used OpenStreetMap data filtered by tag (schools, restaurants, ...)
     
  • Several minor modifications
      > Overriding of car availability, pt subscription sampled from marginal data
     
  • All other pipeline elements run basically without modification!
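
As an illustration of the commuter matrix replacement, a Euclidean-distance-based gravity model could look roughly like the sketch below. The exact formulation used for Munich is not given here, so the origin-constrained form, the exponential deterrence function, the parameter `beta`, and all numbers are assumptions.

```python
import numpy as np

def gravity_flows(origins, destinations, coordinates, beta=0.1):
    """Origin-constrained gravity model with exponential distance deterrence."""
    origins = np.asarray(origins, dtype=float)
    destinations = np.asarray(destinations, dtype=float)
    coordinates = np.asarray(coordinates, dtype=float)

    # Pairwise Euclidean distances between zone centroids
    differences = coordinates[:, None, :] - coordinates[None, :, :]
    distances = np.sqrt((differences ** 2).sum(axis=-1))

    # Attraction of each destination, discounted by distance from each origin
    attraction = destinations[None, :] * np.exp(-beta * distances)

    # Normalize per origin so that each row is a probability distribution
    probabilities = attraction / attraction.sum(axis=1, keepdims=True)

    # Distribute the employed residents of each origin over the destinations
    return origins[:, None] * probabilities

# Made-up example with three zones (coordinates in km)
coordinates = [[0.0, 0.0], [5.0, 0.0], [0.0, 12.0]]
employed_residents = [100.0, 50.0, 30.0]  # marginal per origin zone
workplaces = [120.0, 40.0, 20.0]          # marginal per destination zone

flows = gravity_flows(employed_residents, workplaces, coordinates)
```

The resulting matrix has the same shape as an origin-destination commuter table and can therefore stand in for it in downstream stages, while `beta` would need to be calibrated against observed trip distances.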

Munich implementation

Validation

  • A report on the relevant Household Travel Survey from 2011 exists (MiD)
  • It provides various high-level statistics (mode shares, median distances, ...)
     
  • We can use those to compare the generated population and the simulation results
     
  • Note: Simulation required additional steps (calibrating ASCs of the choice model, adjusting capacities in the network, ...)
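
A comparison of this kind can be sketched with pandas as follows. The trip table, the column names, and in particular the reference shares are placeholder values made up for illustration; they are not MiD results.

```python
import pandas as pd

def summarize(trips):
    """Compute mode shares and median distances per mode from a trip table."""
    mode_share = trips["mode"].value_counts(normalize=True)
    median_distance = trips.groupby("mode")["distance_km"].median()
    return mode_share, median_distance

# Made-up simulated trips
trips = pd.DataFrame({
    "mode": ["car", "car", "pt", "walk", "car", "pt", "bike", "walk"],
    "distance_km": [12.0, 8.0, 6.0, 1.0, 20.0, 9.0, 3.0, 0.5],
})

# Placeholder reference shares (NOT taken from the MiD report)
reference_share = pd.Series({"car": 0.40, "pt": 0.20, "bike": 0.15, "walk": 0.25})

mode_share, median_distance = summarize(trips)

# Side-by-side comparison of simulated and reference mode shares
comparison = pd.DataFrame({"simulated": mode_share, "reference": reference_share})
comparison["difference"] = comparison["simulated"] - comparison["reference"]
```

The same table layout works for any high-level statistic that a survey report publishes, so the comparison does not require access to the survey microdata.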

Munich implementation

Validation (figures)

  • Home location density
  • Work activity density
  • Work trips
  • Comparison with MiD
  • Comparison of transit flows
  • Comparison of car travel times: 1000 random car trips across the population, based on the TomTom routing API

Munich implementation

Next steps

  • Some last calibration
     
  • Demonstration of some use cases (see policy PR in eqasim-java)
     
  • Publication as open source!
