Large-scale models and simulation methods for transportation
Sebastian Hörl
6 September 2023
Spring 2024
Université Gustave Eiffel
IRT SystemX
- Research foundation situated in Paris (Saclay Campus)
- Focus on fostering digital transition in a range of fields from transport, health, cybersecurity to circular economy
- Transferring research results and tools into active application by development and provision of industry platforms
- Various collaborative projects with multiple French companies (Renault, SNCF, ...) and academic partners (Université Paris Saclay, CentraleSupélec, Université Gustave Eiffel)
- Participation in European projects
Context
- We are facing climate change and transport is one of the largest contributors
- However, we need transport to
- Move people
- Move goods
- Various options:
- Move less!
- Reduce impact (electric, hydrogen, ...)
- Become more efficient
Context
- Traditionally, transport planning asked where to build new highways or metro lines
- Nowadays, the question becomes how to use our infrastructure more efficiently, enabled by digital technology and connectivity
- On-demand transport may be a solution
- Avoid cars standing around 90% of the day
- Make efficient use of resources
- Provide access to formerly decoupled areas
- Replace half-utilized bus lines
- ...
- To understand these systems, we need highly dynamic models and simulations; fortunately, data is becoming more and more available!
Goals of this course
- Goal I: Learn where to find and how to work with large-scale open data in France
- Goal II: Familiarize yourself with a range of data processing tools from data science, modelling, visualization and mapping
- Goal III: Get to know modern research and planning tools from agent-based simulation
- Goal IV: Gain basic knowledge in transport planning and how to set up transport models, from raw data to the final model
Topics and tools
- To follow along in the exercises and in the final project, you'll need to install a couple of tools
- A working Python environment with Jupyter (ideally conda-based)
- with the following packages
Topics and tools
- Later on in the course you will need to install some additional tools for Java
- so that we can run
- A working Java IDE like one of the following: IntelliJ, VSCode
Topics and tools
- Finally, for mapping purposes, you may install QGIS
Data
- We will work with various data sets in the exercises
- Some of them are rather large, so it will make sense to download them beforehand!
- Links can be found in the
Agenda and course structure
- 18 January: 10h45 CM Trip generation & distribution; 14h00 TD Working with transport data
- 19 January: 14h00 CM Mode choice & Traffic assignment; 16h15 TD Modeling with transport data
- 23 January: 14h00 CM Synthetic populations and demand; 16h15 TD Working with synthetic demand data
- 25 January: 14h00 CM Agent-based transport simulation; 16h15 TD Working with MATSim
- 8 February: 14h00 CM On-demand service simulation; 16h15 TD On-demand service simulation
- 31 March: Submission of course project
- The exercises are structured so you can code along
- Make sure to attend with your personal computer or in groups with at least one machine
- The course material can be found online
Agenda and course structure
- 18 January: 10h45 CM Trip generation & distribution; 14h00 TD Working with transport data
- 19 January: 14h00 CM Mode choice & Traffic assignment; 16h15 TD Modeling with transport data
- 23 January: 14h00 CM Synthetic populations and demand; 16h15 TD Working with synthetic demand data
- 25 January: 14h00 CM Agent-based transport simulation; 16h15 TD Working with MATSim
- 8 February: 14h00 CM Q&A; 16h15 TD Q&A
- 31 March: Submission of course project
Course project
- The goal of the course project is to show how to set up a (simplified) transport model from scratch, starting with raw data
- You will work on a territory of your choice to set up your own model
- Exploring the territory through a basic data analysis
- Generating the travel demand for the study area
- Performing an agent-based simulation of an on-demand mobility service
- All instructions can be found at the link on the right
- You may start working on your project right after the first TD and you will obtain the knowledge for each exercise on the way through the course
- A report has to be handed in by 31 March 2024
- You may work in groups of up to 4 persons
1.1 Four-step model
Four-step model
- Goal: Analyse how changes in demand, habits, offers, and infrastructure impact the use of the transport system
- Model in four steps that answer four questions:
- Where do travellers come from?
- Where do they go to?
- How do they perform these trips (transport modes)?
- How heavily are services and infrastructure utilized?
Trip generation
Trip distribution
Mode choice
Traffic assignment
Four-step model
- Question: Where do travellers come from?
- Goal: Obtain the number of originating trips for each zone defined in the study area
- The number of generated trips may depend on the total population, the age distribution in a zone, the distribution of job types, ...
- Usually, the focus is on the morning peak hour
- "How many people go to work at around 8am?"
- Sometimes also evening peak, midday off-peak, evening off-peak
- Modelling task: Set up a model that yields the number of originating trips at a certain time given specific inputs per origin zone
Four-step model
- Question: Where do the generated travellers go to?
- Goal: Obtain a matrix of movements from all zones (origins) to all other zones (destinations)
- Flows between two zones may be affected by
- how well the two zones are connected
- how attractive it is to go to another zone
- For instance, the amount and quality of employment in a zone often determines whether the zone is attractive for commuters
- Modelling task: Set up a model that takes into account the characteristics of an origin zone and a destination zone and how well they are connected, and that yields the expected flow between these zones.
Four-step model
- Question: How do people go from one zone to another?
- Goal: Understand how people decide which mode of transport (and which route) they choose to go from A to B
- Mode choice is heavily impacted by
- travel time
- monetary cost
- number of transfers (public transport)
- waiting time (public transport, on-demand transport)
- ...
- Modelling task: Set up a model which, given a set of alternative ways of going from A to B with their characteristics, yields the probability of each alternative being chosen.
Four-step model
- Question: How is the infrastructure impacted by travel decisions?
- Goal: Find out how many cars make use of the road network, how many travellers use the public transport services, and how much time it takes to travel
- The last modelling step helps to understand if changes in generation, distribution or mode choice lead to saturation of the system
- Modelling task: Set up a model that determines which roads and transit lines will be used by the travellers and yields the travel times
Four-step model
- It is possible to run the model in an iterative way:
- Accessibility of a zone may impact how likely people are to live there (generation)
- Travel times between two zones may impact how likely people are to work in a specific area (distribution)
- Travel times between two zones may impact which modes of transport people use to move between them (mode choice)
- In such a setting, the model is run in a feedback loop until key indicators (travel times, flows, ...) stabilize; demand and supply are then in equilibrium
1.2 Trip generation
Trip generation
- Modelling task: Set up a model that yields the number of originating trips at a certain time given specific inputs per origin zone
- The model takes the characteristics of zone i as input and yields the generated trips for zone i
Trip generation
- Example: Growth factor model
- More complex models (ML, DL, ...) exist
- First approach: The generated trips for zone i are obtained from the population in zone i and a growth factor
- How do we estimate the parameter?
- Reference value
- Linear regression (ordinary least squares)
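The two estimation options can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the slides do not show the formula, so the model is taken to be trips_i = g * population_i, the "reference value" is read as the ratio of the regional totals, and the zone data below is purely hypothetical.

```python
import numpy as np

# Hypothetical zone-level data (illustrative values, not taken from INSEE)
population = np.array([1000.0, 2500.0, 400.0, 8000.0])
commuters = np.array([450.0, 1100.0, 170.0, 3600.0])

def growth_factor_reference(population, commuters):
    """Reference value (assumed reading): ratio of the regional totals."""
    return commuters.sum() / population.sum()

def growth_factor_ols(population, commuters):
    """Ordinary least squares for trips_i = g * population_i (no intercept)."""
    return float(population @ commuters / (population @ population))

g_ref = growth_factor_reference(population, commuters)
g_ols = growth_factor_ols(population, commuters)

# Model results per zone, using the regression estimate
predicted = g_ols * population
```

If the two totals shown on the following slides (12,262,544 and 5,420,092) are read as regional population and commuters, the reference value would be roughly 0.44.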
Trip generation
- Let's test our model!
- Population in Île-de-France by municipality: 12,262,544 (Source: INSEE RP)
- Commuters in Île-de-France: 5,420,092 (Source: INSEE MOBPRO)
- Figure panels: Model results, Difference
- Are we happy with this model?
Trip generation
- Next try: Use the population by CSP in each zone, with a growth factor by CSP
- CSP = Catégorie socio-professionnelle (socio-professional category)
- The socio-professional category is a common statistical classification in France with eight categories; it is used to perform analyses based on different employment levels
- Data: Population by socio-professional category in Île-de-France (Source: INSEE)
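A model with one growth factor per CSP can be fitted with the same least-squares idea. The sketch below assumes that the generated trips are the sum over categories of population times growth factor (the slides only show the labels, not the formula), uses three hypothetical categories instead of eight for brevity, and fabricates toy commuter counts from known factors so the fit can be checked.

```python
import numpy as np

# Hypothetical population by CSP (rows: zones, columns: CSP groups)
population_by_csp = np.array([
    [120.0, 300.0, 200.0],
    [ 80.0, 150.0, 400.0],
    [300.0, 100.0,  50.0],
    [ 60.0, 500.0, 250.0],
    [200.0, 200.0, 200.0],
])

# Toy "observed" commuters, fabricated from known factors for illustration only
true_factors = np.array([0.9, 0.7, 0.6])
commuters = population_by_csp @ true_factors

# Ordinary least squares: one growth factor per CSP, no intercept
factors, *_ = np.linalg.lstsq(population_by_csp, commuters, rcond=None)

# Model results per zone
predicted = population_by_csp @ factors
```

With real data the fit would not be exact, and the residuals per zone are what the comparison maps visualize.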
Trip generation
- Let's test our model!
- Figure panels: Intellectual professions (CSP 3), Employees (CSP 5), Workers (CSP 6)
- Figure panels: Commuters in Île-de-France (Source: INSEE MOBPRO), Simple model, CSP model
Trip generation
- We can now, for instance, apply the model to a smaller zoning system if we know the number of people in each CSP living there
Trip generation
- Note on trip attraction models
- The models presented here can also be used to estimate the number of trips arriving in a zone.
1.3 Trip distribution
Trip distribution
- Modelling task: Set up a model that takes into account characteristics of an origin zone, a destination zone, how well they are connected and which yields the expected flow between these zones.
- Flow models are mainly concerned with large flows, for instance, commuters going to work in the peak hour
- The model takes the origin characteristics and the destination characteristics as input and yields the flow between the two zones
Trip distribution
- We can imagine F as a matrix where rows indicate the origins and columns the destinations
- Row sums are the outflows of the origin zones
- Column sums are the inflows of the destination zones
- Let's define the outflow of each origin as its row sum and the inflow of each destination as its column sum
- We may use a trip generation model to generate the row sums, and a trip attraction model to generate the column sums
Trip distribution
- Example: Zonal flows in Île-de-France
- 1,287 municipalities in total
- 1,656,369 combinations
- 123,787 combinations are available (7.4%)
Source: INSEE MOBPRO
Figure panels: Paris 13e, Alfortville, Melun
Trip distribution
- Gravity model: The most commonly used model is the gravity model, in which the flow between two zones is the product of a production term, an attraction term, and a friction (resistance) term
- Production term: Weighs how many trips are produced by a zone
- Attraction term: Weighs how attractive a zone is as a destination
- Friction term: Quantifies how hard it is to get from one zone to another (road, transit, natural obstacles, ...)
Trip distribution
- The friction term of the gravity model is often estimated stand-alone and upfront
- Simple approach: Friction depends on the distance between two zones
- Observation: The probability of observing a commute between two municipalities in Île-de-France decreases exponentially with the distance between these municipalities
- More complex friction terms are possible and are widely used
- Travel time
- Monetary cost
- Others
Trip distribution
- The use of the gravity model depends on which data is available. The single-constrained gravity model assumes that we have reference data for the outflow of certain zones.
- Combining the gravity model with the definition of the outflow (the row sum of F)
- gives the outflow expressed through P, A and the friction term
- and, solved for P, a fixed expression for the production term
- This fixed expression for P can be inserted in the main model from above.
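The derivation can be written out as follows. The slide equations are not part of the extracted text, so this is a reconstruction under assumptions: F, P and A are the symbols used in the text, while O_i for the outflow and f_{ij} for the friction term are names chosen here for illustration.

```latex
\begin{align}
F_{ij} &= P_i \, A_j \, f_{ij}
  && \text{gravity model} \\
O_i &= \sum_j F_{ij} = P_i \sum_j A_j \, f_{ij}
  && \text{outflow constraint (row sum)} \\
P_i &= \frac{O_i}{\sum_j A_j \, f_{ij}}
  && \text{solved for the production term} \\
F_{ij} &= O_i \, \frac{A_j \, f_{ij}}{\sum_{j'} A_{j'} \, f_{ij'}}
  && \text{single-constrained gravity model}
\end{align}
```

Summing the last line over j gives O_i for any choice of A, which is why the origin flows are reproduced by design.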
Trip distribution
- Single-constrained gravity model
- Any choice for A will now, by design of the model, yield the correct origin flows
- Simple example: The attraction of a zone depends on the number of employees working in that zone (expressed through the employment in zone j and a model parameter)
- We use the friction model as defined before
Trip distribution
- Single-constrained gravity model
- Given some reference flows on the territory, we may now fit the parameter
- Alternatively, we may have some of the destination flows as data to fit (used in the following example)
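A single-constrained gravity model can be sketched as below. The functional forms are assumptions, since the slides do not show them: friction is taken as an exponential decay with distance (decay rate 0.2 per km, hypothetical), and attraction as employment raised to a hypothetical exponent `alpha`. All numbers are toy values.

```python
import numpy as np

def single_constrained_gravity(outflows, attraction, friction):
    """Distribute each origin's outflow over destinations in proportion to A_j * f_ij."""
    weights = attraction[np.newaxis, :] * friction
    return outflows[:, np.newaxis] * weights / weights.sum(axis=1, keepdims=True)

# Hypothetical distances between three zones in km
distances = np.array([
    [1.0, 5.0, 10.0],
    [5.0, 1.0, 4.0],
    [10.0, 4.0, 1.0],
])
friction = np.exp(-0.2 * distances)      # assumed exponential distance decay

employment = np.array([500.0, 200.0, 1000.0])
alpha = 1.0                              # hypothetical model parameter to be fitted
attraction = employment ** alpha         # assumed attraction form

outflows = np.array([300.0, 100.0, 400.0])
flows = single_constrained_gravity(outflows, attraction, friction)
```

Fitting would then mean varying `alpha` until the modelled flows (or their column sums) match the reference data as closely as possible.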
Trip distribution
- Example: Île-de-France
- Figure: Data vs. model for Alfortville
Trip distribution
- Sometimes, we may also know the destination flows
- The inflow constraint can be integrated analogously to the origins
- This leads to the double-constrained gravity model
- The values of A and P are fully determined by the observed origin and destination flows. By design, the model produces flow matrices F that have the correct row and column sums.
- The values of A and P are obtained by evaluating the two expressions for P and A alternately until the values stabilize.
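The iterative balancing can be sketched as follows. This is a minimal sketch, assuming the product form F_ij = P_i * A_j * f_ij and total outflow equal to total inflow; the function name, the stopping rule and the toy numbers are illustrative choices, not taken from the slides.

```python
import numpy as np

def double_constrained_gravity(outflows, inflows, friction,
                               iterations=1000, tolerance=1e-10):
    """Alternate between the expressions for P and A until both stabilize."""
    production = np.ones_like(outflows)
    attraction = np.ones_like(inflows)
    for _ in range(iterations):
        # P_i = O_i / sum_j A_j f_ij
        new_production = outflows / (friction @ attraction)
        # A_j = D_j / sum_i P_i f_ij
        new_attraction = inflows / (friction.T @ new_production)
        change = max(np.abs(new_production - production).max(),
                     np.abs(new_attraction - attraction).max())
        production, attraction = new_production, new_attraction
        if change < tolerance:
            break
    return production[:, np.newaxis] * attraction[np.newaxis, :] * friction

# Toy data: three zones, total outflow equals total inflow (800)
distances = np.array([
    [1.0, 5.0, 10.0],
    [5.0, 1.0, 4.0],
    [10.0, 4.0, 1.0],
])
friction = np.exp(-0.2 * distances)      # assumed exponential distance decay
outflows = np.array([300.0, 100.0, 400.0])
inflows = np.array([250.0, 150.0, 400.0])

balanced_flows = double_constrained_gravity(outflows, inflows, friction)
```

The resulting matrix reproduces both the row sums and the column sums, up to the numerical tolerance.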
Trip distribution
- Example: Île-de-France
1.4 Mode choice
Mode choice
- Modelling task: Set up a model which, given a set of alternative ways of going from A to B with their characteristics, yields the probability of each alternative being chosen.
- The model takes the characteristics of alternative k as input and yields the probability of choosing k
Mode choice
- Which option will I choose?
- Which option do people statistically choose?
Mode choice
Source: Felix Becker, Institute for Transport Planning and Systems, ETH Zurich.
- A common source for choice models is survey data
- They are performed in a pivot design: Each person gets to view various combinations of values.
- Difficulty I: How to design the survey?
- How many questions are too many?
- How to cover the whole range of potential values?
- Difficulty II: Respondents
- How to ensure that the survey is representative?
- How to make sure the relevant user groups are covered?
- Difficulty III: Misbehaviour
- What if somebody always selects random options?
- What if somebody always selects the first option?
- What if answers are missing?
Mode choice
- An alternative to Stated Preference (SP) data is Revealed Preference (RP) data
- In Revealed Preference experiments, the actual choices of the persons are tracked
- What do we observe?
- SP: Hypothetical choices of the respondents
- RP: Actual choices of the respondents
- What is known?
- SP: Full knowledge about all choices
- RP: Only knowledge about the taken choice, all alternatives need to be reconstructed!
Mode choice
- Alternative sources are data sets in which people have been asked about the trips they did during one day
- Alternatives to the chosen option need to be generated a posteriori
- Île-de-France
- Enquête Globale de Transport (EGT)
- (2010, on request); (2015, not yet published)
- Nantes, Lyon, Lille, ...
- Enquête Déplacements Grand Territoire (EDGT)
- Available for various years depending on city
- Sometimes publicly accessible as open data (Nantes, Lille)
- France
- Enquête Nationale Transports et Déplacements (ENTD)
- From 2008, available as open data
Mode choice
- A common mode choice model is the multinomial logit model
- The idea is that the utility of an alternative can be quantified: the systematic utility of alternative k for decision-maker i combines the value of each variable q with the influence weight of variable q
- In general, we have K alternatives, Q variables (or inputs, like travel time, monetary costs, number of transfers, ...), and N decision-makers (observations)
- Systematic utilities are also called generalized costs
- A rational decision-maker (homo oeconomicus) would then choose the alternative that has the highest utility!
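In formulas, the systematic utility and the rational choice can be written as below. This is a reconstruction under assumptions: v and y are the symbols used later in the text, while \beta_q for the influence weights and x_{ikq} for the variable values are names chosen here; the linear form is the one used in the numerical example that follows.

```latex
\begin{align}
v_{ik} &= \sum_{q=1}^{Q} \beta_q \, x_{ikq}
  && \text{systematic utility of alternative } k \text{ for decision-maker } i \\
y_i &= \operatorname*{arg\,max}_{k \in \{1, \dots, K\}} v_{ik}
  && \text{chosen alternative}
\end{align}
```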
Mode choice
- Example: Choice between two public transport connections
- Influence weights: -0.6 and -1.0
- Connection A: -0.6 * 20 - 1.0 * 1 = -13
- Connection B: -0.6 * 30 - 1.0 * 0 = -18
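The same calculation in Python, as a small sketch. It assumes that the first variable is the travel time in minutes and the second the number of transfers; the slides only show the numbers, so this reading is an assumption.

```python
import numpy as np

# Influence weights: assumed to be (per minute of travel time, per transfer)
weights = np.array([-0.6, -1.0])

# Characteristics of each connection: (travel time in minutes, number of transfers)
connections = {
    "A": np.array([20.0, 1.0]),
    "B": np.array([30.0, 0.0]),
}

# Systematic utility of each connection: weighted sum of its characteristics
utilities = {name: float(weights @ values) for name, values in connections.items()}

# A rational decision-maker picks the alternative with the highest utility
chosen = max(utilities, key=utilities.get)
```

Under these weights, the shorter travel time of connection A outweighs its transfer, so A has the higher utility.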
Mode choice
- In vector form, the utility maximization model can be written in terms of the parameters we want to find, the characteristics of all alternatives, the systematic utility per alternative, and the actual choice taken by the person
- To find the correct parameters, we would need:
- N observations of decisions taken
- Each observation states a chosen alternative
- Each observation states the respective choice characteristics
- The task is then: Find the parameters such that, for every observation, the alternative with the highest systematic utility is the actual choice taken by the person
- No exact solution can exist!
Mode choice
- Idea: No exact solution exists, because human decisions are partly random. What if we introduce randomness into our model?
- Based on our systematic utility, we introduce a random utility by adding an independent random error term over all alternatives for all decision-makers
- A straightforward choice would be to use a Normal distribution
- Mathematically, we still select the best alternative, but is this better?
- In any case, we call this a Random Utility Model (RUM)
Mode choice
- Is it better? Yes, but we need a special case!
- A straightforward expression can be obtained if we use an Extreme Value Distribution like the Gumbel Distribution
- Then it has been shown that there is a correspondence between the random utility model and a closed-form expression for the choice probabilities:
(Lots of math)
[Daniel McFadden in the 70s]
Mode choice
- With this finding, discrete choice modelling was revolutionized in the 1970s
- The expression on the right is now a closed-form expression of the probability of choosing alternative y
- The model is called the Multinomial Logit Model. It is the most commonly used approach for choice modelling today (with a large variety of extensions and versions).
- Why has it been so impactful? Because it allows us to estimate the model parameters that are hidden in the systematic utility v in the equation above.
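The closed-form choice probability of the multinomial logit model is the exponential of each systematic utility divided by the sum over all alternatives. A minimal sketch, assuming Gumbel errors with unit scale; the function name is illustrative and the utilities are those of the two-connection example, with connection B computed as -0.6 * 30 = -18.

```python
import numpy as np

def logit_probabilities(utilities):
    """Multinomial logit: P(k) = exp(v_k) / sum_j exp(v_j)."""
    # Subtracting the maximum does not change the result but avoids overflow
    shifted = utilities - np.max(utilities, axis=-1, keepdims=True)
    exp_u = np.exp(shifted)
    return exp_u / exp_u.sum(axis=-1, keepdims=True)

# Systematic utilities of connections A and B
probabilities = logit_probabilities(np.array([-13.0, -18.0]))
```

Unlike the pure utility maximization model, the worse alternative keeps a small but non-zero probability of being chosen.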
Mode choice
- Having a closed-form expression allows us to perform a Maximum Likelihood Estimation (MLE) of the model parameters. For that we set up the likelihood function of the model over all observations $i$ with choice $y_i$ and characteristics $X_i$: $L(\beta) = \prod_i P(y_i \mid X_i, \beta)$
- It answers: How well does the model (with given parameters) explain the choices?
- The maximum likelihood estimate for the model parameters is then: $\hat{\beta} = \arg\max_\beta L(\beta)$
- It can be found using standard methods such as Gradient Descent or Newton-Raphson
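A minimal sketch of such an estimation, assuming a utility with only two parameters (travel time and cost) and synthetic choices generated from known "true" parameters. All names and values are hypothetical. A Newton-Raphson step with step halving is used; real studies would rely on a dedicated estimation package.

```python
import math
import random

def utilities(beta, alternatives):
    # Each alternative is a (travel_time_min, cost_eur) tuple
    return [beta[0] * t + beta[1] * c for t, c in alternatives]

def probabilities(beta, alternatives):
    v = utilities(beta, alternatives)
    v_max = max(v)
    e = [math.exp(x - v_max) for x in v]
    total = sum(e)
    return [x / total for x in e]

def log_likelihood(beta, observations):
    # Sum of log P(y_i), written with log-sum-exp to avoid log(0)
    ll = 0.0
    for alternatives, chosen in observations:
        v = utilities(beta, alternatives)
        v_max = max(v)
        ll += v[chosen] - v_max - math.log(sum(math.exp(x - v_max) for x in v))
    return ll

def gradient_and_hessian(beta, observations):
    g = [0.0, 0.0]
    h = [[0.0, 0.0], [0.0, 0.0]]
    for alternatives, chosen in observations:
        p = probabilities(beta, alternatives)
        mean = [sum(pk * alt[d] for pk, alt in zip(p, alternatives)) for d in range(2)]
        for d in range(2):
            g[d] += alternatives[chosen][d] - mean[d]
            for e in range(2):
                h[d][e] -= sum(pk * (alt[d] - mean[d]) * (alt[e] - mean[e])
                               for pk, alt in zip(p, alternatives))
    return g, h

def estimate(observations, iterations=25):
    beta = [0.0, 0.0]
    for _ in range(iterations):
        g, h = gradient_and_hessian(beta, observations)
        det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
        # Newton-Raphson direction -H^{-1} g, with the 2x2 inverse written out
        step = [-(h[1][1] * g[0] - h[0][1] * g[1]) / det,
                -(h[0][0] * g[1] - h[1][0] * g[0]) / det]
        # Halve the step until the log-likelihood no longer decreases
        scale, current = 1.0, log_likelihood(beta, observations)
        while scale > 1e-8 and log_likelihood(
                [b + scale * s for b, s in zip(beta, step)], observations) < current:
            scale /= 2.0
        beta = [b + scale * s for b, s in zip(beta, step)]
        if max(abs(scale * s) for s in step) < 1e-10:
            break
    return beta

# Synthetic observations from assumed "true" parameters (illustration only)
rng = random.Random(0)
TRUE_BETA = [-0.1, -0.5]
observations = []
for _ in range(2000):
    alternatives = [(rng.uniform(10, 60), rng.uniform(1, 10)) for _ in range(3)]
    chosen = rng.choices(range(3), weights=probabilities(TRUE_BETA, alternatives))[0]
    observations.append((alternatives, chosen))

beta_hat = estimate(observations)
```

With enough observations, `beta_hat` should lie close to the parameters that generated the data.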
Mode choice
- More elaborate example
Mode choice
- Utility is an abstract mathematical concept. On the contrary, most choice characteristics have a unit, for instance distance, travel time, money
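- Let's redefine the parameters, i.e. we transform all parameters into a new space
- With $\mathrm{VTTS} = \beta_{time} / \beta_{cost}$ we get the Value of Travel Time Savings
- Units: travel time term [1/min] * [min], cost term [1/EUR] * [EUR], VTTS [EUR/min], utility [1]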
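As a small worked example of the unit logic, the ratio of a time parameter [1/min] and a cost parameter [1/EUR] gives a value in EUR/min, which is usually reported per hour. The parameter values are hypothetical and not taken from the lecture.

```python
def value_of_travel_time_savings(beta_time_per_min, beta_cost_per_eur):
    """Return the VTTS in EUR/h from a time parameter [1/min] and a cost parameter [1/EUR]."""
    vtts_eur_per_min = beta_time_per_min / beta_cost_per_eur
    return vtts_eur_per_min * 60.0

# Hypothetical parameters: -0.04 per minute and -0.2 per EUR
vtts = value_of_travel_time_savings(-0.04, -0.2)
```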
Mode choice
- The Value of Travel Time Savings (VTTS) expresses how much extra money people would (on average) be willing to pay if the travel time on a specific connection were reduced by a certain amount.
- Other interpretation: How uncomfortable is it to spend a certain duration in one mode of transport vs. another one? The higher the VTTS, the more "costly" or uncomfortable it is to take a trip using this transport mode.
- VTTS are relatively stable over different surveys and countries after accounting for the respective currencies.
Mode choice
- Remember the survey from before on the introduction of automated taxis in the transport system of Zurich?
[Figure (AMoD survey), VTTS by mode: AMoD Taxi 13 CHF/h, Conventional Car 19 CHF/h, Public Transport 12 CHF/h; further values shown: 21 CHF/h and 32 CHF/h]
- Note on ML/DL models: VTTS only measurable because of linear structure of the utilities
Mode choice
- What is the value of decreasing the travel time Paris to Clermont by 30 minutes?
- Standardized VTTS, for instance (Meunier & Quinet, 2015)
- VTTS = 21.8 EUR/h
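The question above can be answered with simple arithmetic. The number of daily travellers below is a made-up assumption, added only to show how the per-trip value scales.

```python
VTTS_EUR_PER_HOUR = 21.8  # standardized value quoted above (Meunier & Quinet, 2015)

time_saving_hours = 30 / 60
value_per_trip = VTTS_EUR_PER_HOUR * time_saving_hours  # EUR per traveller and trip

hypothetical_daily_travellers = 5000  # assumption for illustration only
value_per_day = value_per_trip * hypothetical_daily_travellers
```

Under this VTTS, a 30-minute reduction is worth about 10.9 EUR per traveller and trip.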
Mode choice
- Finally, how do we use the Multinomial Logit model to make a decision for a new choice situation?
- Option 1: Direct sampling
- Calculate systematic utilities
- Calculate probabilities
- Sample one alternative
Mode choice
- Finally, how do we use the Multinomial Logit model to make a decision for a new choice situation?
- Option 2: Error sampling
- Calculate systematic utilities
- Sample error terms
- Select the maximum
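Both options can be sketched side by side. The utilities are hypothetical, and the error terms are drawn from a standard Gumbel distribution via inverse transform sampling. Over many draws, both options should yield the same choice shares.

```python
import math
import random

def choice_probabilities(utilities):
    v_max = max(utilities)
    e = [math.exp(v - v_max) for v in utilities]
    total = sum(e)
    return [x / total for x in e]

def direct_sampling(utilities, rng):
    # Option 1: compute probabilities, then sample one alternative
    p = choice_probabilities(utilities)
    return rng.choices(range(len(utilities)), weights=p)[0]

def sample_gumbel(rng):
    # Standard Gumbel draw: -log(-log(U)) with U uniform on (0, 1)
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    return -math.log(-math.log(u))

def error_sampling(utilities, rng):
    # Option 2: add one error term per alternative, then select the maximum
    noisy = [v + sample_gumbel(rng) for v in utilities]
    return max(range(len(noisy)), key=noisy.__getitem__)

# Hypothetical systematic utilities for three alternatives
utilities = [-1.0, -1.5, -3.0]
rng = random.Random(1)
n = 20000
share_direct = [0.0] * 3
share_error = [0.0] * 3
for _ in range(n):
    share_direct[direct_sampling(utilities, rng)] += 1.0 / n
    share_error[error_sampling(utilities, rng)] += 1.0 / n
```

Error sampling is convenient in simulations because the same approach still works when no closed-form probability exists.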
Mode choice
- There is a large field of discrete choice modelling with specific journals
- What happens if we replace the EV distribution by a Normal distribution?
- We get a Multinomial Probit model
- There is no analytical likelihood function anymore, but simulation stays straightforward
- Possibility to flexibly model the error term
- Correlations between choice alternatives
(if I like the red bus, I also like the blue bus)
- More complex formulations of the MNL exist
- Mainly to disentangle the above-mentioned correlation structure of errors
- Nested logit model (first I choose that I use the bus, then if I prefer red or blue)
- Cross-nested logit model (an automated taxi is like a bus, but also a bit like a car)
- Multiplicative error terms ...
- Parameters (β) are not static but follow a distribution themselves ...
1.4 Traffic assignment
Traffic assignment
Trip generation
Trip distribution
Mode choice
Traffic assignment
- Modelling task: Set up a model that determines which roads and transit lines will be used by the travellers and yields the travel times
[Figure labels: movements from zone r to zone s, travel times on road a, vehicle flow on road a]
Traffic assignment
Trip generation
Trip distribution
Mode choice
Traffic assignment
- Modelling task: Set up a model that determines which roads and transit lines will be used by the travellers and yields the travel times
- The model is based on a rational decision maker: Individually, I will follow the route that minimizes my personal travel time.
- This is the game-theoretic Wardrop principle
- Travel times in the network are described through volume-delay functions:
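One widely used volume-delay function is the BPR (Bureau of Public Roads) form. Whether the slide used exactly this form cannot be recovered from the text, so the sketch below is an assumption. The default parameters 0.15 and 4 are the values conventionally quoted for it.

```python
def bpr_travel_time(flow, free_flow_time, capacity, alpha=0.15, beta=4.0):
    """Travel time on a link as a function of its flow (BPR form, assumed here)."""
    return free_flow_time * (1.0 + alpha * (flow / capacity) ** beta)

# Hypothetical link: 10 min free-flow travel time, capacity of 2000 veh/h
empty = bpr_travel_time(0.0, 10.0, 2000.0)
at_capacity = bpr_travel_time(2000.0, 10.0, 2000.0)
overloaded = bpr_travel_time(3000.0, 10.0, 2000.0)
```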
Traffic assignment
- Example: Two routes and Wardrop equilibrium
- Travel demand from S to E:
- Travel time on Route A (normal road):
- Travel time on Route B (highway):
- Flow must be distributed over the routes:
- Wardrop: If the travel time on Route A is longer than on Route B, I would not use Route A, but rather Route B! At equilibrium, all used routes therefore have the same travel time.
[Figure: two parallel routes from S to E, Route A and Route B]
How many cars use routes A and B and what is the travel time?
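Since the numbers of the slide example are not available in the text, the following sketch uses hypothetical values: a demand of 1600 vehicles and linear travel time functions for both routes. It searches for the split at which both routes are equally fast.

```python
def time_a(flow):
    # Hypothetical normal road: 20 min free-flow, congests quickly
    return 20.0 + 0.02 * flow

def time_b(flow):
    # Hypothetical highway: 10 min free-flow, congests slowly
    return 10.0 + 0.01 * flow

def wardrop_two_routes(demand, time_a, time_b, tolerance=1e-9):
    """Return (flow_a, flow_b) such that no traveller can gain by switching routes."""
    def gap(flow_a):
        return time_a(flow_a) - time_b(demand - flow_a)
    if gap(0.0) >= 0.0:      # Route A is never quicker: everybody uses B
        return 0.0, demand
    if gap(demand) <= 0.0:   # Route B is never quicker: everybody uses A
        return demand, 0.0
    low, high = 0.0, demand
    while high - low > tolerance:  # bisection on the travel time difference
        mid = 0.5 * (low + high)
        if gap(mid) < 0.0:
            low = mid
        else:
            high = mid
    flow_a = 0.5 * (low + high)
    return flow_a, demand - flow_a

flow_a, flow_b = wardrop_two_routes(1600.0, time_a, time_b)
```

For these assumed functions, the equilibrium is 200 vehicles on Route A and 1400 on Route B, with 24 minutes on both.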
Traffic assignment
- General case yields a complex optimization problem
- All link flows must be non-negative
- There are k different routes to go from origin r to destination s and the route flow must be non-negative
- The flow on the route alternatives k between r and s must sum to the overall zonal flow between r and s
- Does route k between r and s pass through link a?
- The link flow of a is the sum of all route flows passing through it
- The "first" vehicle on link a has a low travel time, the "second" one a bit longer, and so on ...
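The annotations above match the classical Beckmann formulation of user equilibrium. The formula itself is not preserved in the text, so the following is a reconstruction with assumed symbols: $x_a$ is the link flow, $t_a$ the volume-delay function, $f_k^{rs}$ the flow on route $k$ between $r$ and $s$, $q_{rs}$ the zonal flow, and $\delta_{a,k}^{rs}$ equals 1 if route $k$ passes through link $a$ and 0 otherwise.

```latex
\begin{aligned}
\min_{x} \quad & \sum_a \int_0^{x_a} t_a(\omega)\,\mathrm{d}\omega \\
\text{s.t.} \quad & \sum_k f_k^{rs} = q_{rs} && \forall r, s \\
& f_k^{rs} \ge 0 && \forall k, r, s \\
& x_a = \sum_{r,s} \sum_k f_k^{rs}\,\delta_{a,k}^{rs} && \forall a \\
& x_a \ge 0 && \forall a
\end{aligned}
```

The integral in the objective reflects that each additional vehicle on a link experiences a slightly longer travel time than the one before.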
Traffic assignment
- General case yields a complex optimization problem
- The problem cannot be solved analytically - it needs to be simulated!
- Various approaches with different complexity exist:
- Method of Successive Averages (MSA)
- Frank-Wolfe Assignment
- Biconjugate Frank-Wolfe Assignment
Traffic assignment
Sketch for Method of Successive Averages (MSA)
1. Find the quickest path for all (r,s) pairs under free-flow conditions and note down the link flows $x_a^{(0)}$
2. Calculate the resulting travel times on all links
3. Based on these travel times, recalculate the quickest paths for all (r,s) pairs and note down the updated link flows as $y_a^{(n)}$
4. Update the link flows for the next iteration as $x_a^{(n+1)} = (1 - \lambda_n)\, x_a^{(n)} + \lambda_n\, y_a^{(n)}$
5. Continue at 2 until convergence
- The lambda parameter $\lambda_n$ should be decreasing over iterations (to avoid oscillations)
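The steps can be sketched for a toy network with two parallel routes. The travel time functions, the demand and the step size rule 1/(n+1) are assumptions chosen for illustration. A real implementation would run a shortest-path search per (r,s) pair instead of comparing two routes.

```python
def travel_times(flows):
    # Hypothetical linear volume-delay functions for Route A and Route B
    return [20.0 + 0.02 * flows[0], 10.0 + 0.01 * flows[1]]

def all_or_nothing(times, demand):
    # Assign the whole demand to the currently quickest route
    flows = [0.0, 0.0]
    flows[0 if times[0] < times[1] else 1] = demand
    return flows

def method_of_successive_averages(demand, iterations):
    # Step 1: quickest route under free-flow conditions
    flows = all_or_nothing(travel_times([0.0, 0.0]), demand)
    for n in range(1, iterations + 1):
        times = travel_times(flows)              # Step 2: resulting travel times
        target = all_or_nothing(times, demand)   # Step 3: updated quickest routes
        lam = 1.0 / (n + 1)                      # decreasing lambda
        # Step 4: average the old flows with the newly found flows
        flows = [x + lam * (y - x) for x, y in zip(flows, target)]
    return flows

flows = method_of_successive_averages(demand=1600.0, iterations=5000)
times = travel_times(flows)
```

For these assumed functions, the flows approach 200 vehicles on Route A and 1400 on Route B, where both routes take about 24 minutes.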
2.1 Disaggregated demand
Disaggregated demand
- The representation of flows between zones has various disadvantages
- Difficult spatial interpretation: Which point is representative for a zone?
- Difficult temporal interpretation: What does "at peak hour" mean?
- We can only use a few user groups
Disaggregated demand
- One option for more detail is to use individual trip-based models
- We represent individual trips that are performed by travellers during the day
- Each trip has origin coordinates and destination coordinates
- Each trip has a departure time
- Each trip may even have individual traveller attributes
Disaggregated demand
- How to get from a zone-based four-step model to a trip-based model?
- Option 1: Use the flow matrix as the nominal number of flows
- Note down one trip for each flow unit indicated in the flow matrix
- Option 2: Use the flow matrix as a probability matrix
- Sample individual movements with an origin zone and a destination zone
- Sample a coordinate within the origin zone and the destination zone
- Sample a departure time for each trip
Disaggregated demand
- Example: Transforming the flow matrix into a trip table
(1) Flow matrix
(2) Long format
(3) Probability
(4) Sampling
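The four steps can be sketched with plain Python data structures. The zone names and flow values are hypothetical, since the tables of the slides are not preserved in the text.

```python
import random

# (1) Hypothetical flow matrix: flow_matrix[origin][destination]
flow_matrix = {"A": {"A": 10, "B": 30}, "B": {"A": 50, "B": 10}}

# (2) Long format: one row per origin-destination pair
long_format = [(o, d, f) for o, row in flow_matrix.items() for d, f in row.items()]

# (3) Probability: normalize the flows so that they sum to one
total_flow = sum(f for _, _, f in long_format)
pair_probabilities = [f / total_flow for _, _, f in long_format]

# (4) Sampling: draw origin-destination pairs according to the probabilities
def sample_trips(n, rng):
    pairs = rng.choices([(o, d) for o, d, _ in long_format],
                        weights=pair_probabilities, k=n)
    return [{"trip_id": i, "origin_zone": o, "destination_zone": d}
            for i, (o, d) in enumerate(pairs)]

trips = sample_trips(1000, random.Random(42))
```

The number of sampled trips can be chosen freely, which is what makes downsampling possible.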
Disaggregated demand
- Sampling from a polygon
- Let $(x_0, y_0)$ be the bottom-left corner of the bounding box of the polygon
- Let $w$ be the width and $h$ the height of the bounding box
1. Draw two values $u_1, u_2$ from a uniform distribution on [0,1]
2. Calculate coordinates as $x = x_0 + u_1 w$ and $y = y_0 + u_2 h$
3. Check if $(x, y)$ is within the polygon (using a library like shapely)
4. Accept the coordinate if within the polygon, or proceed with (1)
* geopandas has a new method called sample_points
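A dependency-free sketch of this rejection sampling. Instead of shapely, a hand-written ray-casting test is used for the point-in-polygon check so that the example runs with the standard library only. The L-shaped polygon is hypothetical.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def sample_point(polygon, rng):
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    x0, y0 = min(xs), min(ys)                 # bottom-left corner of the bounding box
    w, h = max(xs) - x0, max(ys) - y0         # width and height of the bounding box
    while True:
        u1, u2 = rng.random(), rng.random()   # two uniform draws
        x, y = x0 + u1 * w, y0 + u2 * h       # candidate coordinate
        if point_in_polygon(x, y, polygon):   # accept, otherwise draw again
            return x, y

# Hypothetical L-shaped zone
zone = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 3), (0, 3)]
rng = random.Random(7)
points = [sample_point(zone, rng) for _ in range(500)]
```

The more of its bounding box a polygon fills, the fewer candidates are rejected.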
Disaggregated demand
- Example: Sampling trips in Paris
- The sampling approach allows for downsampling of the simulation (only simulating a certain share of movements)
[Figure panels: N = 100, N = 1,000, N = 10,000, N = 100,000]
Disaggregated demand
- Sampling of the departure times
- Option 1: We make a hypothesis (here for the morning peak)
Disaggregated demand
- Sampling of the departure times
- Option 2: We compare with data, or sample directly from data
EGT: Household travel survey for Île-de-France (not open)
ENTD: National household travel survey
Mixture of three Gaussians
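Sampling departure times from a mixture of three Gaussians can be sketched as follows. The weights, means and standard deviations are made-up placeholders for a morning, midday and evening peak. In practice they would be fitted to the survey data.

```python
import random

# Hypothetical components: (weight, mean hour, standard deviation in hours)
COMPONENTS = [(0.4, 8.0, 1.0), (0.2, 12.5, 1.5), (0.4, 18.0, 1.5)]

def sample_departure_time(rng):
    """Return a departure time in seconds after midnight."""
    weights = [w for w, _, _ in COMPONENTS]
    _, mean, std = rng.choices(COMPONENTS, weights=weights)[0]
    while True:
        hour = rng.gauss(mean, std)
        if 0.0 <= hour < 24.0:  # redraw values that fall outside the day
            return hour * 3600.0

rng = random.Random(3)
departure_times = [sample_departure_time(rng) for _ in range(5000)]
```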
Disaggregated demand
- Further steps
- Sampling an age for each trip?
- Sampling a socioprofessional category for each trip?
- Sampling which modes of transport are available for each trip?
- Can be conditioned on the distributions in the origin or destination zones
- Remark: This is a common approach when working with Call Detail Records (CDR).
- Operators track the runtime between your phone and surrounding cell towers
- Having at least three towers allows for approximate triangulation of your position
- Since each phone has an ID, a trajectory can be constructed per user
- BUT: The ID must be anonymized and GDPR forbids attaching user information
- However, the traces can be enriched by statistical data
Disaggregated demand
- Given that we have origin and destination coordinates, we can now perform a detailed routing of the trips on the road network, for instance by finding the shortest (in terms of distance) path.
N = 100
N = 1,000
N = 10,000
Disaggregated demand
- Given that we have origin and destination coordinates, we can now perform a detailed routing of the trips on the road network, for instance by finding the shortest (in terms of distance) path.
- Common routing algorithms for the road network:
- Dijkstra (from the 50s): Classic, easy to understand
- A* (from the 60s): Speed-up through the use of heuristics
- ALT (A*, Landmarks, and Triangle inequality, 2003): Intelligently constraining the search space of A* for heavy speed-up
- Contraction Hierarchies (2008): Heavy preprocessing of the network, but very fast lookup of routes
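As an illustration of the classic approach, here is a minimal Dijkstra sketch on a small hypothetical network in which edge weights are link lengths. Production routers rely on the faster variants listed above.

```python
import heapq

def dijkstra(graph, source, target):
    """graph: dict node -> list of (neighbor, length). Returns (distance, path)."""
    distances = {source: 0.0}
    previous = {}
    queue = [(0.0, source)]
    visited = set()
    while queue:
        distance, node = heapq.heappop(queue)
        if node in visited:
            continue
        visited.add(node)
        if node == target:
            break
        for neighbor, length in graph.get(node, []):
            candidate = distance + length
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if target not in distances:
        return float("inf"), []
    path = [target]
    while path[-1] != source:  # walk back from the target to the source
        path.append(previous[path[-1]])
    return distances[target], path[::-1]

# Hypothetical directed network with link lengths
network = {
    "A": [("B", 4.0), ("C", 2.0)],
    "B": [("D", 5.0)],
    "C": [("B", 1.0), ("D", 8.0)],
    "D": [],
}
distance, path = dijkstra(network, "A", "D")
```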
Disaggregated demand
- Given that we have origin and destination coordinates, we can now perform a detailed routing of the trips on the road network, for instance by finding the shortest (in terms of distance) path.
- Common routing algorithms for the transit network:
- Time-extended Dijkstra: Long-time standard approach, very time and memory consuming
- RAPTOR: Standard today, labelling algorithm
- Connection scan (2015): Intelligent arrangement of transit schedules in memory
Disaggregated demand
- From the 2000s, change of paradigm: Transport demand is generated by the need of people to perform activities
- This change of perspective led to activity-based models
- Individual persons are modelled (with individual attributes like age, CSP (socioprofessional category), ...)
- Each person has a chain of activities during the day (home - work - shopping - home)
- Each activity has specific coordinates and a time at which it should happen
- Activities are connected by trips that bring people from A to B (with a specific transport mode)
[Figure: one person's activity chain with the time windows 0:00 - 9:00, 10:00 - 17:30, 17:45 - 21:00, 22:00 - 0:00; x 12,000,000 persons]
Disaggregated demand
- The approach results in data sets that are useful for many different domains
- Synthetic population: Households, persons and their sociodemographic attributes
- Synthetic demand: Activity chains for each person
- Models can be set up for
- Energy consumption of the population
- Evacuation in case of natural disasters
- Spread of diseases and epidemics (heavily used during Covid-19!)
- Of course, transport planning and simulation
2.2 Synthetic travel demand for France
Synthetic travel demand for France
- Population census (RP) > Truncate-Replicate-Sample (TRS)
- Income data (FiLoSoFi) > Imputation by quantile
- Commuting data (RP-MOB) > Direct sampling from OD matrix
- Household travel survey (EDGT) > Assignment of activity chains through statistical matching
  (example chains: 0:00 - 8:00, 08:30 - 17:00, 17:30 - 0:00 and 0:00 - 9:00, 10:00 - 17:30, 17:45 - 21:00, 22:00 - 0:00)
- Enterprise census (SIRENE) and Address database (BD-TOPO) > Specifically designed approach to find secondary locations
Hörl, S., Axhausen, K.W., 2021. Relaxation–discretization algorithm for spatially constrained secondary location assignment. Transportmetrica A: Transport Science 1–20. https://doi.org/10.1080/23249935.2021.1982068
Synthetic travel demand for France
| Person ID | Age | Gender | Home (X,Y) |
|---|---|---|---|
| 1 | 43 | male | (65345, ...) |
| 2 | 24 | female | (65345, ...) |
| 3 | 9 | female | (65345, ...) |
Synthetic travel demand for France
| Person ID | Activity | Start | End | Loc. |
|---|---|---|---|---|
| 523 | home | | 08:00 | (x,y) |
| 523 | work | 08:55 | 18:12 | (x,y) |
| 523 | shop | 19:10 | 19:25 | (x,y) |
| 523 | home | 19:40 | | (x,y) |
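An activity chain like the one of person 523 can be held in a simple data structure, from which the connecting trips follow directly. The class, the field names and the coordinates are hypothetical, since the table only shows "(x,y)".

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Activity:
    person_id: int
    purpose: str
    start: Optional[str]   # None for the first activity of the day
    end: Optional[str]     # None for the last activity of the day
    location: Tuple[float, float]

# Placeholder coordinates for illustration
chain = [
    Activity(523, "home", None, "08:00", (0.0, 0.0)),
    Activity(523, "work", "08:55", "18:12", (5.0, 2.0)),
    Activity(523, "shop", "19:10", "19:25", (3.0, 1.0)),
    Activity(523, "home", "19:40", None, (0.0, 0.0)),
]

def derive_trips(activities: List[Activity]):
    """Each pair of consecutive activities is connected by one trip."""
    trips = []
    for origin, destination in zip(activities[:-1], activities[1:]):
        trips.append({
            "person_id": origin.person_id,
            "from": origin.purpose,
            "to": destination.purpose,
            "departure": origin.end,
            "arrival": destination.start,
        })
    return trips

trips = derive_trips(chain)
```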
Synthetic travel demand for France
- Data sources: Population census (RP), Income data (FiLoSoFi), Commuting data (RP-MOB), National HTS (ENTD), EDGT, Enterprise census (SIRENE), OpenStreetMap, GTFS (SYTRAL / SNCF), Address database (BD-TOPO)
- Open Data + Open Software = Reproducible research
- Integrated testing
Synthetic travel demand for France
- Comparison of population attributes
Synthetic travel demand for France
- Activity chains
Synthetic travel demand for France
Nantes
- Noise modeling
Synthetic travel demand for France
Lille
- Park & ride applications
- Road pricing
Synthetic travel demand for France
Toulouse
- Placement and use of shared offices
Synthetic travel demand for France
Rennes
- Micromobility simulation
Synthetic travel demand for France
Paris / Île-de-France
- Scenario development for sustainable urban transformation
- New mobility services
Mahdi Zargayouna (GRETTIA / Univ. Gustave Eiffel)
Nicolas Coulombel (LVMT / ENPC)
Synthetic travel demand for France
Paris / Île-de-France
- Cycling simulation
Synthetic travel demand for France
Paris / Île-de-France
- Simulation of dynamic mobility services
- Fleet control through reinforcement learning
Synthetic travel demand for France
Lyon (IRT SystemX)
- Low-emission first/last mile logistics
Synthetic travel demand for France
Balac, M., Hörl, S. (2021) Synthetic population for the state of California based on open-data: examples of San Francisco Bay area and San Diego County, presented at 100th Annual Meeting of the Transportation Research Board, Washington, D.C.
Sallard, A., Balac, M., Hörl, S. (2021) Synthetic travel demand for the Greater São Paulo Metropolitan Region, based on open data, Under Review
Sao Paulo, San Francisco Bay area, Los Angeles five-county area, Switzerland, Montreal, Quebec City, Jakarta, Casablanca, ...
Synthetic travel demand for France
- Latest addition: Cairo
- Germany work-in-progress
Synthetic travel demand for France
- Reproducibility
- Low in transport modelling / simulation, especially with agent-based models
- Can increase acceptance, uptake and more widespread use of these models
- Increasingly available open data sources make reproducibility possible, but processes are either not standardized or not easily accessible as open source
- Our goal: Have a pipeline from raw data to a calibrated large-scale agent-based transport simulation that is nearly 100% replicable with reproducible results.
Synthetic travel demand for France
- Continuously updated
- Example: Integrating buildings
- Open data: BAN (Base Adresse Nationale)
- Open data: BD-TOPO building census
- Benchmarking methodology
- Bayesian networks, HMMs, deep learning, ...
Synthetic travel demand for France
- A 5% sample of the data is available on Mendeley
3.1 Agent-based transport simulation
Agent-based transport simulation
- Previously, we have seen how one can generate disaggregated synthetic travel demand
- Such demand data sets are used in agent-based simulation
Agent-based transport simulation
- One option: microscopic simulation
- Each agent is simulated in detail
- Cars have a location
- Velocity and acceleration
- They move according to what they perceive around them
- A commonly used model is the Intelligent Driver Model
$\dot{v} = a \left[ 1 - \left( \frac{v}{v_0} \right)^\delta - \left( \frac{s^*}{s} \right)^2 \right]$
$v$ is your current speed, and $v_0$ is the speed limit
$s$ is your current distance to the leading car, and $s^*$ is the desired distance
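A sketch of the acceleration rule of the Intelligent Driver Model in its commonly published form. The parameter values are typical textbook-style choices and not taken from the lecture. The desired distance uses the usual expression with a minimum gap, a time headway and a braking term.

```python
import math

def idm_acceleration(v, v_lead, gap, v0=33.3, headway=1.5,
                     a_max=1.0, b_comfort=2.0, s0=2.0, delta=4.0):
    """Acceleration [m/s^2] for own speed v, leader speed v_lead and gap [m].

    v0: desired speed (speed limit), headway: desired time gap [s],
    a_max: maximum acceleration, b_comfort: comfortable deceleration,
    s0: minimum gap at standstill. All defaults are assumed values.
    """
    approach_rate = v - v_lead
    # Desired distance: minimum gap + headway distance + braking term
    s_star = s0 + max(0.0, v * headway
                      + v * approach_rate / (2.0 * math.sqrt(a_max * b_comfort)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)

free_road_start = idm_acceleration(v=0.0, v_lead=0.0, gap=1e9)
free_road_cruise = idm_acceleration(v=33.3, v_lead=33.3, gap=1e9)
tailgating = idm_acceleration(v=20.0, v_lead=20.0, gap=5.0)
```

On a free road the car accelerates until it reaches the desired speed. With a small gap to the leader the braking term dominates and the acceleration becomes negative.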
Agent-based transport simulation
- This video shows the driver behavior that is modeled by the IDM
Agent-based transport simulation
- Microscopic models
- Represent traffic in high detail
- Are highly realistic
- Can be calibrated directly on driver behavior (drone videos, ...)
- But
- Are computationally heavy!
- Need to be simulated on a sub-second basis
- With many cars (12M agents in Île-de-France), this becomes infeasible!
Agent-based transport simulation
- Other option: Mesoscopic models
- Agents interact with a network that consists of vertices and edges (nodes and links)
- Agents move through one link and then wait at the end to be transferred to the next one
- If there is no space on the next link, agents may be blocked temporarily, thus creating a traffic jam
3.2 MATSim
MATSim
- Flexible, extensible and well-tested open-source transport simulation framework
- Used by many research groups and companies all over the world
- Extensions for parking behaviour, signal control, location choice, freight, ...
http://www.matsim.org
matsim-org/matsim-libs
MATSim
- Two major components
- Mobility simulation: mesoscopic simulation of all movements in the transport system
- Decision-making: agents tracking how well their routing and mode decisions worked, then making new decisions
[Figure: MATSim loop with the components Scenario, Mobility simulation, Decision-making, Analysis]
MATSim
- Link-based mobility simulation
- Each link consists of a queue and a buffer
- Vehicles enter the queue and wait until they are allowed to pass through the link (flow capacity) into the buffer
- Vehicles wait inside the buffer until there is space (storage capacity) on the next link of their route
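The queue-and-buffer logic can be sketched with a toy model in discrete time steps. This is a simplified illustration under assumed rules and not MATSim's actual implementation. Class names, capacities and the step handling are hypothetical.

```python
from collections import deque

class Link:
    def __init__(self, flow_capacity, storage_capacity, free_flow_steps):
        self.flow_capacity = flow_capacity        # vehicles per step into the buffer
        self.storage_capacity = storage_capacity  # vehicles that fit on the link
        self.free_flow_steps = free_flow_steps    # minimum time on the link
        self.queue = deque()                      # (vehicle, earliest exit step)
        self.buffer = deque()

    def has_space(self):
        return len(self.queue) + len(self.buffer) < self.storage_capacity

    def enter(self, vehicle, step):
        self.queue.append((vehicle, step + self.free_flow_steps))

    def move_queue_to_buffer(self, step):
        moved = 0
        while (self.queue and moved < self.flow_capacity
               and self.queue[0][1] <= step):
            self.buffer.append(self.queue.popleft()[0])
            moved += 1

def simulate(links, vehicles, max_steps=100):
    for vehicle in vehicles:
        links[0].enter(vehicle, 0)
    arrivals = {}
    for step in range(1, max_steps + 1):
        for link in links:
            link.move_queue_to_buffer(step)
        for i in reversed(range(len(links))):  # downstream links first
            link = links[i]
            while link.buffer:
                if i == len(links) - 1:
                    arrivals[link.buffer.popleft()] = step  # leaves the network
                elif links[i + 1].has_space():
                    links[i + 1].enter(link.buffer.popleft(), step)
                else:
                    break  # next link is full: vehicles are blocked
        if len(arrivals) == len(vehicles):
            break
    return arrivals

# Six vehicles, a wide first link and a narrow second link (bottleneck)
links = [Link(2, 10, 1), Link(1, 2, 1)]
arrivals = simulate(links, vehicles=list(range(6)))
```

The second link lets only one vehicle pass per step and holds at most two, so vehicles back up on the first link and arrive one after the other.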
MATSim
- Decision-making is based on two concepts
- Scoring: Agents track how their daily plan works (delays? congestion?)
- Agents gain positive score when performing activities at the time they are planned, and lose score when they spend time in transport
- Similar to the utility concept of mode choice models
- Replanning: Agents keep a memory of past observed plans and select among them or introduce small changes (change transport mode, depart earlier, ...)
MATSim
- A basic MATSim simulation needs two major inputs: plans and a network
- A plans file in XML format that describes the daily schedule of each person
- A plans file can be generated by directly writing out a trip table as XML
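Writing a trip table as a plans file can be sketched with the standard library. The element and attribute names (`population`, `person`, `plan`, `activity`, `leg`, `end_time`) and the DOCTYPE line follow what I understand to be the population_v6 format; this is an assumption and should be checked against the DTD of the MATSim version in use. The trip table, the activity types and the coordinates are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical trip table: one row per trip
trip_table = [
    {"person_id": "1", "origin": (651000.0, 6862000.0),
     "destination": (653500.0, 6860500.0), "departure_time": "08:00:00", "mode": "car"},
    {"person_id": "2", "origin": (649000.0, 6861000.0),
     "destination": (652000.0, 6863000.0), "departure_time": "08:30:00", "mode": "car"},
]

def build_population(trips):
    population = ET.Element("population")
    for trip in trips:
        person = ET.SubElement(population, "person", {"id": trip["person_id"]})
        plan = ET.SubElement(person, "plan", {"selected": "yes"})
        # Origin activity ends at the departure time of the trip
        ET.SubElement(plan, "activity", {
            "type": "origin", "x": str(trip["origin"][0]), "y": str(trip["origin"][1]),
            "end_time": trip["departure_time"]})
        ET.SubElement(plan, "leg", {"mode": trip["mode"]})
        ET.SubElement(plan, "activity", {
            "type": "destination", "x": str(trip["destination"][0]),
            "y": str(trip["destination"][1])})
    return population

def to_plans_xml(population):
    header = ('<?xml version="1.0" encoding="utf-8"?>\n'
              # Assumed DOCTYPE, verify against the MATSim documentation
              '<!DOCTYPE population SYSTEM '
              '"http://www.matsim.org/files/dtd/population_v6.dtd">\n')
    return header + ET.tostring(population, encoding="unicode")

population = build_population(trip_table)
plans_xml = to_plans_xml(population)
```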
MATSim
- A basic MATSim simulation needs two major inputs: plans and a network
- A network file describes the road network as edges and nodes
MATSim
- MATSim can either be configured using an XML-based configuration file or directly inside the code. A common run script looks as follows:
MATSim
- In a full MATSim simulation, agents would choose between different modes, routes, etc.
- But one can also simply run one iteration to track the movements of the vehicles
MATSim
- The main output of MATSim is the events file that contains everything that happens during the simulation
- By aggregating this file, we can obtain any relevant information
- But there are pre-aggregated data sets, such as the trip table
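Aggregating the events can be sketched as counting how many vehicles enter each link per hour. The embedded sample and its attribute names (`time`, `type`, `link`, `vehicle`, and the event type `entered link`) reflect my understanding of the events format and should be treated as an assumption. Real events files are large and compressed, so a streaming parser such as `xml.etree.ElementTree.iterparse` would be preferable there.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical excerpt of an events file
SAMPLE_EVENTS = """<events version="1.0">
  <event time="28800.0" type="departure" person="1" link="l1" legMode="car"/>
  <event time="28801.0" type="entered link" vehicle="1" link="l2"/>
  <event time="28900.0" type="entered link" vehicle="1" link="l3"/>
  <event time="32500.0" type="entered link" vehicle="2" link="l2"/>
  <event time="32600.0" type="arrival" person="2" link="l2" legMode="car"/>
</events>"""

def count_link_entries(events_xml, bin_size=3600):
    """Return a Counter mapping (link, time bin) to the number of entering vehicles."""
    counts = Counter()
    for event in ET.fromstring(events_xml).iter("event"):
        if event.get("type") == "entered link":
            time_bin = int(float(event.get("time")) // bin_size)
            counts[(event.get("link"), time_bin)] += 1
    return counts

link_counts = count_link_entries(SAMPLE_EVENTS)
```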
3.3 On-demand service simulation
On-demand service simulation
- Coming soon
- Initially from the field of operations research
- Traveling salesman problem
- Vehicle routing problem
- Dial-a-ride problem
- Hard to solve! Limited ability for large instances!
- Heuristic approaches needed (1) in reality and (2) especially in simulation: Want to apply approaches that are realistic enough to give some insight, but quick enough to perform simulations (cf. human taxi drivers)
- Simplest approach: Best-response assignment
- Vehicle perspective, passenger perspective, combined
- Adaptive switching
- Bipartite matching approach, many others exist (see recent advances in reinforcement learning)
- Relocation: Strongly linked problem, where to send empty vehicles?
- Show comparison results from Zurich study
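One simple reading of an assignment heuristic from the passenger perspective is to send the closest idle vehicle to each incoming request. The following sketch assumes straight-line distances and hypothetical data structures. It is not the dispatcher of any specific study.

```python
import math

def assign_closest_idle_vehicle(request, vehicles):
    """Assign the closest idle vehicle to the request; return its id or None."""
    idle = [v for v in vehicles if v["idle"]]
    if not idle:
        return None  # the request has to wait
    best = min(idle, key=lambda v: math.dist(v["position"], request["origin"]))
    best["idle"] = False
    return best["id"]

# Hypothetical fleet and requests (coordinates in arbitrary units)
fleet = [
    {"id": "v1", "position": (0.0, 0.0), "idle": True},
    {"id": "v2", "position": (5.0, 5.0), "idle": True},
]
first = assign_closest_idle_vehicle({"origin": (4.0, 4.0)}, fleet)
second = assign_closest_idle_vehicle({"origin": (4.0, 4.0)}, fleet)
third = assign_closest_idle_vehicle({"origin": (4.0, 4.0)}, fleet)
```

Such a greedy rule is fast, but it ignores that a slightly farther vehicle might serve the overall system better, which is what matching-based approaches address.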
Questions?
sebastian.horl@irt-systemx.fr
Large-scale models and simulation methods
By Sebastian Hörl