Francois Lanusse on behalf of the Transatlantic Dream Team:
Noe Dia (Ciela), Sacha Guerrini (CosmoStat), Wassim Kabalan (CosmoStat/APC), Francois Lanusse (CosmoStat),
Julia Linhart (NYU), Laurence Perreault-Levasseur (Ciela), Benjamin Remy (SkAI), Sammy Sharief (Ciela),
Andreas Tersenov (CosmoStat), Justine Zeghal (Ciela)
Identifying the real challenge: very limited training data
[Figure: example maps for 3 different cosmologies, at the same systematics index]
1st Principle: Choose a small convolutional architecture; don't spend a ton of time optimizing it.
2nd Principle: Don't do anything clever with the loss function; directly optimize the challenge metric.
3rd Principle: Treat the training set as finetuning/calibration data; pretrain the network on inexpensive emulated maps (see the sketch below).
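As an illustration of the 3rd principle, here is a minimal sketch of the two-stage training loop: pretrain on cheap emulated maps, then finetune on the small challenge set. The model, loss, data and hyperparameters below are placeholders, not the actual challenge setup.

```python
# Two-stage training sketch (placeholder model, loss and data; not the challenge code):
# Stage 1 pretrains on cheap emulated maps, Stage 2 finetunes on the small challenge set.
import torch
import torch.nn as nn

def run_epochs(model, loader, optimizer, loss_fn, epochs):
    model.train()
    for _ in range(epochs):
        for maps, params in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(maps), params)  # stand-in for the challenge metric
            loss.backward()
            optimizer.step()

# Toy stand-ins: a tiny CNN regressor and random "maps" / target parameters.
model = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2))
loss_fn = nn.MSELoss()

emulated = torch.utils.data.TensorDataset(torch.randn(256, 1, 64, 64), torch.randn(256, 2))
challenge = torch.utils.data.TensorDataset(torch.randn(64, 1, 64, 64), torch.randn(64, 2))
emulated_loader = torch.utils.data.DataLoader(emulated, batch_size=32, shuffle=True)
challenge_loader = torch.utils.data.DataLoader(challenge, batch_size=32, shuffle=True)

# Stage 1: pretraining on emulated (LogNormal / OT-corrected) maps.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
run_epochs(model, emulated_loader, opt, loss_fn, epochs=5)
# Stage 2: finetuning on the challenge maps with a lower learning rate (plus early stopping).
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
run_epochs(model, challenge_loader, opt, loss_fn, epochs=2)
```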
EfficientNetV2 (Tan & Le 2021)
[Architecture diagram: early layers / late layers]
Tan & Le, EfficientNetV2: Smaller Models and Faster Training, ICML 2021
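A minimal sketch (assumed wiring, not the exact challenge code) of an EfficientNetV2-S backbone from torchvision with an MLP head on top of its embedding; the input channel count and output dimension are placeholders.

```python
# Backbone + head sketch (assumed wiring): torchvision EfficientNetV2-S producing a
# 1280-d embedding, followed by a small MLP head regressing the target parameters.
import torch
import torch.nn as nn
from torchvision.models import efficientnet_v2_s

class MapRegressor(nn.Module):
    def __init__(self, in_channels=1, n_outputs=2):
        super().__init__()
        backbone = efficientnet_v2_s(weights=None)
        # Adapt the stem convolution to the number of map channels (placeholder: 1).
        stem = backbone.features[0][0]
        backbone.features[0][0] = nn.Conv2d(
            in_channels, stem.out_channels, kernel_size=stem.kernel_size,
            stride=stem.stride, padding=stem.padding, bias=False)
        backbone.classifier = nn.Identity()  # keep the embedding, drop the classifier
        self.backbone = backbone
        self.head = nn.Sequential(           # MLP head on top of the embedding
            nn.Linear(1280, 256), nn.GELU(), nn.Linear(256, n_outputs))

    def forward(self, x):
        return self.head(self.backbone(x))

model = MapRegressor()
print(model(torch.randn(4, 1, 128, 128)).shape)  # torch.Size([4, 2])
```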
With only 20K samples, we overfit quickly. We therefore augmented the dataset with additional simulations.
Maximizing the challenge score:
Reshaping: remove most of the masked pixels
Data augmentation: rotations, flips, rolls
[Pipeline diagram: Reshaping → Augmentation → EfficientNetV2 Backbone → Embedding → MLP head]
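A minimal sketch of the augmentations listed above (rotations, flips, rolls) acting on a single map tensor; the probabilities and details are assumptions, not the exact pipeline.

```python
# Augmentation sketch for square map tensors of shape (C, H, W): random 90-degree
# rotations, horizontal/vertical flips, and periodic rolls (assumed details).
import torch

def augment(x: torch.Tensor) -> torch.Tensor:
    k = int(torch.randint(0, 4, (1,)))          # random multiple-of-90-degree rotation
    x = torch.rot90(x, k, dims=(-2, -1))
    if torch.rand(1) < 0.5:                     # random horizontal flip
        x = torch.flip(x, dims=(-1,))
    if torch.rand(1) < 0.5:                     # random vertical flip
        x = torch.flip(x, dims=(-2,))
    shifts = tuple(int(s) for s in torch.randint(0, x.shape[-1], (2,)))
    return torch.roll(x, shifts=shifts, dims=(-2, -1))  # periodic "rolling"

print(augment(torch.randn(1, 128, 128)).shape)  # torch.Size([1, 128, 128])
```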
Objective: Create cheap simulations close to the challenge's dataset.
Why model the matter field as LogNormal?
Kappa map obtained from the matter field using the GLASS package (Tessore et al., 2023) → kappa map on the sphere.
[Figure: LogNormal convergence (patch)]
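To make the LogNormal idea concrete, here is a toy 2D illustration in plain NumPy (not the GLASS API): a Gaussian random field is exponentiated so that the density contrast stays above -1, mimicking the skewed one-point distribution of the matter field. The power spectrum and normalization are arbitrary placeholders.

```python
# Toy LogNormal field (plain NumPy, not GLASS): Gaussian field with an arbitrary
# power spectrum, then a pointwise exponential so the contrast stays above -1.
import numpy as np

rng = np.random.default_rng(0)
n = 128

# Gaussian random field with a placeholder power-law power spectrum P(k) ~ k^-2.
kx = np.fft.fftfreq(n)[:, None]
ky = np.fft.fftfreq(n)[None, :]
k = np.sqrt(kx**2 + ky**2)
pk = np.zeros_like(k)
pk[k > 0] = k[k > 0] ** -2.0
g = np.fft.ifft2(np.fft.fft2(rng.standard_normal((n, n))) * np.sqrt(pk)).real
g *= 0.5 / g.std()  # arbitrary amplitude for the toy example

# Zero-mean LogNormal transform: delta = exp(g - var(g)/2) - 1 >= -1.
delta_ln = np.exp(g - 0.5 * g.var()) - 1.0
print(delta_ln.min(), delta_ln.mean())  # bounded below by -1, mean close to 0
```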
How good are the LogNormal simulations? Comparison at same cosmology.
[Figure: challenge map vs. LogNormal map; power spectrum comparison, higher-order statistics, data distribution comparison]
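For reference, a hedged sketch of the kind of summary used in this comparison: a radially binned power spectrum of a flat 2D patch (unit pixel size and simple binning assumed), which could be evaluated on a challenge map and a LogNormal map and overplotted.

```python
# Radially binned power spectrum of a flat 2D patch (unit pixel size, simple binning);
# evaluate it on a challenge map and on a LogNormal map to overplot the two curves.
import numpy as np

def power_spectrum(map2d, n_bins=20):
    n = map2d.shape[0]
    p2d = np.abs(np.fft.fftshift(np.fft.fft2(map2d))) ** 2 / n**2
    ky, kx = np.indices((n, n)) - n // 2
    k = np.hypot(kx, ky).ravel()
    edges = np.linspace(1, k.max(), n_bins + 1)
    which = np.digitize(k, edges)
    pk = np.array([p2d.ravel()[which == i].mean() for i in range(1, n_bins + 1)])
    return 0.5 * (edges[1:] + edges[:-1]), pk

rng = np.random.default_rng(1)
k_centers, pk = power_spectrum(rng.standard_normal((128, 128)))  # placeholder map
print(k_centers.shape, pk.shape)
```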
LogNormal simulations are still very different.
Objective: learning a pixel-level correction.
Problem: we do not have the initial conditions of the challenge simulations.
[Diagram: LogNormal maps vs. challenge-like maps, with candidate pairings marked ✅ / ❌]
Solution: Conditional Optimal Transport Flow-Matching (Kerrigan et al., 2024).
The optimal transport plan $\pi$ provides the pairs $(x_0, x_1)$ such that it minimizes the transport cost $\mathbb{E}_{(x_0, x_1) \sim \pi}\big[\, \lVert x_1 - x_0 \rVert^2 \,\big]$.
Loss function: $\mathcal{L}(\theta) = \mathbb{E}_{t,\, (x_0, x_1) \sim \pi}\big[\, \lVert v_\theta(x_t, t) - (x_1 - x_0) \rVert^2 \,\big]$, with $x_t = (1 - t)\, x_0 + t\, x_1$ and $t \sim \mathcal{U}(0, 1)$.
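A hedged sketch of minibatch optimal transport flow matching (using the POT library for the coupling): pairs are drawn from the OT plan between a batch of LogNormal maps and a batch of challenge-like maps, and a network is regressed onto the straight-line velocity between them. The toy data, tiny CNN, time conditioning and hyperparameters are placeholders, and the conditional construction of Kerrigan et al. (2024) differs in detail.

```python
# Minibatch OT flow matching sketch (placeholder data and network, not the team's code).
import numpy as np
import torch
import torch.nn as nn
import ot  # POT: Python Optimal Transport

def ot_pairs(x0, x1):
    """Re-pair two minibatches according to the optimal transport plan (squared L2 cost)."""
    n = x0.shape[0]
    a = np.full(n, 1.0 / n)
    cost = torch.cdist(x0.flatten(1), x1.flatten(1)).pow(2).double().numpy()
    plan = torch.from_numpy(ot.emd(a, a, cost))               # n x n coupling
    idx = torch.multinomial(plan, num_samples=1).squeeze(1)   # j ~ plan[i, :]
    return x0, x1[idx]

# Toy stand-ins: 8x8 "maps" and a small CNN predicting the velocity field v_theta(x_t, t).
velocity_net = nn.Sequential(nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
                             nn.Conv2d(16, 1, 3, padding=1))
opt = torch.optim.Adam(velocity_net.parameters(), lr=1e-3)

for step in range(100):
    x0 = torch.randn(32, 1, 8, 8)            # LogNormal maps (placeholder data)
    x1 = torch.randn(32, 1, 8, 8) + 1.0      # challenge-like maps (placeholder data)
    x0, x1 = ot_pairs(x0, x1)
    t = torch.rand(32, 1, 1, 1)
    xt = (1 - t) * x0 + t * x1               # straight-line interpolation
    target_v = x1 - x0                       # flow-matching regression target
    inp = torch.cat([xt, t.expand_as(xt)], dim=1)  # crude time conditioning
    loss = ((velocity_net(inp) - target_v) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```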
[Diagram: optimal transport plan pairing samples between Dataset 1 and Dataset 2]
[Figure: LogNormal vs. emulated vs. challenge simulation maps]
How good are the emulated simulations? Comparison at same cosmology.
[Figure: challenge map vs. emulated map; power spectrum comparison, higher-order statistics, data distribution comparison]
Summary of our progress
[Figure: validation score during pre-training]

| Configuration | Score |
|---|---|
| EfficientNetV2 | 11.3856 |
| pre-training with LogNormal maps | 11.5214 |
| pre-training with OT maps | 11.527 |
| LR finetuning + early stopping | 11.6468 |
| ensembling | 11.6612 |
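Ensembling is one of the listed steps; a minimal sketch of the corresponding prediction averaging over several independently finetuned networks (placeholder models and shapes):

```python
# Average the predictions of several independently finetuned networks (placeholders).
import torch
import torch.nn as nn

models = [nn.Sequential(nn.Flatten(), nn.Linear(64, 2)) for _ in range(5)]  # stand-ins
maps = torch.randn(4, 1, 8, 8)

def ensemble_predict(models, x):
    with torch.no_grad():
        return torch.stack([m(x) for m in models]).mean(dim=0)

print(ensemble_predict(models, maps).shape)  # torch.Size([4, 2])
```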
Conclusion: Even with a constrained simulation budget, you can get surprisingly strong end-to-end SBI performance by (i) exploiting cheap surrogate simulations for pretraining, and (ii) pushing “standard” supervised training (augmentations, regularization, careful design) much harder than is typical.
=> This suggests that practical SBI for real surveys may require fewer expensive simulations than one would naively assume.