ILLUSTRIS TNG
C. Cuesta, C. Becker, S. Bose, C. Arnold and C. Baugh
Galaxy-Halo Connection
Hydro simulations
Empirical models
&
?
Full Physics
Dark Matter Only
1) Find 50 most bound DM particles
Halo #1
Halo #23
Halo #1
2) Find DMO halo with at least 50% of these particles
97% of halos matched
Bijective!
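The two matching steps can be sketched as follows. This is a minimal illustration, assuming that particle IDs are shared between the Full Physics and Dark Matter Only runs, that the most bound particles are taken from the Full Physics halo, and that "Bijective!" means a pair is kept only when the match holds in both directions. The names `match_halo` and `bijective_matches` and the dictionary layouts are hypothetical, not the authors' code.

```python
from collections import Counter


def match_halo(most_bound_ids, halo_of_particle, min_fraction=0.5):
    """Return the halo holding at least `min_fraction` of the given
    particle IDs, or None if no halo reaches that threshold."""
    counts = Counter(
        halo_of_particle[pid]
        for pid in most_bound_ids
        if pid in halo_of_particle
    )
    if not counts:
        return None
    halo, n_shared = counts.most_common(1)[0]
    return halo if n_shared >= min_fraction * len(most_bound_ids) else None


def bijective_matches(fp_most_bound, dmo_most_bound, fp_halo_of, dmo_halo_of):
    """Keep only pairs that match in both directions (FP -> DMO and DMO -> FP).

    fp_most_bound / dmo_most_bound: {halo_id: [most bound particle IDs]}
    fp_halo_of / dmo_halo_of:       {particle_id: halo_id} in each run
    """
    forward = {h: match_halo(ids, dmo_halo_of) for h, ids in fp_most_bound.items()}
    backward = {h: match_halo(ids, fp_halo_of) for h, ids in dmo_most_bound.items()}
    return {
        fp: dmo
        for fp, dmo in forward.items()
        if dmo is not None and backward.get(dmo) == fp
    }
```

In the talk the list passed as `most_bound_ids` would hold the 50 most bound particles of a halo; the threshold of 50% corresponds to the default `min_fraction=0.5`.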
Clustering as a function of halo mass: HOD
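For orientation, here is a sketch of what "a function of halo mass" means in an HOD. It uses an error-function step in log halo mass, a parametrisation that is common in the HOD literature for central galaxies; the slides do not say which form the baseline uses, so the functional form and the parameter names `log_m_min` and `sigma_log_m` are assumptions made for illustration.

```python
import math


def mean_central_occupation(log_mass, log_m_min, sigma_log_m):
    """Mean number of central galaxies in a halo of mass 10**log_mass.

    Assumed form: a smooth step in log halo mass, centred on log_m_min
    with width sigma_log_m. The only halo property used is the mass.
    """
    return 0.5 * (1.0 + math.erf((log_mass - log_m_min) / sigma_log_m))
```

The point of the sketch is that halo mass is the only input: two halos of equal mass receive the same occupation regardless of their concentration, spin, formation time or environment.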
Full Physics
Dark Matter Only
Shape: Mass, Radius, Concentration, ...
Dynamics: Velocity dispersion, Vmax, Velocity anisotropy, Spin, ...
Temporal Evolution: Formation time, Nmergers, ...
Environment: Mass in torus around halo
Full Physics
Dark Matter Only
Halo #1
Halo #23
Halo #1
Learn from Illustris
Decision Trees
Be greedy: try all splits
YES
NO
How good is a given split?
Loss function
Mean Squared Error
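The greedy search scored by mean squared error can be sketched in a few lines. This is a minimal pure-Python illustration, not the implementation used in the talk; `sse` and `best_split` are hypothetical helper names, and candidate thresholds are assumed to be midpoints between consecutive distinct feature values.

```python
def sse(values):
    """Sum of squared deviations from the mean (MSE times the sample count)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(X, y):
    """Greedy search over every feature and every candidate threshold.

    Each side of a split predicts the mean of its targets; the split with
    the lowest resulting MSE wins.
    Returns (feature_index, threshold, mse_after_split), or None if no
    feature has two distinct values.
    """
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            threshold = 0.5 * (lo + hi)
            left = [t for row, t in zip(X, y) if row[j] <= threshold]
            right = [t for row, t in zip(X, y) if row[j] > threshold]
            mse = (sse(left) + sse(right)) / n
            if best is None or mse < best[2]:
                best = (j, threshold, mse)
    return best
```

A full tree applies `best_split` recursively to the left and right subsets until a stopping rule, such as a maximum depth, is reached.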
Regularisation
Maximum depth
[Figure: error versus tree depth, showing underfitting, optimal and overfitting regimes, with boosting and bagging marked on the plot.]
[Figure: decision tree regression example; legend: target, data.]
Fight overfitting: Bagging
Bootstrap 1
Bootstrap 2
Bootstrap 3
Decision Tree 1
Decision Tree 2
Decision Tree 3
Average
Extras!
Out-of-bag errors (no need for a validation set)
Can also subsample the features (improves over greediness)
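Bagging with out-of-bag errors and feature subsampling can be sketched as below. This is a minimal pure-Python illustration that uses depth-1 trees (stumps) as the base learner to stay short; the function names `fit_stump` and `bagged_stumps` and their parameters are hypothetical, and a real application would use deeper trees from an established library.

```python
import random


def fit_stump(X, y, features):
    """Depth-1 regression tree restricted to the given feature indices."""
    mean_all = sum(y) / len(y)
    best = (float("inf"), None)
    for j in features:
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if err < best[0]:
                best = (err, (j, thr, ml, mr))
    if best[1] is None:  # no valid split: predict the mean
        return lambda row: mean_all
    j, thr, ml, mr = best[1]
    return lambda row: ml if row[j] <= thr else mr


def bagged_stumps(X, y, n_trees=50, n_features=None, seed=0):
    """Average of stumps, each fit on a bootstrap sample and a feature subset.

    Returns (predict, oob_mse). The out-of-bag MSE scores every row using
    only the trees whose bootstrap sample did not contain that row.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    n_features = n_features or d
    trees, oob_sum, oob_count = [], [0.0] * n, [0] * n
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]   # bootstrap: sample rows with replacement
        feats = rng.sample(range(d), n_features)     # random subset of the features
        tree = fit_stump([X[i] for i in idx], [y[i] for i in idx], feats)
        trees.append(tree)
        for i in set(range(n)) - set(idx):           # rows left out of this bootstrap
            oob_sum[i] += tree(X[i])
            oob_count[i] += 1

    def predict(row):
        return sum(t(row) for t in trees) / len(trees)

    scored = [i for i in range(n) if oob_count[i] > 0]
    if not scored:
        return predict, None
    oob_mse = sum((y[i] - oob_sum[i] / oob_count[i]) ** 2 for i in scored) / len(scored)
    return predict, oob_mse
```

Because each row is out of bag for roughly a third of the trees, `oob_mse` behaves like a validation error without holding any data back.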
Fight bias: Boosting
Focus on difficult samples: Gradient descent in function space!
Error
Previous prediction
New prediction
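For a squared-error loss, "gradient descent in function space" reduces to repeatedly fitting a small tree to the current residuals and adding a fraction of it to the previous prediction. The sketch below assumes a single input feature and depth-1 trees for brevity; `fit_stump`, `boost` and `learning_rate` are illustrative names, not the talk's code.

```python
def fit_stump(x, r):
    """Depth-1 regression tree on a single feature, fit to targets r."""
    best = None
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        thr = 0.5 * (lo + hi)
        left = [t for v, t in zip(x, r) if v <= thr]
        right = [t for v, t in zip(x, r) if v > thr]
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
        if best is None or err < best[0]:
            best = (err, thr, ml, mr)
    if best is None:  # all x identical: predict the mean
        mean = sum(r) / len(r)
        return lambda v: mean
    _, thr, ml, mr = best
    return lambda v: ml if v <= thr else mr


def boost(x, y, n_rounds=100, learning_rate=0.1):
    """Gradient boosting for squared error.

    New prediction = previous prediction + learning_rate * (tree fit to the error).
    The residual y - prediction is the negative gradient of 0.5 * (y - prediction)**2.
    """
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump(v) for p, v in zip(pred, x)]
    return lambda v: base + learning_rate * sum(s(v) for s in stumps)
```

Samples that are still poorly predicted carry the largest residuals, so each new tree concentrates on them; this is the sense in which boosting focuses on difficult samples.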
Bagging + Boosting =
Light Gradient Boosting Machine
(LGBM)
Model performance
Simplify the model
Clustering and environment
With
Without
Conclusions
- Trained an ensemble of decision trees to model the relation between stellar mass and dark matter halo properties.
- It reproduces the two-point correlation function of central galaxies in Illustris TNG, which the baseline HOD model does not.
- The model can be used to populate mock catalogues, based on the halo's:
- #TODO Satellite galaxies?
- Bias in environment has a strong effect on clustering.
What can we learn from the machine?
i) Decision Trees default: Sum of impurity gains (MSE reduction at a given split) per feature.
Dynamical range, number of splits
Correlations
ii) Difference in MSE after retraining a model without the feature of interest.
Uncorrelated features
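Option ii) can be sketched as a drop-column importance: retrain without one feature and record how much the MSE grows. The code below is a minimal pure-Python illustration with a depth-1 tree standing in for the real model; `fit_stump`, `mse` and `drop_column_importance` are hypothetical names, and for brevity the error is measured on the training data, whereas a held-out set would normally be used.

```python
def fit_stump(X, y):
    """Depth-1 regression tree over all features (stand-in for the real model)."""
    mean_all = sum(y) / len(y)
    best = (float("inf"), None)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = 0.5 * (lo + hi)
            left = [t for row, t in zip(X, y) if row[j] <= thr]
            right = [t for row, t in zip(X, y) if row[j] > thr]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            err = sum((t - ml) ** 2 for t in left) + sum((t - mr) ** 2 for t in right)
            if err < best[0]:
                best = (err, (j, thr, ml, mr))
    if best[1] is None:  # no usable feature: predict the mean
        return lambda row: mean_all
    j, thr, ml, mr = best[1]
    return lambda row: ml if row[j] <= thr else mr


def mse(model, X, y):
    """Mean squared error of a fitted model on (X, y)."""
    return sum((model(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def drop_column_importance(fit, X, y):
    """For each feature, retrain without it and return the increase in MSE."""
    baseline = mse(fit(X, y), X, y)
    importances = []
    for j in range(len(X[0])):
        X_drop = [row[:j] + row[j + 1:] for row in X]  # remove column j
        importances.append(mse(fit(X_drop, y), X_drop, y) - baseline)
    return importances
```

If two features carry the same information, dropping either one costs little because the other takes over, so both can appear unimportant; the measure is easiest to read when features are uncorrelated.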
Should I ask a question after this talk?
[Decision-tree flowchart with YES/NO branches. Questions: Is it almost lunch time? Do you want your colleagues to hate you? Do I care? Is it about magnetic fields? Outcomes: Ask! / Don't ask.]
A tree grows in Illustris TNG: the galaxy-halo connection learned by boosted decision trees
By Carol Cuesta