Photometric Redshifts (in HELP)
Kenneth Duncan
Cosmic Census - Oct 2017
Leiden Observatory
e.g. CANDELS UDS: Galametz et al. (2013)
You have your nice new multi-wavelength catalog. Now what?
Recipe 1:
Template fitting photo-z estimates
Step 1: The code
EAZY
Brammer et al. (2008)
LePhare
Arnouts et al. (1999)
Ilbert et al. (2006)
PhotoZ
Bender et al. (2001)
Hyper-Z
Bolzonella et al. (2000)
ZEBRA
Feldmann et al. (2006)
BPZ
Benitez (2000)
Total citations: ~2800
Then
Now
Step 2: The Templates
Step 3: zeropoint offsets and additional smoothing errors
Additional rest-frame errors
Corrections to the observed zeropoints
Brammer et al. (2008)
Dust
AGB Stars?
PAH/Dust emission/AGN?
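A minimal sketch of how zeropoint corrections could be estimated, assuming the templates have already been fitted to sources with spectroscopic redshifts, with z fixed at z_spec. The correction per band is taken as the median ratio of model to observed flux. The function name and the toy fluxes are hypothetical, and real codes typically iterate this step.

```python
from statistics import median

def zeropoint_offsets(observed, model):
    """Multiplicative flux correction per band.

    observed, model: dict mapping band name -> list of fluxes, one entry
    per spectroscopic source; model fluxes come from fits fixed at z_spec.
    Returns band -> factor by which observed fluxes should be multiplied.
    """
    offsets = {}
    for band, obs_fluxes in observed.items():
        ratios = [m / f for f, m in zip(obs_fluxes, model[band]) if f > 0]
        offsets[band] = median(ratios)
    return offsets

# Toy fluxes (arbitrary units): the g band is systematically 10% too faint.
observed = {"g": [1.0, 2.0, 4.0], "r": [2.0, 4.0, 8.0]}
model = {"g": [1.1, 2.2, 4.4], "r": [2.0, 3.9, 8.1]}
offsets = zeropoint_offsets(observed, model)
```

Using the median rather than the mean keeps a few badly fitted sources from driving the correction.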
Step 4: priors (optional)
Brammer et al. (2008)
Benitez (2000)
Magnitude
Spectral type
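To illustrate what a magnitude prior does, the sketch below multiplies a toy chi^2(z) curve with two equally good minima by a prior of the form p(z|m) ∝ z^γ exp[-(z/z0)^γ]. The values of z0 and γ are invented for illustration (in practice they depend on apparent magnitude), and the chi^2 curve is not from a real fit.

```python
import math

def magnitude_prior(z, z0, gamma):
    """Unnormalised prior p(z | m) ~ z^gamma * exp(-(z / z0)^gamma)."""
    return z ** gamma * math.exp(-((z / z0) ** gamma))

def posterior(z_grid, chi2, z0, gamma):
    """Combine a chi^2(z) curve with the prior and normalise to unit sum."""
    chi2_min = min(chi2)
    post = [math.exp(-0.5 * (c - chi2_min)) * magnitude_prior(z, z0, gamma)
            for z, c in zip(z_grid, chi2)]
    total = sum(post)
    return [p / total for p in post]

# Toy chi^2 curve with two equally good minima, at z = 0.5 and z = 3.0.
z_grid = [0.05 * i for i in range(1, 121)]
chi2 = [min(((z - 0.5) / 0.1) ** 2, ((z - 3.0) / 0.1) ** 2) for z in z_grid]

# Hypothetical prior parameters for a bright source (favours low redshift).
pz = posterior(z_grid, chi2, z0=0.6, gamma=2.0)
z_best = z_grid[pz.index(max(pz))]
```

With the likelihood alone the two solutions are indistinguishable; the prior selects the low-redshift one for this assumed bright source.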
Recipe 2:
Training based photo-z estimates
(aka machine learning)
Aside: Motivations for ML-based Photo-z's
Euclid
LSST
1. Speed
Euclid: ~1.5 billion galaxies
LSST: ~10 billion galaxies
Estimated time to run EAZY on all sources (on a desktop machine):
~2+ years (Euclid)
~14+ years (LSST)
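The arithmetic behind these estimates can be sketched as follows. The per-source fitting time is an assumption chosen to be consistent with the numbers quoted above, not a measured benchmark.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def serial_runtime_years(n_sources, seconds_per_source):
    """Wall-clock time to fit every source one after another."""
    return n_sources * seconds_per_source / SECONDS_PER_YEAR

# Assumed per-source fitting time on a single desktop machine.
SECONDS_PER_SOURCE = 0.045

euclid_years = serial_runtime_years(1.5e9, SECONDS_PER_SOURCE)
lsst_years = serial_runtime_years(10e9, SECONDS_PER_SOURCE)
```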
2. Improvements in accuracy
Sanchez et al. (2014)
Weak Lensing requirements:
Scatter
Bias
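Scatter and bias are usually measured against a spectroscopic test sample. The sketch below uses common robust definitions: median offset for the bias and the normalised median absolute deviation (NMAD) for the scatter, both in dz/(1+z). The outlier threshold of 0.15 is an assumed, commonly used value, and the toy redshifts are invented.

```python
from statistics import median

def photoz_metrics(z_phot, z_spec, outlier_cut=0.15):
    """Return (bias, scatter, outlier fraction) for a spec-z test sample."""
    dz = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    bias = median(dz)
    # Normalised median absolute deviation: robust against outliers.
    sigma_nmad = 1.48 * median([abs(d - bias) for d in dz])
    outlier_fraction = sum(abs(d) > outlier_cut for d in dz) / len(dz)
    return bias, sigma_nmad, outlier_fraction

z_spec = [0.5, 1.0, 1.5, 2.0, 3.0]
z_phot = [0.53, 0.98, 1.55, 2.0, 0.4]  # last source is a catastrophic outlier
bias, sigma_nmad, outlier_fraction = photoz_metrics(z_phot, z_spec)
```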
Step 1: Select your training sample
i.e. a representative subset of your sample with spectroscopic redshifts
Step 2: Pick your favourite regression/classification algorithm
Neural Networks
Self-organizing Maps (SOMs)
Deep learning
Support Vector Machines (SVM)
Naive Bayes
Gaussian Processes
Generalized Linear Models
Bayesian Network
k-Nearest Neighbour
Boosted Decision Trees
Randomised Forests
Relevance vector machines
Radial basis function networks
Normalised inner product nearest neighbour
Directional neighbourhood fitting
Voronoi tessellation density estimator
Non-conditional density estimation
Step 3: Train your regression/classification algorithm
Step 4: Apply to your science sample
magic happens somewhere here
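Steps 1 to 4 can be sketched with one of the simplest algorithms in the list, k-nearest neighbour regression. This is a toy example only: the colours and redshifts are invented, the function name is hypothetical, and a real application would use many more bands, photometric errors and a held-out validation sample.

```python
import math

def knn_photoz(train_colours, train_z, target_colours, k=3):
    """Mean spec-z of the k nearest training sources in colour space."""
    estimates = []
    for target in target_colours:
        by_distance = sorted(
            (math.dist(target, colours), z)
            for colours, z in zip(train_colours, train_z)
        )
        nearest = [z for _, z in by_distance[:k]]
        estimates.append(sum(nearest) / k)
    return estimates

# Step 1: training sample, here (g-r, r-i) colours with spectroscopic redshifts.
train_colours = [(0.2, 0.1), (0.3, 0.1), (0.25, 0.15),
                 (1.0, 0.6), (1.1, 0.5), (1.05, 0.55)]
train_z = [0.10, 0.12, 0.11, 0.80, 0.84, 0.82]

# Steps 2-4: for k-NN, "training" is just storing the sample; then apply it
# to science sources that have colours but no spectroscopic redshift.
science_colours = [(0.24, 0.12), (1.04, 0.56)]
z_phot = knn_photoz(train_colours, train_z, science_colours)
```

The estimate can never leave the redshift range of the training set, which is the first "Con" in practice: the result is only as good as the spectroscopic sample.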
Pros and Cons of ML Photo-z's
Pro:
- Fast and scalable
- Entirely empirical:
no concern about template choice
photometry systematics less of a problem
- Simple to include extra information:
properties such as size and morphology can help break degeneracies
Con:
- Entirely dependent on spectroscopic training sample
- Struggle more with inhomogeneous datasets (e.g. missing filters)
- Difficult to physically interpret solutions - e.g. rest-frame colours
Final step: (For all photo-z methods)
Fraction of spectroscopic redshifts within given confidence interval
Dahlen et al. (2012)
7/11 submitted photo-z estimates significantly overconfident for 1-sigma errors
Calibrating redshift pdfs
Wittman et al. (2016)
See also Bordoloi et al. (2010)
Under-confident
Over-confident
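A sketch of the calibration check, assuming each P(z) is sampled on a common grid and normalised to unit sum. For every source with a spectroscopic redshift, find the smallest highest-density credible level that contains z_spec, then compare the fraction of sources inside a given level with the level itself. Function names and the toy sample are hypothetical.

```python
import math

def gaussian_pz(z_grid, mu, sigma):
    """Toy Gaussian P(z), normalised to unit sum on the grid."""
    p = [math.exp(-0.5 * ((z - mu) / sigma) ** 2) for z in z_grid]
    total = sum(p)
    return [x / total for x in p]

def credible_level(pz, z_grid, z_spec):
    """Smallest highest-density credible level that contains z_spec."""
    # P(z) at the grid point closest to the spectroscopic redshift.
    i_spec = min(range(len(z_grid)), key=lambda i: abs(z_grid[i] - z_spec))
    threshold = pz[i_spec]
    return sum(p for p in pz if p >= threshold)

def coverage(levels, c):
    """Fraction of sources whose z_spec lies inside the c credible interval."""
    return sum(level <= c for level in levels) / len(levels)

z_grid = [0.01 * i for i in range(201)]
# Toy sample: every source claims z = 1.00 +/- 0.05, but the spectroscopic
# redshifts scatter much more widely than that.
z_spec = [1.0, 1.02, 0.97, 1.12, 0.85, 1.2]
levels = [credible_level(gaussian_pz(z_grid, 1.0, 0.05), z_grid, zs)
          for zs in z_spec]
frac_68 = coverage(levels, 0.683)
```

Well calibrated P(z) would give a fraction close to 0.683 at this level; a value well below it marks the estimates as over-confident, and the errors need to be broadened.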
Improving photo-z estimates even more...
the wisdom of crowds
Combine multiple photo-z estimates
Dahlen et al. (2012)
= Median of all photo-z estimates
= Median of best 5 photo-z estimates
See also Carrasco Kind & Brunner (2014)
Also works for diff. templates with the same code
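The simplest combination, the median of all estimates for each source, can be sketched as below. The three "codes" and their outputs are invented for illustration.

```python
from statistics import median

def median_combination(estimates):
    """estimates: list of per-code photo-z lists, aligned by source."""
    return [median(per_source) for per_source in zip(*estimates)]

# Three hypothetical codes, four sources; code_c fails badly on source 2.
code_a = [0.50, 1.20, 2.10, 0.30]
code_b = [0.55, 1.25, 2.00, 0.32]
code_c = [0.52, 3.90, 2.05, 0.28]
z_median = median_combination([code_a, code_b, code_c])
```

A single catastrophic failure from one code does not propagate into the combined estimate, which is the main attraction of the approach.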
Photometric redshift strategy for HELP
Gory details presented in...
Duncan et al. (2017a, 1709.09183)
and Duncan et al. (2017b, in prep)
Overall strategy for HELP
- Run photo-z estimates using 3 different template libraries:
  - eazy templates (stellar only)
  - Salvato et al. XMM-COSMOS library (stellar and AGN/QSO)
  - Michael Brown’s ‘Atlas of Galaxy SEDs’ (stellar and AGN)
- Separate galaxies and AGN dominated sources where possible (optical/IR/X-ray selection) -> optimise magnitude priors and calibration procedure for each set
- Combine individual estimates to produce consensus P(z) using hierarchical Bayesian Combination
1. Run photo-z estimates using 3 different template libraries
a) Zeropoint offset calculated separately for each individual template set
b) Lazy parallelisation of eazy: field split into many chunks and run in parallel.
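The "lazy" parallelisation amounts to splitting the catalogue and running independent jobs. A minimal sketch, in which `run_photoz_on_chunk` is a hypothetical stand-in for launching the fitting code on one chunk (no real eazy interface is assumed here):

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(rows, n_chunks):
    """Split a list of catalogue rows into n_chunks nearly equal pieces."""
    size, extra = divmod(len(rows), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        stop = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:stop])
        start = stop
    return chunks

def run_photoz_on_chunk(chunk):
    """Hypothetical stand-in for one independent photo-z run.

    In practice this would write the chunk to disk and launch the fitting
    code on it as an external process; here it returns dummy results.
    """
    return [(source_id, None) for source_id in chunk]

def run_in_parallel(rows, n_chunks=4, max_workers=4):
    chunks = split_into_chunks(rows, n_chunks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(run_photoz_on_chunk, chunks)  # keeps chunk order
        return [row for chunk_result in results for row in chunk_result]

source_ids = list(range(10))
chunks = split_into_chunks(source_ids, 4)
merged = run_in_parallel(source_ids)
```

Threads are sufficient in this sketch because the heavy lifting is assumed to happen in an external process; each source is fitted independently, so no communication between chunks is needed.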
2. Separate galaxies and AGN dominated spectra where possible - optimise magnitude priors and calibration procedure for each set
3. Combine individual estimates to produce consensus P(z) using hierarchical Bayesian Combination
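A sketch of a hierarchical Bayesian combination, assuming the commonly used form in which each input P_i(z) is mixed with a flat "bad measurement" distribution U(z) with weight f_bad, the mixtures are multiplied together (raised to 1/β to account for covariance between estimates), and f_bad is marginalised over. The f_bad grid, the flat prior on f_bad, β = 1 and the toy inputs are all assumptions for illustration and need not match the HELP implementation.

```python
import math

def gaussian_pz(z_grid, mu, sigma):
    """Toy Gaussian P(z), normalised to unit sum on the grid."""
    p = [math.exp(-0.5 * ((z - mu) / sigma) ** 2) for z in z_grid]
    total = sum(p)
    return [x / total for x in p]

def hb_combine(pz_list, f_bad_grid, beta=1.0):
    """Consensus P(z) from several P_i(z) on a shared redshift grid."""
    n_z = len(pz_list[0])
    uniform = 1.0 / n_z                    # U(z): flat over the grid
    consensus = [0.0] * n_z
    for f_bad in f_bad_grid:               # marginalise over f_bad
        for j in range(n_z):
            product = 1.0
            for pz in pz_list:
                mix = f_bad * uniform + (1.0 - f_bad) * pz[j]
                product *= mix ** (1.0 / beta)
            consensus[j] += product
    total = sum(consensus)
    return [c / total for c in consensus]

z_grid = [0.05 * i for i in range(81)]
# Two template sets agree on z ~ 1; a third prefers z ~ 2.5.
inputs = [gaussian_pz(z_grid, 1.0, 0.2),
          gaussian_pz(z_grid, 1.0, 0.2),
          gaussian_pz(z_grid, 2.5, 0.2)]
f_bad_grid = [0.05 * k for k in range(1, 20)]
consensus = hb_combine(inputs, f_bad_grid)
z_best = z_grid[consensus.index(max(consensus))]
```

A plain product of the three inputs would be close to zero everywhere because they disagree; allowing each one to be "bad" lets the consensus follow the majority while keeping a secondary peak at the minority solution.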
What HELP will produce
1. Photometric redshift catalogs, including:
- Primary and secondary solutions
- Calibrated uncertainty estimates
- A range of corresponding diagnostic plots for each field
2. Selection functions:
For a source with a given set of photometric properties...
a) what is the probability of a photo-z estimate existing in the HELP database
b) what is the probability of a reliable* photo-z estimate existing in the HELP database
*a very flexible definition
Where HELP can help in future...
Compilation and homogenisation of datasets make machine learning estimates a more viable option for some fields
Incorporating targeted ML estimates can dramatically improve estimates for AGN
Summary
Producing consistent, high-quality photo-zs for 1300 sq. deg. of the sky is a challenge... but manageable
The heterogeneous nature of the datasets makes template fitting the only feasible starting point
Bayesian combination of multiple redshift estimates provides near optimal solutions across multiple fields/source types
Calibrate your photo-z errors!
Review of photometric redshifts past, present and future. For the Lorentz Workshop Jun 20th-24th