federica bianco
astro | data science | data for good
University of Delaware
Department of Physics and Astronomy
Biden School of Public Policy and Administration
Data Science Institute
Rubin Legacy Survey of Space and Time
Deputy Project Scientist, Rubin Construction
Interim Head of Science, Rubin Operations
Applications of and opportunities for AI in the new era of time-domain astronomy
federica b. bianco
she/her
this slide deck is live at https://slides.com/federicabianco/AMLAW
The best way to view the slides is on the web (to see videos and animations). A flat (PDF) version of this deck would be largely diminished
explosions in the sky
how we study SNe
HELL YEAH!
2025
edge computing
Do we want more data??
SKA
(2025)
17B stars Ivezic+19
~10 million QSO Mary Loli+21
~50k Tidal Disruption Events Brickman+ 2020
~10k SuperLuminous Supernovae Villar+ 2018
~200 quadruply-lensed quasars Minghao+19, Ardense+24
~50 kilonovae Setzer+19, Andreoni+19 (+ ToO)
> 10 Interstellar Objects (. ?)
True Novelties!
well... it depends
2025
(2026)
edge computing
Is the data gonna also be better?
Grover+ 2021 https://doi.org/10.1093/mnras/stab1935
thank you Yogesh!
Sarkar+ 2023, https://doi.org/10.1093/mnras/stac3096
thank you Arumina!
visualization and concept credit: Alex Razim
Kaicheng Zhang et al 2016 ApJ 820 67
https://plasticc.org/data-release
deSoto+2024
Boone 2017
7% of LSST data
The rest
Data: PLAsTiCC
Model: salt2 (Guy+07) implemented with SNCOSMO (Barbary+2012)
transient data AI ready (see Alex's talk)
Rohan Pattnaik+ 2025
Dhanpal+2022
thank you Shravan
thank you Rohan
Williamson+23
Barna+ 2017
Howell 2011
time-domain spectra are just painful
Rubin will see ~1000 SN every night!
Credit: Alex Gagliano IAIFI fellow MIT/CfA
Site: Cerro Pachon, Chile
Funding: US NSF + DOE
To be transformational simultaneously in the four scientific areas Rubin needs:
1) a large telescope mirror to be sensitive - 8m (6.7m) deep survey
2) a large field-of-view for sky-scanning speed - 10 deg^2 wide survey
3) high spatial resolution, high quality images - 0.2''/pixel exquisite image quality
4) process images in real time and offline to produce live alerts and catalogs of all 37B objects
massive time domain dataset
Rubin Observatory Status
September 2016
5 / 2019
May 2022 - Telescope Mount Assembly
12/2022 TMA in action
weight 2e5 kg, max slew rate 0.2 rad/s
Most of the weight in a 10m disk
Angular momentum
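A back-of-the-envelope version of the angular momentum estimate, using the two numbers quoted above. This is only a sketch: it assumes the "10m disk" is a uniform solid disk of 10 m diameter rotating about its symmetry axis, which is a simplification of the real mount and not stated on the slide.

```python
# Values quoted on the slide.
mass_kg = 2e5          # weight of the moving structure
max_slew_rad_s = 0.2   # maximum slew rate

# ASSUMPTION (not on the slide): uniform solid disk, 10 m in diameter,
# rotating about its central axis, so I = (1/2) M R^2.
radius_m = 10.0 / 2.0


def disk_moment_of_inertia(mass, radius):
    """Moment of inertia of a uniform solid disk about its central axis."""
    return 0.5 * mass * radius**2


def angular_momentum(mass, radius, omega):
    """L = I * omega for rigid rotation about a fixed axis."""
    return disk_moment_of_inertia(mass, radius) * omega


inertia = disk_moment_of_inertia(mass_kg, radius_m)
ang_mom = angular_momentum(mass_kg, radius_m, max_slew_rad_s)
rot_energy = 0.5 * inertia * max_slew_rad_s**2

print(f"I     = {inertia:.2e} kg m^2")
print(f"L     = {ang_mom:.2e} kg m^2/s")
print(f"E_rot = {rot_energy:.2e} J")
```

If the 10 m is instead read as a radius, the moment of inertia (and hence L) goes up by a factor of 4.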
3024 science raft amplifier channels
Camera and Cryostat integration completed at SLAC in May 2022
Shutter and filter auto-changer integrated into camera body
LSSTCam undergoing final stages of testing at SLAC
July 2024: ComCam installed on the telescope after M1M2 installation - ComCam is a 144 Mpix version of LSSTCam
artist (me) impression of the first image taken by ComCam
Is the data gonna also be better?
magnitude limit single image r~24
magnitude limit 10 year stacks r~27
spatial resolution 0.2'' (seeing limited)
photometric precision 5mmag
photometric accuracy 10mmag
cadence.... that's a long story
At this level of precision, everything is variable, everything is blended, everything is moving.
SDSS
LSST-like HSC composite
| Field of View' | Image resolution' | DDFs' | Standard visit' | Photometric precision' | Photometric accuracy' | Astrometric precision' | Astrometric accuracy' |
|---|---|---|---|---|---|---|---|
| 9.6 sq deg | 0.2'' (seeing limited) | 5 DDF | 30 sec | 5 mmag | 10 mmag | 10 mas | 50 mas |
' requirement: ls.st/srd
SDSS 2x4 arcmin sq griz
MUSYC (Gawiser 2014) 1 mag shallower than LSST coadds
| Photometric filters' | saturation limit' | # visits* | mag single image* | mag coadd* | Nominal cadence |
|---|---|---|---|---|---|
| u, g, r, i, z, y | ~15, 16, 16, 16, 15, 14 | 53, 70, 185, 192, 168, 165 | 23.34, 23.2, 24.05, 23.55, 22.03 | 25.4, 26.9, 27.0, 26.5, 25.8, 24.9 | 2-3 visits per night |
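The jump from r~24 in a single image to r~27 in the 10 year stack can be checked with a one-line scaling. This is a rough sketch, assuming all visits are equally deep and background-limited so that S/N grows as sqrt(N); the function names are illustrative, and the real coadd depth depends on seeing, sky brightness, and which visits are included. The same block converts the 5 mmag photometric precision into a fractional flux error.

```python
import math


def coadd_depth(single_visit_depth, n_visits):
    """Limiting magnitude of a stack of n equal-depth visits.

    ASSUMPTION: background-limited images, S/N proportional to sqrt(N),
    so the gain is 2.5 * log10(sqrt(N)) = 1.25 * log10(N) magnitudes.
    """
    return single_visit_depth + 1.25 * math.log10(n_visits)


def mmag_to_fractional_flux(mmag):
    """Fractional flux change corresponding to a magnitude change in mmag."""
    return 10 ** (0.4 * mmag / 1000.0) - 1.0


r_single = 24.0   # single-image limit quoted above (r~24)
r_visits = 185    # r-band visit count quoted above

r_coadd = coadd_depth(r_single, r_visits)
precision = mmag_to_fractional_flux(5.0)

print(f"r coadd depth ~ {r_coadd:.2f} mag")
print(f"5 mmag ~ {100 * precision:.2f}% in flux")
```

With these inputs the stack comes out a little under 3 magnitudes deeper than a single visit, consistent with the r~27 quoted for the 10 year coadd.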
Discovery
The lifecycle of a time-domain project is complex and fertile with opportunities for AI solutions
Discovery Engine
10M alerts/night
Community Brokers
target observation managers
BABAMUL
Graphic credit: Francisco Förster Burón
Augmentation and distribution
Discovery
Distribution
Classification
Data Integration and Follow up
Ensemble Inference
Prediction
Discovery of Novelties
(A.K.A science!)
in <60 seconds:
Difference Image Analysis
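A toy illustration of the idea behind Difference Image Analysis: subtract a template from the search image and threshold what is left. This is a deliberately simplified sketch, not the Rubin pipeline: it assumes the two images are already aligned and share the same PSF, so the PSF-matching step of real DIA is skipped, and all source positions, fluxes, and noise levels are made up.

```python
import numpy as np

rng = np.random.default_rng(42)
NOISE = 1.0  # per-pixel Gaussian noise (toy value)


def make_image(shape, sources, sigma_psf=1.5):
    """Render Gaussian point sources on a noisy background (toy model)."""
    yy, xx = np.mgrid[: shape[0], : shape[1]]
    img = rng.normal(0.0, NOISE, shape)
    for y, x, flux in sources:
        img += flux * np.exp(-((yy - y) ** 2 + (xx - x) ** 2) / (2 * sigma_psf**2))
    return img


# Static sky (present in both images) and one new transient.
static = [(10, 12, 50.0), (30, 40, 80.0)]
transient = (22, 25, 40.0)

template = make_image((48, 64), static)
search = make_image((48, 64), static + [transient])

# search - template = difference
difference = search - template

# The difference of two independent noisy images has noise sqrt(2) * NOISE.
threshold = 5.0 * np.sqrt(2.0) * NOISE
candidates = np.argwhere(difference > threshold)
peak = np.unravel_index(np.argmax(difference), difference.shape)

print("brightest residual at (row, col):", peak)
print("pixels above 5 sigma:", len(candidates))
```

The static sources cancel in the subtraction and only the transient survives the threshold. In real data, imperfect alignment and PSF mismatch leave residuals at the static sources too, which is where bogus rejection comes in.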
Can we replace DIA with ANN?
TransiNet: Sedaghat + Mahabal 2017
in 60 seconds:
Difference Image Analysis + Bogus rejection
feature extraction + Random Forest
AUTOSCAN: Goldstein+ 2017
search - template = difference
96% accurate
Tatiana Acero-Cuellar, UNIDEL fellow, LSSTC data science fellow
search - template = difference
92% accurate
Tatiana Acero-Cuellar, UNIDEL fellow, LSSTC data science fellow
WORKING WITH RUBIN AP TEAM TO DEVELOP THE ML-RELIABILITY SCORE OF RUBIN ALERTS
What is the network learning?
What can we learn from the AI?
search
template
difference
template
search
Tatiana Acero-Cuellar, UNIDEL fellow, LSSTC data science fellow
Interpretable AI
Robust AI
Anomaly detection
F. Förster et al 2021 AJ 161 242
AI tasks
Distribution
Classification
Data Integration and Follow up
Ensemble Inference
Prediction
Discovery of Novelties
(A.K.A science!)
Discovery
Kepler satellite EB
LSST (simulated) EB
Classification from sparse data: Lightcurves
The PLAsTiCC challenge winner, Kyle Boone, was a grad student at Berkeley, and did not use a Neural Network!
He won $2,000
Lochner et al 2018
without redshift
with redshift
Methodological issues with these approaches
CNNs are not designed to ingest uncertainties. Passing them as an extra image layer "works", but it is not clear why, since the convolutions over the flux and error layers are averaged together after the first layer
Gaussian processes work by imposing a kernel that represents the covariance in the data (how the data depend on time or time/wavelength). Imposing the same kernel for different time-domain phenomena is incorrect in principle
=> bias toward known classes!
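To make the point about the imposed kernel concrete, here is a minimal Gaussian process interpolation of a sparse lightcurve with a squared-exponential kernel. It is a sketch under stated assumptions: the kernel choice, the amplitude, the two length scales, and the toy lightcurve are all illustrative, and none of them is taken from the classifiers discussed here.

```python
import numpy as np


def rbf_kernel(t1, t2, amplitude, length_scale):
    """Squared-exponential covariance between two sets of times."""
    dt = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (dt / length_scale) ** 2)


def gp_predict(t_obs, flux, flux_err, t_new, amplitude, length_scale):
    """Posterior mean and standard deviation of a zero-mean GP."""
    K = rbf_kernel(t_obs, t_obs, amplitude, length_scale) + np.diag(flux_err**2)
    K_s = rbf_kernel(t_new, t_obs, amplitude, length_scale)
    K_ss = rbf_kernel(t_new, t_new, amplitude, length_scale)
    mean = K_s @ np.linalg.solve(K, flux)
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std


# Toy sparse lightcurve (days, arbitrary flux units): made-up values.
t_obs = np.array([0.0, 3.0, 4.0, 9.0, 15.0, 16.0, 30.0])
flux = 10.0 * np.exp(-0.5 * ((t_obs - 8.0) / 5.0) ** 2)
flux_err = np.full_like(t_obs, 0.3)
t_new = np.linspace(0.0, 30.0, 61)

# Same data, two imposed time scales: "fast" vs "slow" transient prior.
mean_short, std_short = gp_predict(t_obs, flux, flux_err, t_new, 10.0, 2.0)
mean_long, std_long = gp_predict(t_obs, flux, flux_err, t_new, 10.0, 10.0)

print("max |difference| between the two interpolations:",
      np.abs(mean_short - mean_long).max())
print("max predictive std (short, long):", std_short.max(), std_long.max())
```

The data are identical in both calls; only the imposed length scale changes, and with it the interpolated lightcurve that a downstream classifier would see in the gaps.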
Neural processes replace the imposed kernel with a learned one - ask Siddharth Chaini!
Dr. Somayeh Khakpash
LSSTC Catalyst Fellow, Rutgers
Rare classes will become common, but how do we know what we are looking at and classify different objects for sample studies?
Data-Driven Photometric Templates for stripped-envelope supernovae (SESN)
on the job market!
Khakpash et al. 2024 ApJS https://arxiv.org/pdf/2405.01672
FASTlab Flash highlight
Methodological issues with these approaches
Attention requires positional encoding
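Why this matters for lightcurves: plain self-attention treats its input as an unordered set, so without an encoding of the observation times the model cannot tell when each point was taken. The sketch below shows this with a toy single-head attention (identity projections) and a sinusoidal encoding evaluated at irregular times. The encoding follows the standard transformer recipe applied to continuous time, which is an assumption for illustration and not necessarily what any specific classifier uses; all names and values are hypothetical.

```python
import numpy as np


def self_attention(x):
    """Toy single-head self-attention with identity Q/K/V projections."""
    scores = x @ x.T / np.sqrt(x.shape[1])
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ x


def time_encoding(times, d_model=8, max_period=10000.0):
    """Sinusoidal encoding evaluated at (irregular) observation times.

    d_model must be even: sines fill even columns, cosines odd columns.
    """
    times = np.asarray(times, dtype=float)
    i = np.arange(d_model // 2)
    freqs = 1.0 / max_period ** (2 * i / d_model)
    angles = times[:, None] * freqs[None, :]
    pe = np.empty((times.size, d_model))
    pe[:, 0::2] = np.sin(angles)
    pe[:, 1::2] = np.cos(angles)
    return pe


rng = np.random.default_rng(0)
times = np.array([0.0, 1.2, 7.5, 8.1, 30.4])      # irregular sampling (days)
features = rng.normal(size=(times.size, 8))       # stand-in for flux features
perm = np.array([4, 2, 0, 3, 1])                  # scramble which flux goes with which time

# Without time information: pooled output ignores the ordering entirely.
pooled_plain = self_attention(features).mean(axis=0)
pooled_plain_shuffled = self_attention(features[perm]).mean(axis=0)

# With time encoding: the same fluxes at different times give a different output.
pe = time_encoding(times)
pooled_timed = self_attention(features + pe).mean(axis=0)
pooled_timed_shuffled = self_attention(features[perm] + pe).mean(axis=0)

print("order-blind without encoding:", np.allclose(pooled_plain, pooled_plain_shuffled))
print("order-blind with encoding:   ", np.allclose(pooled_timed, pooled_timed_shuffled))
```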
Hlozek et al, 2020
DATA CURATION IS THE BOTTLENECK
models contributed by the community had issues:
- different formats (spectra, lightcurves, theoretical, data-driven)
- the people who contributed the models were included in 1 paper at best
- incompleteness
- systematics
- imbalance
Khakpash+ 2024 showed that the models were biased for SN Ibc
AVOCADO, SCONE: all these models are trained on a biased dataset and are currently being used for classification
Ibc data-driven templates vs PLAsTiCC
Ic templates vs ELAsTiCC
survey optimization
multimodal data analysis
and pixel to science
why not images too?
LSST data products
Time Domain Science
Static Science
Alerts based
Catalog based
Deep stack based
data right holders only
Rubin In-Kind Contribution Program
world public!
10 million alerts per night!
LSST survey strategy optimization
Exploring the Transient and Variable Optical Sky
LSST Science Book (2009)
Rubin has involved the community to an unprecedented level in survey design: this is a uniquely "democratic" process!
2024
the butterfly effect
NGC 4565 is an edge-on spiral galaxy about 30 to 50 million light-years away. The faculty at the IUCAA used an AI model (emulator) to predict the hidden physical parameters of the galaxy, wrongly estimating the DM content of NGC 4565, and claimed that a novel process for galaxy formation should be taken into consideration.
Unfortunately, this was the result of a model hallucination.
The galaxy was featured in many social media posts, gaining rapid notoriety, but upon retraction it was canceled. The galaxy is suing IUCAA, claiming emotional damage and loss of revenue.
We use astrophysics as a neutral and safe sandbox to learn how to develop and apply powerful tools.
Deploying these tools in the real world can do harm.
Ethics of AI is essential training that all data scientists should receive.
Why does this AI model whiten Obama's face?
Simple answer: the data is biased. The algorithm is fed more images of white people
But really, would the opposite have been acceptable? The bias is in society
Joy Buolamwini
thank you!
University of Delaware
Department of Physics and Astronomy
Biden School of Public Policy and Administration
Data Science Institute
federica bianco
Rubin Construction
Deputy Project Scientist
fbianco@udel.edu