federica bianco
astro | data science | data for good
Federica B. Bianco
University of Delaware
Physics and Astronomy
Biden School of Public Policy and Administration
Data Science Institute
Vera C. Rubin Observatory
Deputy Project Scientist - Construction
Interim Head of Science - Operations
Riley Clark Every Datapoint Counts: Stellar Flares as a Case Study of Atmosphere-Aided Transients in the Rubin LSST Era
Somayeh Khakpash Automatic Generation of Magnification Maps for Lensed Quasars and Supernovae Using Deep Learning
Tatiana Acero-Cuellar Forward modeling of dust and transients - a method for the generation of synthetic Light Echoes
Willow Fox Fortino
Shar Daniels Exploring the Sub-Second Transient Sky with Continuous-Readout Images and Neural Networks (poster)
Siddharth N Chaini Light curve classification using distance metrics (poster)
motivation
A two-way street
AI enables discoveries in astronomy;
astronomy will provide high-impact research problems
and large, complex, and open datasets that
will result in new AI breakthroughs
Historical perspective
1/6
Galileo Galilei 1610
Following: Djorgovski
https://events.asiaa.sinica.edu.tw/school/20170904/talk/djorgovski1.pdf
Experiment driven
Einstein 1916
Theory driven | Falsifiability
Experiment driven
Ulam 1947
Theory driven | Falsifiability
Experiment driven
Simulations | Probabilistic inference | Computation
http://www-star.st-and.ac.uk/~kw25/teaching/mcrt/MC_history_3.pdf
Theory driven | Falsifiability
Experiment driven
Simulations | Probabilistic inference | Computation
the 1947-today
the 2000s-today
Theory driven | Falsifiability
Experiment driven
Simulations | Probabilistic inference | Computation
Data | Survey astronomy | Computation | pattern discovery
from commissioning observations
to scanning the sky and giving away the data (open science model!)
New stress on the infrastructure
Rubin LSST starting in 2025 expects:
~1000 images per night
10M alerts per night (5sigma changes)
∼200 quadruply-lensed quasars Minghao+19
~50 kilonovae Setzer+19, Andreoni+19 (+ ToO)
>10 interstellar objects
~10k SuperLuminous Supernovae Villar+ 2018
~50k Tidal Disruption Events Bricman+ 2020
~10 million QSO Mary Loli+21
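As a rough illustration of what a "5sigma change" means for the alert stream listed above, the sketch below flags sources whose difference-image flux deviates from zero by at least five times its uncertainty. The function name `flag_alerts`, the array names, and the toy numbers are assumptions made for illustration; this is not the Rubin alert pipeline.

```python
import numpy as np

def flag_alerts(diff_flux, diff_flux_err, threshold=5.0):
    """Return a boolean mask: True where |difference flux| / error >= threshold."""
    diff_flux = np.asarray(diff_flux, dtype=float)
    diff_flux_err = np.asarray(diff_flux_err, dtype=float)
    snr = np.abs(diff_flux) / diff_flux_err
    return snr >= threshold

# Toy, made-up difference fluxes and uncertainties (arbitrary units).
flux = np.array([0.3, -12.0, 4.9, 51.0])
err = np.array([1.0, 2.0, 1.0, 10.0])
mask = flag_alerts(flux, err)  # brightening and fading both count as changes
```

Taking the absolute value means a source that fades by 5 sigma triggers the same way as one that brightens.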
Gartner report 2001
V1: Volume
Number of bytes
Number of pixels
Number of astrophysical objects in a dataset x number of features measured
V2: Variety
Diverse science return from the same dataset
e.g. cosmology+stellar physics
cosmo
Multiwavelength
Multimessenger
Images and spectra
V4: Veracity
This V will refer to both data quality and availability (added in 2012)
Inclusion of uncertainty in inference and simulations
V3: Velocity
Real time analysis, edge computing, data transfer
Exquisite image quality
all over the sky
over and over again
SDSS image circa 2000
HSC image circa 2018
when you look at the sky at this resolution and this depth...
everything is blended and everything is changing
Plot axes: log number of megapixels; etendue (area x FoV)
3.2 Gpix Rubin camera
learning by example
(supervised learning)
pattern discovery
(unsupervised learning)
The DOE LSST camera
at the Vera C. Rubin Observatory
has 3.2 gigapixels
to scan the whole sky
at high resolution
every few nights
400 4K HD TVs to display a single LSST Camera image
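The "400 TVs" figure can be checked with one line of arithmetic. The sketch below assumes that "4K" means the 3840 x 2160 UHD format, which the slide does not specify.

```python
CAMERA_PIXELS = 3.2e9          # 3.2 gigapixels, as quoted for the LSST camera
UHD_4K_PIXELS = 3840 * 2160    # assumption: "4K" is the 3840 x 2160 UHD format

n_tvs = CAMERA_PIXELS / UHD_4K_PIXELS
print(f"{n_tvs:.0f} 4K TVs")   # prints "386 4K TVs", i.e. roughly 400
```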
"Data that stresses the infrastructure"
John R. Mashey Chief Scientist, SGI, mid-1990s
"Data that does not fit in memory"
Big Data in astronomy papers
(source: ADS)
occurrence of term in Google-books corpus, 1996-2016 https://books.google.com/ngrams
- Big Data
- Data Science (x30)
- Artificial Intelligence (x10)
Astronomical phenomena happen at all time scales and require federated observations to collect sufficient data to unravel the physics
cepheid
Astronomy’s Discovery Chain
Community Brokers
target observation managers
when did the first review of Neural Networks in astronomy come out?
number of arXiv:astro-ph submissions with abstracts containing one or more of the strings: ‘machine learning’, ‘ML’, ‘artificial intelligence’, ‘AI’, ‘deep learning’ or ‘neural network’.
Extreme levels of automation
2/6
Discovery Engine
10M alerts/night
Community Brokers
target observation managers
BABAMUL
F. Förster et al 2021 AJ 161 242
Classification and rare event detection
3/6
High Energy Physics leading the way
1988
from slides by Kyle Cranmer
Simulation Based Inference
Physical parameters (particles)
Simulate hundreds of millions of particle interactions
Calculate P(data | physics) in all observational spaces
Statistical models of measurements performed by independent teams of scientists are combined a posteriori without loss of detail
=> the discovery of the Higgs boson
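To make the idea of simulation based inference concrete, here is a minimal sketch using rejection sampling (approximate Bayesian computation), the simplest member of that family. It is a swapped-in toy, not the high energy physics workflow described above: the Poisson forward model, the function names, the prior range, and the tolerance are all assumptions chosen for illustration.

```python
import numpy as np

def simulate(theta, n_events, rng):
    """Toy forward model (assumption): counts drawn from a Poisson with rate theta."""
    return rng.poisson(theta, size=n_events)

def summary(data):
    """Summary statistic used to compare simulation and observation: the sample mean."""
    return data.mean()

def abc_rejection(observed, prior_low, prior_high, n_draws=20000, tol=0.1, seed=0):
    """Keep prior draws whose simulated summary lies within tol of the observed one."""
    rng = np.random.default_rng(seed)
    s_obs = summary(observed)
    thetas = rng.uniform(prior_low, prior_high, size=n_draws)
    accepted = [
        theta for theta in thetas
        if abs(summary(simulate(theta, observed.size, rng)) - s_obs) < tol
    ]
    return np.array(accepted)

rng = np.random.default_rng(42)
observed = simulate(4.0, 200, rng)   # pretend data generated with a true rate of 4.0
posterior = abc_rejection(observed, prior_low=0.0, prior_high=10.0)
print(posterior.mean(), posterior.std())
```

The accepted parameter values approximate the posterior without ever writing down P(data | physics) explicitly; only the ability to simulate is required.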
now developing NN to model summary statistics from high dimensional feature spaces
High Energy Physics leading the way
The rise of Bayesian Deep Learning
Astronomical anomalies and rare classes
Forcing Serendipity
Sparse, unevenly sampled Kepler time series
2D t-SNE projection of feature space
Weirdness score
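One simple way to assign a "weirdness score" in a feature space is the mean distance of each object to its nearest neighbours: isolated objects score high. The sketch below uses that k-nearest-neighbour distance as a stand-in; it is not necessarily the method behind the figure on this slide, and the function name, the choice of k, and the toy features are assumptions.

```python
import numpy as np

def weirdness_score(features, k=5):
    """Mean Euclidean distance of each object to its k nearest neighbours."""
    X = np.asarray(features, dtype=float)
    # Standardize each feature so that no single one dominates the distance.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # All pairwise distances (fine for small samples; memory grows as N^2).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)   # ignore each object's distance to itself
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)

rng = np.random.default_rng(0)
ordinary = rng.normal(0.0, 1.0, size=(200, 3))   # toy "ordinary" light-curve features
oddball = np.array([[8.0, -8.0, 8.0]])           # one injected outlier
scores = weirdness_score(np.vstack([ordinary, oddball]))
most_weird = int(np.argmax(scores))              # index of the highest-scoring object
```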
and yet...
discovered by eye
Generative AI
4/6
SN classification from Spectra
Spectra reveal progenitors of stellar explosions
but spectra are expensive to take
Lochner et al 2018
Classification from sparse data: Lightcurves
If you are trying to simulate the whole Universe... you are going to be computationally limited
Using AI to speed up simulations
by learning scale relations
Transfer Learning
from simulation to real data
Physics Informed Models
(and go full circle back to Galaxy morphology classification with few-shot learning!)
Physics informed AI
5/6
Application regime:
-infinity - 1950's
theory driven: little data, mostly theory, falsifiability and all that...
-1980's - today
data driven: lots of data, drop theory and use associations, black-box models
lots of data yet not enough for entirely automated decision making
complex theory that cannot be solved analytically
combine it with some theory
Nonlinear PDEs are hard to solve!
via a modified loss function that includes the residuals of the prediction and the residuals of the PDE
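The modified loss can be sketched in a few lines. To stay self-contained, the example below replaces the neural network with a two-parameter trial function and the PDE with the ODE du/dt + u = 0; both are simplifying assumptions, as are all function and variable names. The structure of the loss (data misfit plus equation residual at collocation points) is the point.

```python
import numpy as np

def model(t, params):
    """Hypothetical trial solution u(t) = a * exp(b * t), standing in for a network."""
    a, b = params
    return a * np.exp(b * t)

def model_dt(t, params):
    """Analytic time derivative of the trial solution."""
    a, b = params
    return a * b * np.exp(b * t)

def physics_informed_loss(params, t_data, u_data, t_colloc, weight=1.0):
    """Data misfit plus the residual of du/dt + u = 0 at the collocation points."""
    data_residual = model(t_data, params) - u_data
    ode_residual = model_dt(t_colloc, params) + model(t_colloc, params)
    return np.mean(data_residual ** 2) + weight * np.mean(ode_residual ** 2)

t_data = np.array([0.0, 0.5, 1.0])      # a few "observations"
u_data = np.exp(-t_data)                # noise-free data from the true solution
t_colloc = np.linspace(0.0, 2.0, 50)    # points where only the physics is enforced

loss_true = physics_informed_loss((1.0, -1.0), t_data, u_data, t_colloc)
loss_wrong = physics_informed_loss((1.0, -0.5), t_data, u_data, t_colloc)
```

The collocation points extend beyond the data, so the physics term constrains the solution where no observations exist; `weight` sets the balance between the two terms.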
By federica bianco