Opportunity

the era of AI

experiment driven science -∞-1900

theory driven science 1900-1950

computationally driven science 1950-1990

data driven science 1990-2010

the fourth paradigm - Jim Gray, 2009

AI driven science? 2010...

Input

x

y

output

data

prediction

physics

Machine Learning

Input

x

y

output

function

f(x)

Machine Learning

Input

x

y

output

f(x)

f(x) = mx + b

b

m

m: slope

b: intercept

parameters

x

y

learn

goal: find the right m and b that turn x into y
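
A minimal sketch of what "learning" m and b can look like, assuming the goal is to minimize the squared difference between f(x) = mx + b and y (ordinary least squares). The function name fit_line and the toy data are illustrative, not from the slides.

```python
def fit_line(xs, ys):
    """Return (m, b) minimizing sum((m*x + b - y)**2), i.e. ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    m = sxy / sxx              # slope
    b = mean_y - m * mean_x    # intercept
    return m, b


# Toy data generated from y = 2x + 1 (hypothetical example values).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
m, b = fit_line(xs, ys)
print(m, b)
```

The two numbers returned are the parameters: everything the model "knows" about how x turns into y.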

Machine Learning

https://symposia.obs.carnegiescience.edu/series/symposium2/ms/freedman.ps.gz

Tree models

(at the basis of Random Forest and Gradient Boosted Trees)
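
A sketch of the single step a tree model repeats: pick the threshold on one feature that best separates the classes. It assumes Gini impurity as the split criterion (one common choice); the feature values and class labels are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(values, labels):
    """Find the threshold on one feature that minimizes the weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        threshold = 0.5 * (pairs[i - 1][0] + pairs[i][0])
        left = [lab for v, lab in pairs if v <= threshold]
        right = [lab for v, lab in pairs if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


# Hypothetical feature (e.g. a concentration index) and hypothetical classes.
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = ["spiral", "spiral", "spiral", "elliptical", "elliptical", "elliptical"]
threshold, impurity = best_split(values, labels)
print(threshold, impurity)
```

A full tree applies this split recursively to each side; a Random Forest or Gradient Boosted Trees model combines many such trees.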

Machine Learning

Galaxy Zoo

p(class)

extracted

features vector

p(class)

pixel values tensor

f(x)

Frank Rosenblatt, 1958

The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence.

The embryo - the Weather Bureau's $2,000,000 "704" computer - learned to differentiate between left and right after 50 attempts in the Navy demonstration

NEW NAVY DEVICE LEARNS BY DOING; Psychologist Shows Embryo of Computer Designed to Read and Grow Wiser

July 8, 1958

GPT-3

175 Billion Parameters

3,640 petaFLOP/s-days

Kaplan+ 2020

x

y

A Neural Network is a kind of function that maps input to output

Input

output

hidden layers

latent space
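
A sketch of a neural network as a plain function from input to output, assuming one hidden layer with a ReLU activation. The weights below are hand-picked placeholders, not trained values; in practice they are the parameters that get learned.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: weights[j][i] connects input i to unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Elementwise max(0, v), the nonlinearity between layers."""
    return [max(0.0, v) for v in values]


def network(x, params):
    """Map an input vector x to an output y; also return the hidden (latent) vector."""
    hidden = relu(dense(x, params["W1"], params["b1"]))  # latent space representation
    y = dense(hidden, params["W2"], params["b2"])
    return y, hidden


# Placeholder parameters (hypothetical): 2 inputs -> 3 hidden units -> 1 output.
params = {
    "W1": [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]],
    "b1": [0.0, 0.0, 0.0],
    "W2": [[1.0, 2.0, 1.0]],
    "b2": [0.5],
}
y, latent = network([2.0, 1.0], params)
print(y, latent)
```

The hidden vector is the input's coordinates in the latent space; the output layer only ever sees that representation.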

Opportunity

big data in astronomy

HELL YEAH!

2025

edge computing

Will we get more data???

SKA

(2025)

edge computing

Rubin LSST Transients by the numbers

17B stars (x10) Ivezic+19

~10 million QSO (x10) Mary Loli+21

~50k Tidal Disruption Events (from ~150) Bricman+ 2020

~10k SuperLuminous Supernovae (from ~200) Villar+ 2018

~400 strongly lensed SN Ia (from 10) Arendse+24

~50 kilonovae (from 2) Setzer+19, Andreoni+19 (+ ToO)

> 10 Interstellar Objects (from 2...?)

True Novelties!

Discovery

Distribution

Classification

Data Integration and Follow up

Ensemble Inference

Prediction

Discovery of Novelties

(A.K.A science!)

in <60 seconds:

Difference Image Analysis
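
A deliberately simplified sketch of the idea behind Difference Image Analysis: subtract a template image from the new (search) image and flag pixels that changed. It assumes the two images are already aligned and PSF-matched, which is where most of the real work lies; the pixel values and threshold are made up.

```python
def difference_image(search, template):
    """Pixel-by-pixel subtraction of a template from a search image."""
    return [[s - t for s, t in zip(srow, trow)]
            for srow, trow in zip(search, template)]


def detect(diff, threshold):
    """Return (row, col) of pixels whose absolute difference exceeds the threshold."""
    return [(r, c)
            for r, row in enumerate(diff)
            for c, value in enumerate(row)
            if abs(value) > threshold]


# Hypothetical 3x3 cutouts: a static source in the center, a new source bottom right.
template = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]
search = [[10, 11, 10], [10, 50, 10], [10, 10, 90]]
diff = difference_image(search, template)
candidates = detect(diff, threshold=5)
print(candidates)
```

The static source cancels out in the difference, so only the new source survives as a candidate.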

Can we replace DIA with ANN?

TransiNet: Sedaghat + Mahabal 2017

in 60 seconds:

Difference Image Analysis + Bogus rejection

feature extraction + Random Forest

AUTOSCAN: Goldstein+ 2017

96% accurate

Tatiana Acero-Cuellar, UNIDEL fellow, LSSTC data science fellow

92% accurate

Tatiana Acero-Cuellar, UNIDEL fellow, LSSTC data science fellow

What is the network learning?

What can we learn from the AI?

search

template

difference

template

search

Interpretable AI

Robust AI

Anomaly detection

The Automatic Learning for the Rapid Classification of Events (ALeRCE) Alert Broker

F. Förster et al 2021 AJ 161 242

AI tasks

Challenge

data encoding

well... it depends

2025

(2026)

edge computing

Is the data also going to be better?

visualization and concept credit: Alex Razim

Kaicheng Zhang et al 2016 ApJ 820 67

deSoto+2024

https://plasticc.org/data-release

Boone 2017

7% of LSST data

The rest

federica bianco - fbianco@udel.edu

Rubin will see ~1000 SN every night!

Credit: Alex Gagliano IAIFI fellow MIT/CfA

Photometric Classification of transients

KIC 3858884: A hybrid δ Scuti pulsator in a highly eccentric eclipsing binary Maceroni+2014

Photometric Classification of transients

Kepler satellite EB

LSST (simulated) EB

lightcurves make really bad tensors

is transient data AI ready?

Variable sizes of data vectors
Uneven sampling
Different sampling at different wavelengths
Phase gaps can be months long over ~1 year
Multiple relevant time scales
Heteroscedastic errors
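
One common workaround for the first item on the list (variable sizes of data vectors), sketched under the assumption that each lightcurve is a list of (time, flux, flux_err) observations: pad every lightcurve to the same length and carry a mask so a model can ignore the padding. The function name and the sample values are illustrative. Padding does nothing for uneven sampling, gaps, or heteroscedastic errors.

```python
def pad_lightcurves(lightcurves, pad_value=0.0):
    """Pack variable-length lightcurves into equal-length rows plus a validity mask.

    Each lightcurve is a list of (time, flux, flux_err) tuples.
    Returns (padded, mask): mask is 1 for real observations and 0 for padding.
    """
    max_len = max(len(lc) for lc in lightcurves)
    padded, mask = [], []
    for lc in lightcurves:
        n_pad = max_len - len(lc)
        rows = [list(obs) for obs in lc]
        rows += [[pad_value] * 3 for _ in range(n_pad)]
        padded.append(rows)
        mask.append([1] * len(lc) + [0] * n_pad)
    return padded, mask


# Two hypothetical lightcurves with different numbers of observations.
lightcurves = [
    [(0.0, 1.0, 0.1), (3.5, 1.2, 0.3)],
    [(0.0, 0.9, 0.2)],
]
padded, mask = pad_lightcurves(lightcurves)
print(padded)
print(mask)
```

The result has a fixed shape (objects x epochs x 3), which is what a tensor-based model needs; the mask keeps the padded rows from being read as data.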

GAIA

Olling+2015

Willow Fox Fortino

UDelaware

Optimal deep learning architectures for transients' spectral classification

As seen in Muthukrishna+2019

Kaggle PLAsTiCC challenge

AVOCADO classifier

https://arxiv.org/abs/1907.04690

Classification from sparse data: Lightcurves

The PLAsTiCC challenge winner, Kyle Boone, was a grad student at Berkeley, and did not use a Neural Network!

He won $2,000

Lochner et al 2018

https://arxiv.org/pdf/1812.00515.pdf

without redshift

with redshift

Methodological issues with these approaches

CNNs are not designed to ingest uncertainties. Passing them as an extra image layer "works", but it is not clear why, since the convolutions over the flux and error layers are averaged after the first layer

https://medium.com/deeplearningmadeeasy/how-to-add-uncertainty-to-your-neural-network-afb5f855e66a

Gaussian processes work by imposing a kernel that represents the covariance in the data (how the data depend on time or time/wavelength). Imposing the same kernel on different time-domain phenomena is incorrect in principle

=> bias toward known classes
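
A sketch of what "imposing a kernel" means in practice, assuming a zero-mean Gaussian process with a squared-exponential kernel and per-point noise variances on the diagonal. The kernel choice, the length scales, and the sample lightcurve are assumptions for illustration; this is not the kernel used by any specific classifier named here.

```python
import math


def rbf_kernel(t1, t2, length_scale):
    """Squared-exponential covariance between two times (unit amplitude)."""
    return math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def gp_mean(times, fluxes, errors, t_new, length_scale):
    """Posterior mean of the GP at t_new given noisy observations."""
    n = len(times)
    cov = [[rbf_kernel(times[i], times[j], length_scale)
            + (errors[i] ** 2 if i == j else 0.0)  # heteroscedastic noise term
            for j in range(n)] for i in range(n)]
    alpha = solve(cov, fluxes)
    return sum(rbf_kernel(t_new, t, length_scale) * a for t, a in zip(times, alpha))


# Hypothetical observations; the same data interpolated with two imposed length scales.
times, fluxes, errors = [0.0, 2.0, 4.0], [1.0, 3.0, 2.0], [1e-3, 1e-3, 1e-3]
short = gp_mean(times, fluxes, errors, 1.0, length_scale=0.5)
long = gp_mean(times, fluxes, errors, 1.0, length_scale=3.0)
print(short, long)
```

The same three points give different interpolated fluxes depending on the length scale, so whatever kernel is imposed shapes the features that a downstream classifier sees.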

Methodological issues with these approaches

Neural processes replace the imposed kernel with a learned one

Garnelo+2018

2017

Vaswani+ 2017, Attention Is All You Need

Willow Fox Fortino

UDelaware

When they go high, we go low

Classification power vs spectral resolution for SNe subtypes

Olling+2015

Time series + low resolution spectrophotometry (R~3)

Precision photometry: broad optical bands (G, BP, and RP) with space-based precision but a bright magnitude limit (g~21)

Challenge

benchmark datasets

we badly need better benchmark datasets

Hlozek et al, 2020

DATA CURATION IS THE BOTTLENECK

models contributed by the community came in

- different formats (spectra, lightcurves, theoretical, data-driven)

- the people who contributed the models were included in 1 paper at best

- incompleteness

- systematics

- imbalance

Khakpash+ 2024 showed that the models were biased for SN Ibc

AVOCADO, SCONE, all these models are trained on a biased dataset and are currently being used for classification

Ibc data-driven templates vs PLAsTiCC

AI assisted modelling

Tardis uses a neural network to replace the radiative transfer model

AI-assisted superresolution cosmological simulations

Yin Li+2021

LOW RES SIM

HIGH RES SIM

AI-AIDED HIGH RES

multimodal data analysis

and pixel to science

2022

why not images too?

latent space

lightcurve latent space rep

image

latent space rep

SN 2018cow

SN2024uwq

Perley+2018

survey optimization

Opportunity

Rubin LSST survey design

Rubin has involved the community to an unprecedented level in survey design: this is a uniquely "democratic" process!

2024

AI for survey design

Challenge

small amounts of data

-infinity - 1950's

theory driven: little data, mostly theory, falsifiability and all that...

1980's - today

data driven: lots of data, drop theory and use associations, black-box models

lots of data yet not enough for entirely automated decision making

complex theory that cannot be solved analytically

combine it with some theory

PiNN

Non Linear PDEs are hard to solve!

Provide training points at the boundary with the calculated solution (trivial because we have boundary conditions)

Provide the physical constraint: make sure the solution satisfies the PDE

via a modified loss function that includes residuals of the prediction and residual of the PDE

\mathrm{loss} = L_2 + \mathrm{PDE} = \sum (u_\theta - u)^2 + \left(\partial_t u_\theta + u_\theta \, \partial_x u_\theta - (0.01/\pi) \, \partial_{xx} u_\theta\right)^2
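
A sketch of how the two terms of this loss can be evaluated for a candidate solution u(t, x). It assumes central finite differences for the derivatives (a real PiNN would use automatic differentiation on the network u_theta) and sums the squared PDE residual over a set of collocation points. The function names and the sample points are illustrative.

```python
import math

NU = 0.01 / math.pi  # viscosity coefficient appearing in the loss above


def pde_residual(u, t, x, h=1e-3):
    """Burgers residual u_t + u * u_x - NU * u_xx via central finite differences."""
    u_t = (u(t + h, x) - u(t - h, x)) / (2 * h)
    u_x = (u(t, x + h) - u(t, x - h)) / (2 * h)
    u_xx = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / h ** 2
    return u_t + u(t, x) * u_x - NU * u_xx


def pinn_loss(u, data_points, collocation_points):
    """Squared data misfit at known points plus squared PDE residual elsewhere."""
    data_term = sum((u(t, x) - u_true) ** 2 for t, x, u_true in data_points)
    pde_term = sum(pde_residual(u, t, x) ** 2 for t, x in collocation_points)
    return data_term + pde_term


# Points where the solution is known (t, x, u), here taken at t = 0 for illustration.
data_points = [(0.0, 0.5, 0.5), (0.0, -0.5, -0.5)]
# Points where only the PDE constraint is enforced.
collocation_points = [(0.5, 0.2), (1.0, -0.3)]

good = pinn_loss(lambda t, x: x / (1.0 + t), data_points, collocation_points)
bad = pinn_loss(lambda t, x: x, data_points, collocation_points)
print(good, bad)
```

Both candidates match the data points exactly, but only u = x / (1 + t) also satisfies the PDE, so only the PDE term tells them apart. That is the extra information the physics constraint provides when data are scarce.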

Raissi, Perdikaris, Karniadakis 2017

https://www.epa.gov/energy/greenhouse-gas-equivalencies-calculator#results

late layers learn complex aggregate specialized features

early layers learn simple generalized features (like lines for CNN)

prediction "head"

original data

trained extensively on large amounts of data to solve generic problems

Foundational AI models

We use the ILSVRC-2012 ImageNet dataset with 1k classes
and 1.3M images, its superset ImageNet-21k with
21k classes and 14M images and JFT with 18k classes and
303M high-resolution images.

Typically, we pre-train ViT on large datasets, and fine-tune to (smaller) downstream tasks. For
this, we remove the pre-trained prediction head and attach a zero-initialized D × K feedforward
layer, where K is the number of downstream classes.
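
A sketch of the fine-tuning step quoted above: the pre-trained prediction head is dropped and a zero-initialized D x K layer is attached. The feature vector stands in for the output of a pre-trained backbone and is made up; D = 4 and K = 3 are arbitrary illustrative sizes.

```python
import math


def softmax(logits):
    """Convert logits to class probabilities (shifted by the max for stability)."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def new_head(d, k):
    """Zero-initialized D x K weight matrix for K downstream classes."""
    return [[0.0] * k for _ in range(d)]


def apply_head(features, head):
    """Project a D-dimensional feature vector onto K class logits."""
    k = len(head[0])
    return [sum(f * row[j] for f, row in zip(features, head)) for j in range(k)]


# Stand-in for a pre-trained backbone's output (hypothetical values, D = 4).
features = [0.3, -1.2, 0.8, 2.0]
head = new_head(d=4, k=3)
probs = softmax(apply_head(features, head))
print(probs)
```

With a zero-initialized head every class starts out equally likely, so fine-tuning begins from a neutral prediction while the backbone keeps its pre-trained features.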

ethics of AI

Challenge + Opportunity

Knowledge is power

With great power comes great responsibility

"Sharing is caring"

Astrophysical data is a sandbox. It has no social value, no privacy risk. We can safely learn about how bias builds into algorithms and how to correct it
Ethics of AI is a critical element of the education of a technologist
AI is a transferable skill - use it for good!

the butterfly effect

We use astrophysics as a neutral and safe sandbox to learn how to develop and apply powerful tools.

Deploying these tools in the real world can do harm.

Ethics of AI is essential training that all data scientists should receive.

Joy Buolamwini

models are neutral, the bias is in the data (or is it?)

is a word I am borrowing from Margaret Atwood to describe the fact that the future is us.

However loathsome or loving we are, so will we be.

Whereas utopias are the stuff of dreams and dystopias are the stuff of nightmares, ustopias are what we create together when we are wide awake

https://www.youtube.com/watch?v=QO3nY_u6hos

US-TOPIA

thank you!

University of Delaware

Department of Physics and Astronomy

Biden School of Public Policy and Administration

Data Science Institute

federica bianco

fbianco@udel.edu

bit.ly/biancotfs25

Challenges in Space-Based Observations

Limited Field of View: Space telescopes often have smaller fields of view compared to ground-based surveys.
Data Latency: Delays in data transmission and processing can affect rapid follow-up.
Resource Allocation: Competition for telescope time can limit observations of certain transients.... LET'S NOT TRIGGER 3 ToOs ON THE SAME TRANSIENT!!

(Racusin et al., 2008)

GRB 080319B, the brightest optical burst ever observed

SWIFT

rapid response

Lin+ 2023

SWIFT

HST, Chandra, SPITZER

...

Kepler, K2, TESS

high precision dense time series

Olling+2015

Opportunities and Challenges of Machine Learning
and AI for the next-generation astronomical survey
