
AION-1: Omnimodal Foundation Model for Astronomical Sciences
L. Parker, F. Lanusse, J. Shen, O. Liu, T. Hehir, L. Sarra, L. Meyer, M. Bowles, S. Wagner-Carena, H. Qu, S. Golkar, A. Bietti, H. Bourfoune, P. Cornette, K. Hirashima, G. Krawezik, R. Ohana, N. Lourie, M. McCabe, R. Morel, P. Mukhopadhyay, M. Pettee, B. Regaldo-Saint Blancard, K. Cho, M. Cranmer, S. Ho

Survey Astronomy: A Data-Rich and Multimodal Scientific Domain


Credit: DESI collaboration/DESI Legacy Imaging Surveys/LBNL/DOE, KPNO/CTIO/NOIRLab/NSF/AURA/unWISE
Modern astronomy relies on large surveys of the sky with a variety of instruments, leading to a wide diversity of data types.
Our objective: build an Omnimodal Foundation Model that bridges data silos to unlock cross-modal insights.

120 TB of Training Data, Spanning Diverse Science Cases


(Blanco Telescope and Dark Energy Camera. Credit: Reidar Hahn/Fermi National Accelerator Laboratory)






(Subaru Telescope and Hyper Suprime Cam. Credit: NAOJ)




(Dark Energy Spectroscopic Instrument)



(Sloan Digital Sky Survey. Credit: SDSS)


(Gaia Satellite. Credit: ESA/ATG)
- Galaxy formation
- Cosmology
- Stellar physics
- Galaxy archaeology
- ...
Standardizing all Modalities Through Tokenization

- For each modality class (e.g. image, spectrum) we build dedicated metadata-aware tokenizers
- For AION-1, we integrate 39 different modalities (different instruments, different measurements, etc.)
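
As a toy illustration of what "tokenizing" a measurement with its metadata can mean, the sketch below quantizes continuous values into integer token ids and prefixes them with a token identifying the instrument and band. This is not the AION-1 tokenizer: the uniform binning, the `BAND_IDS` table, and all function names are assumptions made up for this example.

```python
import numpy as np

def tokenize_scalar(values, lo, hi, n_bins=256):
    """Map continuous values in [lo, hi] to integer token ids in [0, n_bins - 1]."""
    values = np.asarray(values, dtype=float)
    clipped = np.clip(values, lo, hi)
    # Scale to bin indices; the upper edge is folded into the last bin.
    ids = np.floor((clipped - lo) / (hi - lo) * n_bins).astype(int)
    return np.minimum(ids, n_bins - 1)

def detokenize_scalar(ids, lo, hi, n_bins=256):
    """Return the bin-center value for each token id."""
    ids = np.asarray(ids)
    return lo + (ids + 0.5) * (hi - lo) / n_bins

# Hypothetical metadata vocabulary: one id per (instrument, band) pair.
BAND_IDS = {("HSC", "g"): 0, ("HSC", "r"): 1, ("DECam", "g"): 2}

def tokenize_with_metadata(values, instrument, band, lo, hi, n_bins=256):
    """Prefix the value tokens with a token describing where they came from."""
    # Metadata ids are offset past the value ids so the two ranges never collide.
    meta_token = n_bins + BAND_IDS[(instrument, band)]
    return np.concatenate([[meta_token], tokenize_scalar(values, lo, hi, n_bins)])

tokens = tokenize_with_metadata([0.0, 0.5, 1.0], "HSC", "r", lo=0.0, hi=1.0, n_bins=4)
print(tokens)  # first entry is the metadata token, the rest are value tokens
```

Once every modality is expressed as integer tokens in a shared vocabulary, a single sequence model can consume any mix of them.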



Any-to-Any Modeling with Generative Masked Modeling
- Training is done by pairing observations of the same objects from different instruments.
- The model is trained by cross-modal generative masked modeling (Mizrahi et al. 2023).
=> Learns the joint distribution and all conditional distributions of the provided modalities.
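
A minimal sketch of the masking step behind this training scheme, assuming each object is already tokenized per modality: a random subset of tokens is shown to the model and a disjoint random subset is held out as prediction targets. Because the visible and hidden subsets change at every step, the model is pushed to predict any modality from any other. The function name, the token budgets, and the uniform sampling are assumptions for illustration; the sampling scheme used in the cited work or in AION-1 may differ.

```python
import numpy as np

def sample_masked_split(tokens_by_modality, n_input, n_target, rng):
    """Randomly split one object's tokens into visible inputs and masked targets.

    tokens_by_modality: dict mapping a modality name to a 1-D array of token ids.
    Returns two lists of (modality, position, token_id) tuples with no overlap.
    """
    flat = [(name, pos, int(tok))
            for name, toks in tokens_by_modality.items()
            for pos, tok in enumerate(toks)]
    if n_input + n_target > len(flat):
        raise ValueError("token budget exceeds the number of available tokens")
    order = rng.permutation(len(flat))
    inputs = [flat[i] for i in order[:n_input]]
    targets = [flat[i] for i in order[n_input:n_input + n_target]]
    return inputs, targets

rng = np.random.default_rng(0)
# Hypothetical object observed by two instruments.
obj = {"image": np.arange(6), "spectrum": np.arange(100, 104)}
inputs, targets = sample_masked_split(obj, n_input=4, n_target=3, rng=rng)
print("visible:", inputs)
print("to predict:", targets)
```

In training, the visible tokens would be fed to the encoder and the model would be scored (for example with cross-entropy) on how well it predicts the held-out tokens.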




AION-1 Family of Models


Models trained on the Jean Zay 4 Supercomputer (1400 H100 GPUs)




Example of Out-of-the-Box Capabilities


Survey translation

Spectrum super-resolution




Adaptation to Downstream Scientific Use-Cases







DINOv2

Example-Based Retrieval of Rare Objects


nDCG@10 score
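
For reference, nDCG@10 (normalized discounted cumulative gain over the top 10 results) can be computed as sketched below. The sketch assumes the linear-gain form of DCG, which coincides with the exponential-gain form when relevance is binary (lens or not a lens); the exact variant used for the reported scores is not stated here, and the function names are this example's own.

```python
import numpy as np

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items of a ranked list."""
    rel = np.asarray(relevances, dtype=float)[:k]
    # Rank i (1-based) is discounted by log2(i + 1).
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum(rel / discounts))

def ndcg_at_k(relevances, k=10):
    """nDCG@k, where `relevances` lists every candidate's relevance in ranked order."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0.0:
        return 0.0  # no relevant item exists, so the score is defined as 0
    return dcg_at_k(relevances, k) / ideal_dcg

# Hypothetical retrieval result: 1 marks a true match, 0 a miss.
ranked = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1]
print(round(ndcg_at_k(ranked, k=10), 3))
```

A score of 1 means all relevant objects are ranked at the very top; relevant objects pushed lower in the list (or out of the top 10) reduce the score.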



From a pool of 500,000 candidate lenses...

Please Come See Us at Our Poster for More!
Thank you for watching!
NeurIPS AION-1
By eiffl
Short talk for NeurIPS