François Lanusse
CNRS Researcher @ AIM, CEA Paris-Saclay
Polymathic AI
astro-ph abstracts mentioning Deep Learning, CNN, or Neural Networks
The vast majority of these results have relied on supervised learning and networks trained from scratch.
=> In practice, this limits how easily deep learning can be used for analysis and discovery
Transposing these methodologies to scientific data and problems brings unique challenges
Credit: Melchior et al. 2021
Credit: DESI collaboration/DESI Legacy Imaging Surveys/LBNL/DOE & KPNO/CTIO/NOIRLab/NSF/AURA/unWISE
Collaborative project with about 30 contributors
Presented at the NeurIPS 2024 Datasets and Benchmarks Track
Ground-based imaging from Legacy Survey
Space-based imaging from JWST
Most General
Most Specific
Single model capable of processing all types of data
Independent models for all types of data
Bytes Are All You Need (Horton et al. 2023)
AstroCLIP (Parker et al. 2024)
AstroCLIP
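AstroCLIP aligns embeddings of different observations of the same object with a CLIP-style contrastive objective. The block below is a minimal, standard-library sketch of a symmetric InfoNCE loss, not the authors' implementation: the function names, the temperature value, and the toy vectors are all illustrative assumptions.

```python
import math


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class of row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(image_emb, spectrum_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of image_emb and row i of spectrum_emb are assumed to describe
    the same object; every other pairing in the batch acts as a negative.
    """
    img = [normalize(v) for v in image_emb]
    spec = [normalize(v) for v in spectrum_emb]
    # Cosine similarity between every image and every spectrum, scaled.
    logits = [[sum(a * b for a, b in zip(u, v)) / temperature for v in spec]
              for u in img]
    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits)
                  + diagonal_cross_entropy(transposed))


# Toy batch (made-up numbers): three objects with 2-d embeddings.
images = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
matched_spectra = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
shuffled_spectra = [matched_spectra[1], matched_spectra[2], matched_spectra[0]]

loss_matched = symmetric_info_nce(images, matched_spectra)
loss_shuffled = symmetric_info_nce(images, shuffled_spectra)
```

Correctly paired embeddings yield a lower loss than shuffled pairs, which is the signal that pulls the two modalities into a shared embedding space during training.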
Early Fusion Multimodal Models
Flamingo: a Visual Language Model for Few-Shot Learning (Alayrac et al. 2022)
Chameleon: Mixed-Modal Early-Fusion Foundation Models (Chameleon team, 2024)
Accepted at NeurIPS 2025, spotlight presentation at NeurIPS 2025 AI4Science Workshop
Project led by:
François Lanusse
Liam Parker
Jeff Shen
Tom Hehir
Ollie Liu
Lucas Meyer
Sebastian Wagner-Carena
Helen Qu
Micah Bowles
(Blanco Telescope and Dark Energy Camera. Credit: Reidar Hahn/Fermi National Accelerator Laboratory)
(Subaru Telescope and Hyper Suprime Cam. Credit: NAOJ)
(Dark Energy Spectroscopic Instrument)
(Sloan Digital Sky Survey. Credit: SDSS)
(Gaia Satellite. Credit: ESA/ATG)
Accepted at NeurIPS 2025 Machine Learning for Physical Sciences Workshop
Jeff Shen
Survey translation
Spectrum super-resolution
Conventional scientific workflow with deep learning
Conventional researchers @ CMU
Circa 2016
CMU DeepLens (Lanusse et al. 2017)
Foundation Model-based Scientific Workflow
Already taken care of
=> Let's discuss embedding-based adaptation
Adaptation at low cost
with simple strategies:
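# Tokenize the HSC images for the pretrained model
x_train = Tokenize(hsc_images, modality='HSC')
# Attach an attentive-pooling adaptation head to the Aion-B base model
model = FineTunedModel(base='Aion-B',
                       adaptation='AttentivePooling')
# Fit on the labeled training set, then predict on held-out inputs
model.fit(x_train, y_train)
y_test = model.predict(x_test)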
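To make the attentive-pooling adaptation above more concrete, here is a minimal forward-pass sketch in plain Python. It is an illustration under simplifying assumptions, not the AION-1 code: `attentive_pool`, `linear_head`, the single query vector, and the toy numbers are hypothetical, and the key/value projections and training loop of a real head are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(tokens, query):
    """Pool a sequence of token embeddings into one vector.

    Each token is scored against a single (learnable) query vector with
    scaled dot-product attention; the output is the weighted mean of tokens.
    """
    dim = len(query)
    scores = [sum(q * t for q, t in zip(query, tok)) / math.sqrt(dim)
              for tok in tokens]
    weights = softmax(scores)
    pooled = [sum(w * tok[j] for w, tok in zip(weights, tokens))
              for j in range(dim)]
    return pooled, weights


def linear_head(pooled, weight, bias):
    """Scalar prediction from the pooled embedding."""
    return sum(w * x for w, x in zip(weight, pooled)) + bias


# Toy example: three frozen token embeddings of dimension 2 (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]  # would be learned during adaptation

pooled, weights = attentive_pool(tokens, query)
prediction = linear_head(pooled, weight=[0.5, -0.5], bias=0.1)
```

In this setup only the query and the linear head would be trained while the base model's embeddings stay frozen, which is what keeps the adaptation cost low.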
Inputs:
measured fluxes
Inputs:
measured fluxes + image
Trained on ->
Eval on ->
DINOv2
Segmenting central bar and spiral arms in galaxy images based on Galaxy Zoo 3D
nDCG@10 score
Spotlight at 2025 NeurIPS AI4Science Workshop
Nolan Koblischke
nDCG@10 score
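For reference, nDCG@10 (normalized discounted cumulative gain over the top 10 results) can be computed as in the sketch below. It assumes linear gains and an ideal ranking built from the same list of retrieved relevances; some evaluations instead use exponential gains (2^rel - 1) or an ideal ranking drawn from the full collection, and the slides do not specify which variant was used.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k results (linear gain)."""
    return sum(rel / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """DCG@k divided by the DCG@k of the best possible ordering.

    `relevances` holds the graded relevance of each retrieved item,
    in the order the system returned them.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0
    return dcg_at_k(relevances, k) / ideal


# Toy rankings with made-up relevance grades.
perfect = ndcg_at_k([3, 2, 1, 0])
reversed_order = ndcg_at_k([0, 1, 2, 3])
nothing_relevant = ndcg_at_k([0, 0, 0])
```

A score of 1 means the most relevant items appear first; relevant items placed lower in the ranking are discounted logarithmically by position.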
Polymathic's recipe for developing Multimodal Scientific Models
Engagement with Scientific Communities
Data Curation And Aggregation
Dedicated ML R&D
AION-1 papers will be on arXiv next week, and the models will be available for download!
Thank you for listening!