Large Science Models:

Foundation Models for
Generalizable Insights Into Complex Systems

with Psycho-social Application 

PI: Ishanu Chattopadhyay, PhD

Assistant Professor of Biomedical Informatics & Computer Science

University of Kentucky

DARPA-EA-25-02-05-MAGICS-PA-025

HR0011-26-3-E016

Apr 2026

Agenda

  1. Progress description
    • Afrobarometer (>350K participants in 34 countries in Africa, LSM model construction in progress)
  2. Prolific survey design
  3. Milestone reports 2 and 3 will be shared by next week

 

  4. Beyond MAGICS

Current Coverage

surveys: GSS, Eurobarometer, World Values Survey, Afrobarometer
participants: 4,052,616
countries: 193
years: 1972-2025
survey items: 200-1600
  • GSS
  • Eurobarometer
  • Afrobarometer
  • WVS

Milestones                           

1 Kickoff Meeting: A briefing on the technical plan for the effort, including the milestone schedule and the path to accomplishing the objectives of the agreement. Government acceptance / Kickoff meeting briefing slides. Due date (nlt): Month 1 after award start
2 Validation plan: Detailed validation plan, including description, acquisition plan, and justification for the ground truth data, and a description of the metrics and benchmarks to be used to measure performance. Government acceptance / Technical report as described. Due date (nlt): Month 1
3

Milestone Title: Dataset Acquisition and LSM Inference

Technical goal: a) Dataset acquisition (10 social survey datasets acquired: GSS, ANES, CES, Eurobarometer, etc.) b) Infer LSM models for each dataset using 50% random samples, with multiple LSMs trained on different random splits of each dataset.

Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success. Due date (nlt): Month 2
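The split scheme in the technical goal above can be sketched as follows. This is a minimal illustration only: the function name `random_half_splits`, the seed handling, and the toy record count are assumptions, not the project's actual pipeline.

```python
import random

def random_half_splits(n_records, n_splits, seed=0):
    """Return n_splits (train_idx, heldout_idx) pairs, each a random 50/50 split.

    Hypothetical helper: one LSM would be inferred on each train_idx set,
    with the matching heldout_idx set kept for out-of-sample validation.
    """
    rng = random.Random(seed)
    half = n_records // 2
    splits = []
    for _ in range(n_splits):
        idx = list(range(n_records))
        rng.shuffle(idx)  # independent random permutation for each split
        splits.append((sorted(idx[:half]), sorted(idx[half:])))
    return splits

# Toy usage: 3 independent 50% splits of a 10-record dataset.
splits = random_half_splits(n_records=10, n_splits=3, seed=42)
for train_idx, heldout_idx in splits:
    print(train_idx, heldout_idx)
```

Seeding each run makes the splits reproducible, which matters when several LSMs per dataset are compared against each other.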
4

Milestone Title: Masked sample reconstruction

Technical goal: Validate LSM predictive accuracy via censored-sample reconstruction on out-of-sample data from each dataset. Demonstrate a statistically significant reduction in LSM distance post-reconstruction relative to post-masking. Target: at least 50% improvement in the reconstruction error metric over 1) random imputation and 2) median imputation

Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success. Due date (nlt): Month 4
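The two imputation baselines and the 50% target above can be sketched as follows. This is a hedged illustration: mean absolute error on the masked cells stands in for the LSM distance metric, which is not specified here, and all function names and the toy response grid are assumptions.

```python
import random
import statistics

def mask_cells(rows, frac, rng):
    """Choose a random fraction of (row, item) cells to censor."""
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    k = max(1, int(frac * len(cells)))
    return rng.sample(cells, k)

def baseline_errors(rows, masked, rng):
    """Mean absolute error of random and median imputation on the masked cells.

    Assumes every item keeps at least one unmasked response.
    """
    masked_set = set(masked)
    observed = {
        j: [rows[i][j] for i in range(len(rows)) if (i, j) not in masked_set]
        for j in range(len(rows[0]))
    }
    rand_err = med_err = 0.0
    for i, j in masked:
        truth = rows[i][j]
        rand_err += abs(rng.choice(observed[j]) - truth)        # random imputation
        med_err += abs(statistics.median(observed[j]) - truth)  # median imputation
    return rand_err / len(masked), med_err / len(masked)

def relative_improvement(model_err, baseline_err):
    """Fractional error reduction over a baseline (assumes baseline_err > 0)."""
    return 1.0 - model_err / baseline_err

# Toy ordinal responses: 6 participants x 3 survey items.
rows = [[1, 2, 3], [2, 2, 4], [1, 3, 5], [2, 2, 3], [1, 2, 4], [2, 3, 5]]
rng = random.Random(0)
masked = mask_cells(rows, frac=0.25, rng=rng)
rand_err, med_err = baseline_errors(rows, masked, rng)
print(len(masked), rand_err, med_err)
```

Under this sketch, a reconstruction error `model_err` would meet the target when `relative_improvement(model_err, rand_err)` and `relative_improvement(model_err, med_err)` are both at least 0.5.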
5

Milestone Title: Model drift sensing validation

Technical goal: Demonstrate that the LSM framework can reliably sense when the underlying model drifts. Assess whether the model drift statistic is stationary for samples drawn from the same survey wave of our datasets, and reliably indicates non-stationary drift for samples from different survey waves. Target: Model drift statistic must have statistical significance at the 5% level for survey waves 5 years apart for at least GSS, CES and Eurobarometer. Deliverable: detailed documentation on all 10 datasets

Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success. Due date (nlt): Month 6
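One way to check the 5% significance target above is a two-sample permutation test, sketched below. This is an assumption-laden illustration: the LSM drift statistic is not defined here, so the absolute difference in means of a hypothetical per-sample score is used as a placeholder, and the toy scores are synthetic.

```python
import random

def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Two-sample permutation test p-value for a placeholder drift statistic.

    The statistic (absolute difference of means) is a stand-in; any scalar
    drift statistic computed from two sample sets could be swapped in.
    """
    rng = random.Random(seed)

    def stat(x, y):
        return abs(sum(x) / len(x) - sum(y) / len(y))

    observed = stat(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel samples under the no-drift null
        if stat(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_perm + 1)

# Toy scores: wave_b is wave_a shifted, mimicking drift between waves.
wave_a = [0.1 * i for i in range(20)]
wave_b = [x + 5.0 for x in wave_a]
p_drift = permutation_p_value(wave_a, wave_b)
p_same = permutation_p_value(wave_a, wave_a)
print(p_drift < 0.05, p_same)
```

Samples from the same wave should yield large p-values (no detected drift), while waves 5 years apart should fall below 0.05 if the target is met.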
6

Milestone Title: Data sufficiency assessment capability

Technical goal: Use the conservation of complexity principle to show that the LSM framework can sense data deficiency and sufficiency.

Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success. Analysis results on all 10 datasets. Due date (nlt): Month 8
7

Milestone Title: Social Theory and Competing Hypotheses Adjudication

Technical goal: a) Social Theory Hypothesis Assessment: Polarization is an inevitable attractor b) Adjudicate between the competing hypotheses on whether socio-economic identity, or belief proximity and latent opinion-space geometry, is more predictive of specific opinion / belief outcomes

Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success. Due date (nlt): Month 10
8

Final milestone meeting and report (one month prior to award end date): The final briefing and final report should summarize all work completed on the project, highlighting accomplishments, lessons learned, unexpected outcomes, and challenges requiring further research.

Technical artifact delivery (Software release, evaluation results, source code, models, etc.)

Government acceptance / Technical report as described. For software: GitHub repository with deployable code, complete with example notebooks. Due date (nlt): Month 11

Milestone Title / Detailed Description

Exit Criteria / Deliverable

Due Date (nlt)

Milestone #

\(\checkmark\)

\(\checkmark\)

DTAG: Global Digital Twin of Opinions

positive: conservative, negative: liberal

Digital twin for 2022 GSS 

*“Exposure to opposing views on social media can increase political polarization” by Christopher A. Bail et al., published in PNAS in September 2018 (Vol. 115, No. 37, pp. 9216–9221; DOI: 10.1073/pnas.1804840115)

In contrast to Bail et al.*,

  • we are not presenting new information
  • we can move opinions up and down by varying the presented queries

Prolific Survey Design

  • Exempt research under 45 CFR 46
  • Minimal-risk research with administrative approval
  • Present different sets of predetermined items to show that we can move the ideology index along a predicted trend
  • Adaptive questions to show "change in trend" on cue

US participants from Prolific panel

Timeline

approval (6-8 wk)

run 1 (1 week)

run 3 (1 week)

analysis (3 weeks)

6 months

Next Meeting

  • Theory on synthetic data performance and sample complexity
  • New application domains: Emergenet, metabolomic analysis for clinical diagnosis

LSM

Lab-test for ASD with >90% AUC at 1 year 

LSM

Lab-test for Interstitial Lung Disease with blood draw (85-90% AUC)

Metabolomic profile

LSM-MAGICS-APR26

By Ishanu Chattopadhyay

DARPA-EA-25-02-05-MAGICS-PA-025 PI/PM Meeting
