Large Science Models:

Foundation Models for
Generalizable Insights Into Complex Systems

with Psycho-social Application

PI: Ishanu Chattopadhyay, PhD

Assistant Professor of Biomedical Informatics & Computer Science

University of Kentucky

DARPA-EA-25-02-05-MAGICS-PA-025

HR0011-26-3-E016

Mar 2026

Agenda

Progress description
- World Values Survey (>90K participants with global coverage; data acquired, LSM constructed)
LLM / LSM comparison results
DTAG results
Reflexivity simulation results
Updated paper draft (attached)
Effort re-phasing discussion

World Values Survey (Wave 7)

n = 93,497

Coverage:

Missing Africa, Oceania

LSM:

4,291,876 learned parameters

LSM Digital Twin Performance over LLM (GPT5.4)

LLM somewhat competitive when tracking the most frequent behavior

Higher is better

baseline: assumes item independence

LSM Digital Twin Performance over LLM (GPT5.4)

LSM substantially better as a "Digital Twin", for replicating all behaviors

Lower is better

baseline: assumes item independence
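A minimal sketch of what an item-independence baseline could look like, assuming survey responses are stored as dictionaries mapping item names to categorical answers. The slides do not define the two metrics, so the modal-answer prediction and the total variation distance below are illustrative assumptions, as are all function names and the toy data.

```python
from collections import Counter


def fit_marginals(rows):
    """Estimate one marginal distribution per item, ignoring all other items."""
    counts = {}
    for row in rows:
        for item, answer in row.items():
            counts.setdefault(item, Counter())[answer] += 1
    marginals = {}
    for item, c in counts.items():
        total = sum(c.values())
        marginals[item] = {a: n / total for a, n in c.items()}
    return marginals


def modal_answer(marginals, item):
    """Independence baseline for the most frequent behavior: always predict the mode."""
    dist = marginals[item]
    return max(dist, key=dist.get)


def total_variation(p, q):
    """Total variation distance between two categorical distributions (assumed metric)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Hypothetical toy responses; real survey items and codes would differ.
toy_rows = [
    {"q1": "a", "q2": "x"},
    {"q1": "a", "q2": "y"},
    {"q1": "b", "q2": "y"},
    {"q1": "a", "q2": "y"},
]
marginals = fit_marginals(toy_rows)
print(modal_answer(marginals, "q1"))
print(total_variation(marginals["q1"], {"a": 0.5, "b": 0.5}))
```

Because the baseline never conditions on a respondent's other answers, any gain a model shows over it reflects dependence structure between items.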

DTAG: Global Digital Twin of Opinions

Diagram labels: place, ethnicity, gender, time; LLM, LSM. Steps: a. query, b. digitization, c. LSM response, d. virtual opinion.

DTAG: Digital Twin Anchored Generation v0.0.1

DTAG: Global Digital Twin of Opinions

python3 ./pipeline6.py --qnet ../survey/models/gss/gss_2022female.pkl.gz --map maps/map2022.csv --persona "22 year old white female without children  in urban New York, regular news consumer, working in retail, highly progressive" --openai_model gpt-4.1 --polar assets/polar_vectors.csv --auto assets/increase_set_1_border_crime.csv

python3 ./pipeline6.py --qnet ../survey/models/gss/gss_2022male.pkl.gz --map maps/map2022.csv --persona "45 year old white male with children  in rural Alabama, regular news consumer, working in farming, veteran, conservative"  --openai_model gpt-4.1  --polar assets/polar_vectors.csv --auto assets/increase_set_1_border_crime.csv

python3 pipeline5iloc.py --qnet ../survey/models/wvs/LSM10K.gz --map maps/wvs7_variable_question_map.csv --persona "urban, regular news consumer, small business owner"  --openai_model gpt-5.4-mini  --assign_prefilter 500 --year 2023 --country China

python3 pipeline5iloc.py --qnet ../survey/models/wvs/LSM10K.gz --map maps/wvs7_variable_question_map.csv --persona "urban, regular news consumer, small business owner"  --openai_model gpt-5.4-mini  --assign_prefilter 500 --year 2023 --country "Middle East"

DTAG: Global Digital Twin of Opinions

python3 ./pipeline6.py --qnet ../survey/models/gss/gss_2022female.pkl.gz --map maps/map2022.csv --persona "22 year old white female without children  in urban New York, regular news consumer, working in retail, highly progressive" --openai_model gpt-4.1 --polar assets/polar_vectors.csv --auto assets/increase_set_1_border_crime.csv

python3 ./pipeline6.py --qnet ../survey/models/gss/gss_2022male.pkl.gz --map maps/map2022.csv --persona "45 year old white male with children  in rural Alabama, regular news consumer, working in farming, veteran, conservative"  --openai_model gpt-4.1  --polar assets/polar_vectors.csv --auto assets/increase_set_1_border_crime.csv

Recall Bail et al.

“Exposure to opposing views on social media can increase political polarization” by Christopher A. Bail et al., published in PNAS in September 2018 (Vol. 115, No. 37, pp. 9216–9221; DOI: 10.1073/pnas.1804840115)

Perturbing with opposing views made conservatives more conservative (statistically significant) and liberals more liberal (not statistically significant).

DTAG: Global Digital Twin of Opinions

positive: conservative, negative: liberal

Digital twin for 2022 GSS

Cost & Schedule

Estimated costs	USD
Labor cost	157,227.86
Other direct costs	9,993.00
Total (direct+indirects for 12 months)	257,520.12

Validation Plan Outline

Gantt Chart*

*Milestone definitions in next slide

Dataset Acquisition (10 survey datasets)

LSM inference

LSM predictive ability validation

LSM model drift sensing validation

LSM data sufficiency tracking validation

LSM mediated social theory analysis

Milestones

Milestone #	Milestone Title / Detailed Description	Exit Criteria / Deliverable	Due Date (nlt)
1	Kickoff Meeting: A briefing on the technical plan for the effort, to include the milestone schedule and the path to accomplish the objectives of the agreement.	Government acceptance / Kickoff meeting briefing slides	Month 1 after award start
2	Validation plan: Detailed validation plan, including description, acquisition plan, and justification for the ground truth data, and description of the metrics and benchmarks to be used to measure performance.	Government acceptance / Technical report as described.	Month 1
3	Milestone Title: Dataset Acquisition and LSM Inference. Technical goal: a) Dataset acquisition (10 social survey datasets acquired: GSS, ANES, CES, Eurobarometer, etc.) b) Infer LSM models for each dataset using 50% random samples, with multiple LSMs trained on different random splits of each dataset.	Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success	Month 2
4	Milestone Title: Masked sample reconstruction. Technical goal: Validate LSM predictive accuracy via censored-sample reconstruction on out-of-sample data from each dataset. Demonstrate a statistically significant reduction of LSM distance post-reconstruction relative to post-masking. Target: at least 50% improvement in the reconstruction error metric over 1) random imputation and 2) median imputation.	Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success	Month 4
5	Milestone Title: Model drift sensing validation. Technical goal: Demonstrate that the LSM framework can reliably sense when the underlying model drifts. Assess whether the model drift statistic is stationary for samples drawn from the same survey wave of our datasets, and reliably indicates non-stationary drift for samples from different survey waves. Target: the model drift statistic must be statistically significant at the 5% level for survey waves 5 years apart for at least GSS, CES, and Eurobarometer. Deliverables are detailed documentation on all 10 datasets.	Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success	Month 6
6	Milestone Title: Data sufficiency assessment capability. Technical goal: Use the conservation of complexity principle to show that the LSM framework can sense data deficiency and sufficiency.	Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success. Analysis results on all 10 datasets	Month 8
7	Milestone Title: Social Theory and Competing Hypotheses Adjudication. Technical goal: a) Social Theory Hypothesis Assessment: Polarization is an inevitable attractor b) Investigate the competing hypotheses of whether socio-economic identity or belief proximity and latent opinion space geometry is more predictive of specific opinion / belief outcomes.	Government acceptance / Technical report detailing figure/code/data/etc. and all underlying materials generated in support of milestone, regardless of success	Month 10
8	Final milestone meeting and report (one month prior to award end date): The final briefing and final report should summarize all work completed on the project, highlighting accomplishments, lessons learned, unexpected outcomes, and challenges requiring further research. Technical artifact delivery (software release, evaluation results, source code, models, etc.)	Government acceptance / Technical report as described. For software: GitHub repository with deployable code complete with example notebooks	Month 11

\(\checkmark\)
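The Milestone 4 target (at least 50% improvement over median imputation) can be illustrated with a small sketch. It assumes responses are ordinal integer codes and uses mean absolute error over the hidden items as the reconstruction metric; the milestone text does not fix the metric, so this choice, the function names, the toy data, and the stand-in "model fill" are all hypothetical.

```python
import statistics


def median_imputer(train_rows):
    """Per-item median of ordinal codes in the training split (median baseline)."""
    items = list(train_rows[0].keys())
    return {i: statistics.median_low([r[i] for r in train_rows]) for i in items}


def reconstruction_error(truth, filled, hidden):
    """Mean absolute error over the hidden items only (assumed metric)."""
    return sum(abs(truth[i] - filled[i]) for i in hidden) / len(hidden)


def meets_target(model_err, baseline_err, target=0.5):
    """True if the model's error is at least `target` (50%) lower than the baseline's."""
    if baseline_err == 0:
        return False  # no room for improvement over a perfect baseline
    return (baseline_err - model_err) / baseline_err >= target


# Hypothetical training split and one held-out respondent with both items masked.
train = [{"q1": 1, "q2": 4}, {"q1": 2, "q2": 4}, {"q1": 5, "q2": 1}]
truth = {"q1": 5, "q2": 1}
hidden = ["q1", "q2"]

median_fill = median_imputer(train)
model_fill = {"q1": 4, "q2": 1}  # stand-in for a model's reconstruction

baseline_err = reconstruction_error(truth, median_fill, hidden)
model_err = reconstruction_error(truth, model_fill, hidden)
print(baseline_err, model_err, meets_target(model_err, baseline_err))
```

The same comparison would be repeated against random imputation, and in practice averaged over many held-out respondents and mask patterns before testing for significance.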

Next Meeting

Progress on survey models (aiming to complete >50% of LSM inference)
Masked reconstruction results
Theory on synthetic data performance and sample complexity

LSM-MAGICS

DARPA-EA-25-02-05-MAGICS-PA-025 PI/PM Meeting
