Principles of Sim-and-Real Cotraining For Robot Manipulation

Apr 10, 2025

Adam Wei

Joint work with Abhinav Agarwal, Boyuan Chen, Rohan Bosworth, Nicholas Pfaff, Russ Tedrake

Imitation Learning In Robotics

Figure AI

Toyota Research Institute

Imitation Learning In Robotics

\(\pi_0\)

... but where are the robots in the real world?

Gemini Robotics

Credit: Kevin Black (2024)

Robot Data Diet

Big data

Big transfer gap

Small data

No transfer gap

Ego-Exo

robot teleop

Open-X

simulation

How can we obtain data for imitation learning?

(e.g., sim & real)

Synthetic Data Generation

"Scaling Up, Distilling Down"

"OPTIMUS"

Data Generation

Data Augmentation

Sim Infrastructure

"RoboCasa"

"Scalable Real2Sim..."

"Physics-Driven Data Generation..."

Sim-and-Real Cotraining

Simulated data has a lot to offer...

... we should understand how to use it.

Sim-and-Real Cotraining

\textcolor{red}{\mathcal{D}_R \sim p_R(O,A)}

\textcolor{blue}{\mathcal{D}_S \sim p_S(O,A)}

Cotraining: Use both datasets to train a model that maximizes some real-world performance objective

Experimental Setup

Objective:

Success rate on planar pushing from pixels

Focusing on a single task enables controlled experiments and thorough analysis
- more on this later...

Datasets:

Real Data

Sim Data

Model:

Diffusion Policy [2]

\mathcal L_{\mathcal D^\alpha} = \alpha \textcolor{red}{\mathcal L_{\mathcal D_R}} + (1-\alpha) \textcolor{blue}{\mathcal L_{\mathcal D_S}}

[1] Graesdal et al., "Towards Tight Convex Relaxations for Contact-Rich Manipulation"

[2] Chi et al., "Diffusion Policy: Visuomotor Policy Learning via Action Diffusion"

Research Questions

1. Does sim-and-real cotraining improve performance?

2. How does performance scale with data? What is the optimal mixing ratio?

3. What qualities matter for synthetic data?

4. What are some underlying mechanisms in cotraining?

Does Cotraining Improve Performance?

Real only (50 real demos): success rate 10/20

Cotrained (50 real demos + 2000 sim demos): success rate 18/20

1.8x improvement!

Does Cotraining Improve Performance?

Real only (10 real demos): success rate 2/20

Cotrained (10 real demos + 2000 sim demos): success rate 14/20

7x improvement!

Real World Cotraining Results

Cotraining improves policy performance by roughly 2-7x
Scaling sim data improves performance and reduces sensitivity to \(\alpha\)
What happens when we continue scaling sim data?

\mathcal L_{\mathcal D^\alpha} = \alpha \textcolor{red}{\mathcal L_{\mathcal D_R}} + (1-\alpha) \textcolor{blue}{\mathcal L_{\mathcal D_S}}
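The mixed objective above can be sketched in a few lines. This is a minimal illustration, not the training code behind these results: the function names `cotraining_loss` and `sample_sources` are hypothetical, and the per-dataset losses are assumed to be scalars that have already been computed. The second helper shows one common way to realize the same weighting in expectation, by drawing each batch element from the real dataset with probability \(\alpha\); whether the talk weights losses or samples batches this way is an assumption.

```python
import numpy as np


def cotraining_loss(loss_real: float, loss_sim: float, alpha: float) -> float:
    """Weighted objective: alpha * L_real + (1 - alpha) * L_sim."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * loss_real + (1.0 - alpha) * loss_sim


def sample_sources(batch_size: int, alpha: float, rng: np.random.Generator) -> np.ndarray:
    """Pick a data source per batch element: True = real, False = sim.

    Each element is drawn from the real dataset with probability alpha,
    so the expected batch loss matches the weighted objective above.
    """
    return rng.random(batch_size) < alpha


rng = np.random.default_rng(0)
mask = sample_sources(batch_size=64, alpha=0.25, rng=rng)
mixed = cotraining_loss(loss_real=2.0, loss_sim=4.0, alpha=0.25)
```

With \(\alpha = 1\) the objective reduces to real-only training, and with \(\alpha = 0\) to sim-only training.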

Research Questions

1. Does sim-and-real cotraining improve performance?

2. How does performance scale with data? How does \(\alpha\) affect performance?

3. What qualities matter for synthetic data?

4. What are some underlying mechanisms in cotraining?

Bottleneck: Real World Eval

Real-world eval is time-consuming and high-variance:

20 trials per policy (½-1 hr) \(\implies\) large error bars

Investigating scaling with real-world eval is time-consuming & expensive

... especially for a PhD student
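To see why 20 trials give large error bars, one can compute a confidence interval on a binomial success rate. The sketch below uses the Wilson score interval at roughly 95% confidence (z = 1.96). Both the choice of interval and the function name `wilson_interval` are assumptions made for illustration; the talk does not say how its error bars were computed.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1.0 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(
        p_hat * (1.0 - p_hat) / trials + z**2 / (4 * trials**2)
    ) / denom
    return center - half_width, center + half_width


# Success counts quoted earlier in the deck, each out of 20 trials.
for k in (10, 18):
    low, high = wilson_interval(k, 20)
    print(f"{k}/20 successes: [{low:.2f}, {high:.2f}]")
```

For 10/20 successes the interval spans roughly 0.30 to 0.70, which is wide enough to make fine-grained scaling comparisons difficult.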

Sim-Sim Cotraining

sim2real \(\approx\) sim2target

Design a target sim to emulate the sim2real gap:

physics gap
visual gap
action gap

Ex: Visual gap

Sim Demo

Target Sim Demo

Sim-Sim Cotraining

Key idea: Analyze data scaling for cotraining entirely in sim

Automated, controlled, & high-confidence evaluations
Explicit control over the sim2target gap

Scaling Sim Data

Performance gains from scaling sim data plateau; additional real data raises the performance ceiling
Sim is valuable! But cannot fully replace real data

Research Questions

1. Does sim-and-real cotraining improve performance?

2. How does performance scale with data? How does \(\alpha\) affect performance?

3. What qualities matter for synthetic data?

4. What are some underlying mechanisms in cotraining?

Distribution Shift Experiments

How do different sim2real gaps affect cotraining?

Should I be investing in my physics engine or my renderer?

Example: Analyzing color shift

[Figure: increasing color shift relative to the target color]

Experiment: Vary color shift and analyze the downstream policies
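One simple way to parameterize a sweep like this is to interpolate the object's color between the target color and a maximally shifted color. The sketch below is purely illustrative: the helper `shifted_color`, the linear RGB interpolation, and the example colors are all assumptions, not the parameterization used in these experiments.

```python
import numpy as np


def shifted_color(target_rgb, far_rgb, level: float) -> np.ndarray:
    """Interpolate linearly from the target color (level=0) to a far color (level=1)."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must lie in [0, 1]")
    target = np.asarray(target_rgb, dtype=float)
    far = np.asarray(far_rgb, dtype=float)
    return (1.0 - level) * target + level * far


# Hypothetical sweep: five shift levels between two made-up colors.
levels = np.linspace(0.0, 1.0, 5)
colors = [shifted_color((255, 0, 0), (0, 0, 255), t) for t in levels]
```

Each shift level would then define one simulated dataset, and the policies cotrained on those datasets can be compared in the target environment.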

Distribution Shift Experiments

We investigate 6 sim2real gaps:

Visual shifts: color shift, color randomization, camera shift
Physical shifts: center of mass shift
Task shifts: goal shifts, object shifts

Key Findings

All shifts reduce performance; physics and task shifts have the largest impact
Paradoxically, some visual shift is required for good performance!

(for planar pushing...)

Research Questions

1. Does sim-and-real cotraining improve performance?

2. How does performance scale with data? How does \(\alpha\) affect performance?

3. What qualities matter for synthetic data?

4. What are some underlying mechanisms in cotraining?

SDE Interpretation

Sim-and-Real Discernibility

Real-World Demo: Fix orientation, then translation. Sticking & sliding contacts.

Cotrained Policy (50 real, 2000 sim): Similar to real-world demo

Simulated Demo: Fix orientation and translation simultaneously. Sticking contacts only.

\(\implies\) policies can distinguish sim & real

\(\implies\) policies behave differently in sim & real

High-performing policies must learn to identify sim vs real, since the physics of each environment requires different actions

Positive Transfer: Data Coverage

Real data: high-level strategies and low-level control

Sim data: fills gaps in the real data, prevents overfitting, and provides stronger action priors

Positive Transfer: Power Law

\(\implies\) One sim demo is worth ~0.49-0.83 real demos

Scaling sim data reduces test loss & MSE on real data!
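A power-law relationship between dataset size and loss is typically estimated by fitting a straight line in log-log space. The sketch below shows that procedure. The exact functional form fitted in the talk is not shown on this slide, so a single-variable law, loss ≈ a · n^(-b), is assumed here. The data are synthetic and noise-free, generated only to check that the fit recovers known parameters; they are not results from this work. The name `fit_power_law` is hypothetical.

```python
import numpy as np


def fit_power_law(n: np.ndarray, loss: np.ndarray) -> tuple[float, float]:
    """Fit loss ≈ a * n**(-b) by least squares on log(loss) vs. log(n)."""
    slope, intercept = np.polyfit(np.log(n), np.log(loss), deg=1)
    return float(np.exp(intercept)), float(-slope)


# Synthetic data for illustration only (a = 3.0, b = 0.4 chosen arbitrarily).
n = np.array([100, 200, 500, 1000, 2000], dtype=float)
loss = 3.0 * n ** -0.4

a, b = fit_power_law(n, loss)
```

Given fitted laws for real and sim data, an exchange rate such as "one sim demo is worth x real demos" can be read off by asking how many real demos achieve the same loss as a given number of sim demos.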

Are results limited to single-task?

Zhenyu Jiang

Concurrent & independent work shows similar results!
Strong signal that our analysis extends to the multi-task setting

Scalable Real2Sim

How can we scalably obtain simulation assets?

Nicholas Pfaff

Physics-Driven Data Generation

Lujie Yang

How can we generate contact-rich & cross-embodied robot data?

Future Directions

I believe that simulation will play a major role in the future of robot foundation models

New algorithms inspired by our analysis
Cotraining from non-robot data
Understanding cotraining...

Thank You!

Nicholas Pfaff: Scalable Real2Sim

Lujie Yang: Physics-Driven Data Generation