Principles of Sim-and-Real Cotraining For Robot Manipulation

Apr 10, 2025

Adam Wei

Joint work with Abhinav Agarwal, Boyuan Chen, Rohan Bosworth, Nicholas Pfaff, Russ Tedrake

Imitation Learning In Robotics

Figure AI

Toyota Research Institute

\(\pi_0\)

... but where are the robots in the real world?

Gemini Robotics

Credit: Kevin Black (2024)

Robot Data Diet

Trade-off: big data with a big transfer gap vs. small data with no transfer gap

Data sources: Ego-Exo, Open-X, simulation, robot teleop

How can we obtain data for imitation learning?

(ex. sim & real)

Synthetic Data Generation

"Scaling Up, Distilling Down"

"OPTIMUS"

Data Generation

Data Augmentation

Sim Infrastructure

"RoboCasa"

"Scalable Real2Sim..."

"Physics-Driven Data Generation..."

Sim-and-Real Cotraining

Simulated data has a lot to offer...

... we should understand how to use it.

Sim-and-Real Cotraining

\textcolor{red}{\mathcal{D}_{R}\sim p_{R}(O,A)}
\textcolor{blue}{\mathcal{D}_{S}\sim p_{S}(O,A)}

Cotraining: Use both datasets to train a model that maximizes some real-world performance objective

Experimental Setup

Objective:

Success rate on planar pushing from pixels

  • Focusing on single task enables controlled experiments and thorough analysis
    • more on this later...

Datasets:

Model:

Diffusion Policy [2]

\mathcal L_{\mathcal D^\alpha} = \alpha \textcolor{red}{\mathcal L_{\mathcal D_R}} + (1-\alpha) \textcolor{blue}{\mathcal L_{\mathcal D_S}}
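
As a rough illustration of the mixing loss above, here is a minimal Python sketch. The function names (`cotraining_loss`, `sample_cotraining_batch`) are hypothetical, and implementing the weighting by sampling each training example from the real dataset with probability \(\alpha\) is an assumption, not necessarily how the experiments in this talk were run. If each dataset loss is a mean over that dataset, the expected per-example loss under this sampling matches the weighted sum.

```python
import random


def cotraining_loss(loss_real: float, loss_sim: float, alpha: float) -> float:
    """Weighted loss: alpha * L_real + (1 - alpha) * L_sim."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * loss_real + (1.0 - alpha) * loss_sim


def sample_cotraining_batch(real_data, sim_data, alpha, batch_size, rng):
    """Assumed sampling scheme: draw each example from the real dataset
    with probability alpha, otherwise from the sim dataset."""
    batch = []
    for _ in range(batch_size):
        source = real_data if rng.random() < alpha else sim_data
        batch.append(rng.choice(source))
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    # Placeholder datasets: 50 real and 2000 sim demo identifiers.
    real = [("real", i) for i in range(50)]
    sim = [("sim", i) for i in range(2000)]
    batch = sample_cotraining_batch(real, sim, alpha=0.5, batch_size=8, rng=rng)
    print(batch)
    print(cotraining_loss(loss_real=1.0, loss_sim=3.0, alpha=0.25))
```

With \(\alpha = 1\) this reduces to real-only training, and with \(\alpha = 0\) to sim-only training.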

[1] Graesdal et al., "Towards Tight Convex Relaxations for Contact-Rich Manipulation"

[2] Chi et al., "Diffusion Policy: Visuomotor Policy Learning via Action Diffusion"

Real Data

Sim Data

Research Questions

1. Does sim-and-real cotraining improve performance?

2. How does performance scale with data? What is the optimal mixing ratio?

3. What qualities matter for synthetic data?

4. What are some underlying mechanisms in cotraining?

Does Cotraining Improve Performance?

Real only (50 real demos): success rate 10/20

Cotrained (50 real demos + 2000 sim demos): success rate 18/20

1.8x improvement!

Does Cotraining Improve Performance?

Real only (10 real demos): success rate 2/20

Cotrained (10 real demos + 2000 sim demos): success rate 14/20

7x improvement!

Real World Cotraining Results

  • Cotraining improves real-world success rate by roughly 2-7x (1.8x with 50 real demos, 7x with 10 real demos)
  • Scaling sim data improves performance and reduces sensitivity to \(\alpha\)
  • What happens when we continue scaling sim data?
\mathcal L_{\mathcal D^\alpha} = \alpha \textcolor{red}{\mathcal L_{\mathcal D_R}} + (1-\alpha) \textcolor{blue}{\mathcal L_{\mathcal D_S}}

Research Questions

1. Does sim-and-real cotraining improve performance?

2. How does performance scale with data? How does \(\alpha\) affect performance?

3. What qualities matter for synthetic data?

4. What are some underlying mechanisms in cotraining?

Bottleneck: Real World Eval

Real-world eval is time-consuming and high variance:

  • 20 trials per policy (½-1 hr) \(\implies\) large error bars

Investigating scaling with real-world eval is time-consuming & expensive

  • ... especially for a PhD student
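
To make the "large error bars" point concrete, the sketch below computes a 95% Wilson score interval for a success rate estimated from 20 trials. The choice of the Wilson interval and the function name `wilson_interval` are assumptions made for illustration; the talk does not say how its error bars were computed.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p_hat = successes / trials
    denom = 1.0 + z * z / trials
    center = (p_hat + z * z / (2.0 * trials)) / denom
    half_width = (
        z
        * math.sqrt(p_hat * (1.0 - p_hat) / trials + z * z / (4.0 * trials * trials))
        / denom
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Success counts taken from the 50-real-demo comparison in this talk.
    for successes in (10, 18):
        low, high = wilson_interval(successes, 20)
        print(f"{successes}/20: [{low:.2f}, {high:.2f}]")
```

For 10/20 the interval spans roughly 0.30 to 0.70, so a single 20-trial evaluation pins down the success rate only coarsely.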

Sim-Sim Cotraining

sim2real \(\approx\) sim2target

Design a target sim that emulates the sim2real gap:

  • physics gap
  • visual gap
  • action gap

Sim-Sim Cotraining

Key idea: Analyze data scaling for cotraining entirely in sim

  1. Automated, controlled, & high-confidence evaluations
  2. Explicit control over the sim2target gap

Scaling Sim Data

  • Performance gains from scaling sim data plateau; additional real data raises the performance ceiling
  • Sim is valuable, but it cannot fully replace real data

Research Questions

1. Does sim-and-real cotraining improve performance?

2. How does performance scale with data? How does \(\alpha\) affect performance?

3. What qualities matter for synthetic data?

4. What are some underlying mechanisms in cotraining?

Distribution Shift Experiments

Should I be investing in my physics engine or my renderer?

Increasing color shift

Target Color

How do different sim2real gaps affect cotraining?

Example: Analyzing color shift

Experiment: Vary color shift and analyze the downstream policies
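
One way to set up such a sweep is sketched below: the object color in sim is moved away from the target color by a scalar shift level. Everything here is hypothetical (the RGB values, the linear parameterization, and the names `shifted_color` and `color_shift_sweep`); the talk does not specify how color shift was parameterized.

```python
def shifted_color(target_rgb, offset_rgb, shift_level):
    """Move the target color by shift_level * offset, clipped to [0, 255]."""
    return tuple(
        max(0, min(255, round(t + shift_level * o)))
        for t, o in zip(target_rgb, offset_rgb)
    )


def color_shift_sweep(target_rgb, offset_rgb, levels):
    """Map each shift level to the sim object color used at that level."""
    return {level: shifted_color(target_rgb, offset_rgb, level) for level in levels}


if __name__ == "__main__":
    target = (100, 150, 200)  # placeholder target-environment color
    offset = (50, -50, 100)   # placeholder direction of the color shift
    for level, rgb in color_shift_sweep(target, offset, [0.0, 0.5, 1.0]).items():
        # In a full experiment, each level would get its own sim dataset,
        # cotrained policy, and evaluation in the target environment.
        print(level, rgb)
```

A shift level of 0 reproduces the target color exactly; larger levels correspond to a larger visual gap.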

Distribution Shift Experiments

We investigate 6 sim2real gaps:

  • Visual shifts: color shift, color randomization, camera shift
  • Physical shifts: center of mass shift
  • Task shifts: goal shifts, object shifts

Key Findings

  • All shifts reduce performance; physics and task shifts are the most impactful
  • Paradoxically, some visual shift is required for good performance!

(for planar pushing...)

Research Questions

1. Does sim-and-real cotraining improve performance?

2. How does performance scale with data? How does \(\alpha\) affect performance?

3. What qualities matter for synthetic data?

4. What are some underlying mechanisms in cotraining?

SDE Interpretation

Sim-and-Real Discernibility

Real-World Demo

  • Fix orientation, then translation
  • Sticking & sliding contacts

Cotrained Policy (50 real, 2000 sim)

  • Similar to real-world demo

Simulated Demo

  • Fix orientation and translation simultaneously
  • Sticking contacts only

\(\implies\) policies can distinguish sim & real

\(\implies\) policies behave differently in sim & real

High-performing policies must learn to identify sim vs. real, since the physics of each environment requires different actions

Positive Transfer: Data Coverage

Real data: high-level strategies and low-level control

Sim data: fills gaps in the real data, prevents overfitting, and provides stronger action priors

Positive Transfer: Power Law

\(\implies\) A sim demo is worth ~0.49-0.83 real demos

Scaling sim data reduces test loss & MSE on real data!
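
For readers who want to try a power-law fit on their own scaling curves, here is a minimal sketch that fits \(y = a x^{-b}\) by linear regression in log-log space. The functional form, the name `fit_power_law`, and the numbers are assumptions: the data below is synthetic, generated from an exact power law, and is not taken from the experiments in this talk.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = a * x**(-b) via a line in log-log space.

    Returns (a, b). Requires at least two distinct positive xs and positive ys.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    slope = sxy / sxx
    a = math.exp(mean_y - slope * mean_x)
    return a, -slope


if __name__ == "__main__":
    # Synthetic placeholder curve: loss = 2.0 * n_demos ** -0.5.
    n_demos = [250, 500, 1000, 2000]
    losses = [2.0 * n ** -0.5 for n in n_demos]
    a, b = fit_power_law(n_demos, losses)
    print(f"a = {a:.3f}, b = {b:.3f}")
```

On real measurements the fit will not be exact, and the log-log regression weights relative rather than absolute errors.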

Are results limited to single-task?

Zhenyu Jiang

  • Concurrent & independent work shows similar results!
  • Strong signal that our analysis extends to the multi-task setting

Scalable Real2Sim

How can we scalably obtain simulation assets?

Nicholas Pfaff

Physics-Driven Data Generation

Lujie Yang

How can we generate contact-rich & cross-embodied robot data?

Future Directions

I believe that simulation will play a major role in the future of robot foundation models

  1. New algorithms inspired by our analysis
  2. Cotraining from non-robot data
  3. Understanding cotraining...

Thank You!

Nicholas Pfaff: Scalable Real2Sim

Lujie Yang: Physics-Driven Data Generation

Amazon CoRo Symposium 2025

By weiadam

Slides for my talk at the Amazon CoRo Symposium 2025. For more details, please see the paper: https://arxiv.org/abs/2503.22634
