Apr 10, 2025
Adam Wei
Joint work with Abhinav Agarwal, Boyuan Chen, Rohan Bosworth, Nicholas Pfaff, Russ Tedrake
Figure AI
Toyota Research Institute
\(\pi_0\)
... but where are the robots in the real world?
Gemini Robotics
Credit: Kevin Black (2024)
[Figure: spectrum of data sources, from "Big data, big transfer gap" to "Small data, no transfer gap". Labels: Ego-Exo, robot teleop, Open-X, simulation]
How can we obtain data for imitation learning?
(e.g., sim & real)
"Scaling Up, Distilling Down"
"OPTIMUS"
Data Generation
Data Augmentation
Sim Infrastructure
"RoboCasa"
"Scalable Real2Sim..."
"Physics-Driven Data Generation..."
Simulated data has a lot to offer...
... we should understand how to use it.
Cotraining: Use both datasets to train a model that maximizes some real-world performance objective
Objective:
Success rate on planar pushing from pixels
Datasets: Real Data, Sim Data
Model: Diffusion Policy [2]
[1] Graesdal et al., "Towards Tight Convex Relaxations for Contact-Rich Manipulation"
[2] Chi et al., "Diffusion Policy: Visuomotor Policy Learning via Action Diffusion"
1. Does sim-and-real cotraining improve performance?
2. How does performance scale with data? What is the optimal mixing ratio?
3. What qualities matter for synthetic data?
4. What are some underlying mechanisms in cotraining?
Real only (50 real demos): success rate 10/20
Cotrained (50 real demos + 2000 sim demos): success rate 18/20
1.8x improvement!
Real only (10 real demos): success rate 2/20
Cotrained (10 real demos + 2000 sim demos): success rate 14/20
7x improvement!
1. Does sim-and-real cotraining improve performance?
2. How does performance scale with data? How does \(\alpha\) affect performance?
3. What qualities matter for synthetic data?
4. What are some underlying mechanisms in cotraining?
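A minimal sketch of how a mixing ratio could enter cotraining, assuming \(\alpha\) is the probability that a training sample is drawn from the real dataset (the slides do not spell out the definition). The function names `sample_cotraining_batch` and `natural_mixing_ratio` are hypothetical and are not taken from the talk.

```python
import random


def sample_cotraining_batch(real_demos, sim_demos, alpha, batch_size, rng=None):
    """Draw a batch where each item comes from real data with probability alpha.

    Assumption: alpha weights the real dataset and (1 - alpha) the sim dataset.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if rng is None:
        rng = random.Random()
    batch = []
    for _ in range(batch_size):
        # Pick the source dataset first, then a demo uniformly within it.
        source = real_demos if rng.random() < alpha else sim_demos
        batch.append(rng.choice(source))
    return batch


def natural_mixing_ratio(num_real, num_sim):
    """Ratio obtained by simply concatenating the two datasets."""
    return num_real / (num_real + num_sim)


if __name__ == "__main__":
    real = [("real", i) for i in range(50)]
    sim = [("sim", i) for i in range(2000)]
    print("natural ratio:", natural_mixing_ratio(len(real), len(sim)))
    batch = sample_cotraining_batch(real, sim, alpha=0.5, batch_size=8)
    print([source for source, _ in batch])
```

With 50 real and 2000 sim demos, plain concatenation gives real data a weight of only 50/2050, so treating \(\alpha\) as a separate knob lets the real data be upweighted or downweighted independently of dataset size.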
Real-world eval is time-consuming, expensive, and high-variance, so investigating data scaling with it alone is costly
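To make the variance concern concrete, here is a rough sketch that puts a 95% Wilson score interval around success rates measured over 20 trials, using the counts reported earlier in the deck. The Wilson interval is an illustrative choice made here, not necessarily the analysis used in this work.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial success rate (z=1.96 ~ 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Success counts out of 20 trials, as reported on the earlier slides.
    results = [
        ("50 real", 10),
        ("50 real + 2000 sim", 18),
        ("10 real", 2),
        ("10 real + 2000 sim", 14),
    ]
    for label, successes in results:
        low, high = wilson_interval(successes, 20)
        print(f"{label}: {successes}/20, 95% interval [{low:.2f}, {high:.2f}]")
```

Even at 20 rollouts per policy the intervals are wide, which is one way to see why sweeping many dataset sizes and mixing ratios on hardware is expensive.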
sim2real \(\approx\) sim2target
Design a target sim to emulate the real-world setup
Sim Demo
Target Sim Demo
Key idea: Analyze data scaling for cotraining entirely in sim
Should I be investing in my physics engine or my renderer?
Increasing color shift
Target Color
How do different sim2real gaps affect cotraining?
Example: Analyzing color shift
Experiment: Vary color shift and analyze the downstream policies
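One way such a sweep could be parameterized, as a sketch only: linearly interpolate an object's RGB color between the target color and a fully shifted color. The interpolation scheme, the specific colors, and the shift levels below are all hypothetical and are not taken from the experiments.

```python
def shift_color(target_rgb, shifted_rgb, level):
    """Interpolate from target_rgb (level=0) to shifted_rgb (level=1)."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must lie in [0, 1]")
    return tuple(
        round((1 - level) * t + level * s) for t, s in zip(target_rgb, shifted_rgb)
    )


# Hypothetical colors and levels, chosen only for illustration.
TARGET_COLOR = (200, 60, 60)
SHIFTED_COLOR = (60, 60, 200)
SHIFT_LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]

if __name__ == "__main__":
    for level in SHIFT_LEVELS:
        # Each level would correspond to one sim dataset and one cotrained policy.
        print(level, shift_color(TARGET_COLOR, SHIFTED_COLOR, level))
```

Each shift level would define one simulated dataset, and the policies cotrained on them could then be compared on the same target evaluation.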
We investigate 6 sim2real gaps:
Key Findings
(for planar pushing...)
Real-World Demo
Cotrained Policy (50 real, 2000 sim)
Simulated Demo
\(\implies\) policies can distinguish sim & real
\(\implies\) policies behave differently in sim & real
High-performing policies must learn to identify sim vs real
since the physics of each environment requires different actions
Real data: high-level strategies and low-level control
Sim data: fills in gaps in the real data, prevents overfitting, provides stronger action priors
\(\implies\)
Sim demo worth ~0.49-0.83 real demos
Scaling sim reduces test loss & MSE in real!
Zhenyu Jiang
How can we scalably obtain simulation assets?
Nicholas Pfaff
Lujie Yang
How can we generate contact-rich & cross-embodied robot data?
I believe that simulation will play a major role in the future of robot foundation models
Nicholas Pfaff: Scalable Real2Sim
Lujie Yang: Physics-Driven Data Generation