Adam Wei
Planar Pushing
Preparing Toast
Flipping Pancakes
Flipping Pages
Peeling Vegetables
Unpacking Boxes
Behavior Cloning Algorithm
Policy
Ex. Diffusion Policy
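Behavior cloning fits a policy \(\pi_\theta(a \mid o)\) to demonstration pairs \((o_i, a_i) \in \mathcal{D}\) by supervised learning. A minimal sketch with a linear policy and least squares (a diffusion policy swaps in a denoising objective; all names here are illustrative, not from the talk):

```python
import numpy as np

# Toy demonstration dataset D = {(o_i, a_i)}: observations -> expert actions.
rng = np.random.default_rng(0)
W_true = np.array([[1.0, -0.5], [0.3, 2.0]])   # unknown expert mapping
obs = rng.normal(size=(100, 2))                 # observations o_i
acts = obs @ W_true.T                           # expert actions a_i

# Behavior cloning: fit a linear policy a = W o by least squares,
# minimizing sum_i ||W o_i - a_i||^2 over the dataset.
X, *_ = np.linalg.lstsq(obs, acts, rcond=None)  # solves obs @ X ~ acts
W = X.T

def policy(o):
    """Cloned policy: maps an observation to an action."""
    return W @ o

print(np.allclose(W, W_true))  # True: noiseless demos recover the expert
```

With noisy or multimodal demonstrations the least-squares fit averages over modes, which is one motivation for generative policy classes like diffusion.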
How can we obtain the dataset \(\mathcal{D}\)?
Big data
Big transfer gap
Small data
No transfer gap
Ego-Exo
robot teleop
Open-X
simulation
How can we obtain the dataset \(\mathcal{D}\)?
Cotrain from different data sources (e.g., sim and real)
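One simple way to cotrain is to mix the two data sources inside each minibatch. A sketch, where `alpha` (the fraction of real data per batch) is an illustrative hyperparameter, not a value from the talk:

```python
import numpy as np

def sample_cotrain_batch(D_real, D_sim, batch_size, alpha, rng):
    """Draw one minibatch mixing real and simulated examples.

    Each example is real with probability `alpha`, simulated otherwise.
    """
    n_real = rng.binomial(batch_size, alpha)
    idx_r = rng.integers(len(D_real), size=n_real)
    idx_s = rng.integers(len(D_sim), size=batch_size - n_real)
    return [D_real[i] for i in idx_r] + [D_sim[i] for i in idx_s]

rng = np.random.default_rng(0)
D_real = [("real", i) for i in range(50)]   # e.g. 50 real demos
D_sim = [("sim", i) for i in range(500)]    # e.g. 500 sim demos
batch = sample_cotrain_batch(D_real, D_sim, batch_size=32, alpha=0.5, rng=rng)
print(len(batch))  # -> 32
```

The mixing ratio trades off the small, in-distribution real data against the large, cheaper simulated data.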
Goal: Manipulate object to target pose
“Diffusion Policy: Visuomotor Policy Learning via Action Diffusion,” C. Chi et al., RSS 2023
Behavior Cloning Algorithm
Policy
Real-World Data \(\mathcal{D}_{real}\)
Simulated Data \(\mathcal{D}_{sim}\)
1. Global
(A few percent from global optimality)
2. Reliable
(Works 100% of the time)
3. Efficient
(Scales polynomially, not exponentially)
Bilinear (nonconvex) constraints
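The bilinearity comes from products of discrete and continuous variables: schematically, the shortest-path formulation over a graph of convex sets multiplies a binary edge activation \(\varphi_e\) with the continuous point \(x_v\) chosen in each convex set (notation illustrative; see Marcucci et al. for the exact formulation):

```latex
\min_{\varphi,\, x} \;\; \sum_{e=(u,v) \in E} \varphi_e \, \ell_e(x_u, x_v)
\quad \text{s.t.} \quad
x_v \in \mathcal{X}_v \;\; \forall v \in V, \qquad
\varphi_e \in \{0,1\} \;\; \forall e \in E,
```

plus flow-conservation constraints on \(\varphi\). The products of \(\varphi_e\) with \(x_u, x_v\) are the bilinear (nonconvex) terms; the paper obtains a tight convex relaxation of them via perspective functions.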
“Shortest Paths in Graphs of Convex Sets,”
Marcucci et al., SIAM Journal on Optimization, 2024
Caveats
Fundamental progress towards real-time output-feedback control through contact, but not a complete solution... yet
Robot Poses
Rollout plans in sim
(render observations)
Instead of running the plans directly on hardware...
Generate plans (GCS)
Cotrain a diffusion policy on both \(\mathcal{D}_{real}\) and \(\mathcal{D}_{sim}\)
Real-time re-planning
Low-level control
No explicit state estimation;
pixels to actions
50 real demos, 500 sim demos
10 real demos
2000 sim demos
50 real demos
500 sim demos
How impactful is cotraining?
Behavior Cloning Algorithm
Policy
Real-World Data \(\mathcal{D}_{real}\)
Simulated Data \(\mathcal{D}_{sim}\)
GCS
(contact-rich tasks & motion planning)
Optimus
Scaling Up and Distilling Down
(TAMP & semantic reasoning)
Why study co-training?
Some initial insights...
...happy to chat afterwards!
Connections to other projects