RLG Long Talk
March 8, 2024
Adam Wei
Single Maze
Multi-Maze
Motion Planning Experiments
Planar Pushing
Sim2Real Transfer
Scaling Laws & Generalization
Tesla
1X
...if we can collect enough data, we can solve the task
- Shuran at CoRL
1. How can we get robot data at scale?
Big data
Big transfer
Small data
No transfer
Ego-Exo
robot teleop
Open-X
simulation rollouts
Slide credit: Russ
2. Is scaling all we need?
We need to scale and generalize across:
This is a big ask!
Proposal: Begin by exploring questions about scale directly in simulation
Model-Based Challenges
Behavior Cloning...
Single Maze
Multi-Maze
Motion Planning Experiments
Planar Pushing
Sim2Real Transfer
Scaling Laws & Generalization
Goal: Train a policy to navigate a single maze
Dataset:
GCS settings
Goal: Obtain a policy that can navigate arbitrary mazes
Spoiler: Diffusion Policy cannot solve this task
Dataset:
GCS Settings
RGB: 64x64x3, values in [0, 255]
Trinary: 52x52x1, values in {0, 1, 2}
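A minimal sketch of how an RGB maze observation could be reduced to a trinary grid. The labels (0 = free, 1 = wall, 2 = goal), the center-crop from 64x64 to 52x52, and the color thresholds are all illustrative assumptions, not the encoding actually used in these experiments.

```python
import numpy as np

def rgb_to_trinary(rgb, size=52):
    """Map a 64x64x3 RGB maze image to a 52x52 trinary grid.

    Assumed labels: 0 = free space, 1 = wall, 2 = goal.
    Thresholds and crop are placeholders for illustration only.
    """
    # Center-crop 64x64 -> 52x52 (one plausible way to get 52x52).
    off = (rgb.shape[0] - size) // 2
    crop = rgb[off:off + size, off:off + size].astype(np.float32)

    grid = np.zeros((size, size), dtype=np.uint8)        # free = 0
    dark = crop.mean(axis=-1) < 50                       # walls: near-black
    green = (crop[..., 1] > 150) & (crop[..., 0] < 100)  # goal: green
    grid[dark] = 1
    grid[green] = 2
    return grid
```

The trinary grid throws away texture and color nuisance variation, which is one hypothesis for why it might transfer better across mazes than raw RGB.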
Might work for multi-maze?
Thank you to Abhinav
for sharing this work
Similar recipe to video prediction:
1. Visually hallucinate a plan
2. Back out actions from the hallucination
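The two steps above can be sketched as follows. This assumes the hallucinated plan has already been decoded into a sequence of 2D agent positions (one per predicted frame); the consecutive-difference "inverse dynamics" below is a stand-in for whatever learned inverse-dynamics model would be used in practice.

```python
import numpy as np

def actions_from_plan(plan):
    """Back out actions from a visually hallucinated plan.

    `plan` is an assumed (T, 2) array of 2D agent positions decoded
    from the hallucinated frames. The action at step t is the delta
    that moves state_t to state_{t+1} -- a placeholder for a learned
    inverse-dynamics model.
    """
    plan = np.asarray(plan, dtype=np.float64)
    return plan[1:] - plan[:-1]
```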
??
Single Maze
Multi-Maze
Motion Planning Experiments
Planar Pushing
Sim2Real Transfer
Scaling Laws & Generalization
Brief discussion
GCS
RRT
RRT (shortcut)
RRT*
Data quality matters: diffusion is shockingly good at imitating the characteristics of the expert
Ex. RRT
RRT
Single Maze
Multi-Maze
Motion Planning Experiments
Planar Pushing
Sim2Real Transfer
Scaling Laws & Generalization
Learning Perspective
Model-based Perspective
Policy
Training
Data
Trajectory Generation
Example: Push-T Task
Policy
Training
Data
Trajectory Generation
Example: Push-T Task (with Kuka)
Data quality matters!
(at least in the low data regime)
Maybe policies become less sensitive to data quality in the big data regime?
Pushing on the vertices
Pushing on the sides (more robust)
Single Maze
Multi-Maze
Motion Planning Experiments
Planar Pushing
Sim2Real Transfer
Scaling Laws & Generalization
Physics gap
Visual gap
Big data
Big transfer
Small data
No transfer
Ego-Exo
robot teleop
Simulation
Open-X
Simulated Policy
Simulated Data
Trajectory Generation
Real Policy
Real-world Data
Real-world data
Policy
Training
Data
Trajectory Generation
No guidance
With guidance
unconditional score
Cats
Dogs
unconditional score
conditional score
Cats
Dogs
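The standard classifier-free guidance update combines the two scores pictured above. A minimal sketch: with guidance weight w, the guided score extrapolates from the unconditional score toward the conditional one (e.g. pushing samples from the mixed cats-and-dogs distribution toward "dog").

```python
import numpy as np

def cfg_score(score_uncond, score_cond, w):
    """Classifier-free guidance score combination.

    w = 0 recovers the unconditional score, w = 1 the conditional
    score, and w > 1 extrapolates further toward the condition.
    """
    return score_uncond + w * (score_cond - score_uncond)
```

In the sim2real proposal, the condition would be a sim/real label on each trajectory, so the same mechanism could steer a policy trained on mixed data toward the real-world distribution at test time.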
Real-world data collection
Policy
(trained via CFG)
Simulated Data + Labels
Trajectory Generation
(I would also be interested in a more rigorous exploration)
Real-world data + Labels
Sim Data
Real Data
Goal: learn the distributions for both the sim and real data
Caveat: This might not give huge improvements, since the policy may already implicitly distinguish sim from real from the input images alone.
Single Maze
Multi-Maze
Motion Planning Experiments
Planar Pushing
Sim2Real Transfer
Scaling Laws & Generalization
Roughly speaking: For LLMs...
2x data => 2x model size
D = data, N = parameters, C = compute
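A sketch of the compute-optimal allocation behind the "2x data => 2x model size" rule, using the common C ≈ 6·N·D FLOPs approximation and the rounded Chinchilla-style exponents of 0.5 for both N and D. The constant k is a fitted coefficient I leave as a free parameter; the exact fitted values are not reproduced here.

```python
def chinchilla_optimal(C, k=1.0):
    """Compute-optimal parameter/data split under C ~ 6*N*D.

    With both N and D scaling as C**0.5 (rounded Chinchilla-style
    exponents; k absorbs the fitted constant), quadrupling compute
    doubles both the parameter count and the token count, i.e.
    2x data goes with 2x model size.
    """
    N = k * (C / 6) ** 0.5       # parameters
    D = (C / 6) ** 0.5 / k       # training tokens
    return N, D
```

The slide's point stands independently of the exact constants: along the compute-optimal frontier, data and model size grow together, and the open question is whether robot policies follow a similar frontier.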
Super useful for practitioners; however...
Instead, explore scaling and generalization at a smaller scale in simulation:
Collect a planar-pushing dataset with N objects
Questions: If the policy has learned to push n objects...
Push-T
Insertion [IBC]
Sorting [IBC]