Towards grounding everything in language
Language
Control
Vision
Tactile
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
https://socraticmodels.github.io
Lots of data
Less data
Less data
"Language" as the glue for intelligent machines
Language
Perception
Planning
Control
Some limits of "language" as intermediate representation?
- Only for high level? What about control?
Perception
Planning
Control
Socratic Models
Inner Monologue
PaLI-3, BLIP
PaLM-SayCan
Wenlong Huang et al., 2022
Chinchilla, Sparrow
Imitation? RL?
Engineered?
PaLM-E
Challenge: not a lot of paired language + control data
Code is a linguistic representation of actions
and we have massive amounts of (pretraining) data for it
sax demo
Language models can write code
Code as a medium to express more complex plans
Jacky Liang, Wenlong Huang, Fei Xia, Peng Xu, Karol Hausman, Brian Ichter, Pete Florence, Andy Zeng
code-as-policies.github.io
Code as Policies: Language Model Programs for Embodied Control
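To make "code as a medium to express plans" concrete, here is a minimal sketch of the kind of policy program a language model might write for an instruction such as "put all the blocks on the bowl". The `ToyWorld` class and its method names are hypothetical stand-ins for a robot's perception and control APIs, not the actual interface from the paper; "control" here only updates stored positions.

```python
from dataclasses import dataclass, field


@dataclass
class ToyWorld:
    """Hypothetical stand-in for perception + control on a tabletop."""
    # Object name -> (x, y) position.
    positions: dict = field(default_factory=dict)

    def get_obj_names(self):
        # "Perception": list the objects currently known to the system.
        return list(self.positions)

    def get_obj_pos(self, name):
        # "Perception": look up where an object is.
        return self.positions[name]

    def put_first_on_second(self, obj, target_pos):
        # "Control": a real robot would run pick-and-place here;
        # this toy version just records the new position.
        self.positions[obj] = tuple(target_pos)


def put_blocks_on_bowl(world):
    """Example of a generated policy: loops and conditionals over API calls."""
    bowl_pos = world.get_obj_pos("bowl")
    moved = []
    for name in world.get_obj_names():
        if name.endswith("block"):
            world.put_first_on_second(name, bowl_pos)
            moved.append(name)
    return moved


world = ToyWorld({
    "red block": (0.1, 0.2),
    "blue block": (0.4, 0.1),
    "bowl": (0.3, 0.3),
})
moved = put_blocks_on_bowl(world)
print(moved, world.positions)
```

The point of the sketch is that ordinary control flow (a loop and an `if`) expresses a plan that would be awkward to state as a flat list of language steps.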
SoTA on HumanEval
Use NumPy, SciPy code...
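As an illustration of generated code leaning on third-party numerical libraries, the sketch below uses NumPy (only NumPy, for brevity) to resolve a spatial instruction like "the object closest to the midpoint between the two bowls". The object names, positions, and helper functions are made up for the example.

```python
import numpy as np


def midpoint(pos_a, pos_b):
    """Return the point halfway between two 2D positions."""
    return (np.asarray(pos_a, dtype=float) + np.asarray(pos_b, dtype=float)) / 2.0


def closest_object(target_pos, object_positions):
    """Return the name of the object nearest to target_pos (Euclidean distance)."""
    names = list(object_positions)
    pts = np.array([object_positions[n] for n in names], dtype=float)
    dists = np.linalg.norm(pts - np.asarray(target_pos, dtype=float), axis=1)
    return names[int(np.argmin(dists))]


# Hypothetical scene: positions in meters on a tabletop.
object_positions = {
    "red bowl": (0.0, 0.0),
    "blue bowl": (0.4, 0.2),
    "block": (0.25, 0.1),
}

target = midpoint(object_positions["red bowl"], object_positions["blue bowl"])
nearest = closest_object(target, object_positions)
print(target, nearest)
```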
What are the foundation models for robotics?
Extensions to Code as Policies
1. Fuse visual-language features into a robot map
2. Use code as policies to do various navigation tasks
"Visual Language Maps" Chenguang Huang et al., ICRA 2023
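The two steps above can be sketched in a few lines, under strong simplifying assumptions: visual-language features are toy 3D vectors rather than outputs of a real model, "fusing" is a per-cell average over a 2D grid, and a text query is matched to cells by cosine similarity. The function names and data are illustrative and are not taken from the Visual Language Maps implementation.

```python
import numpy as np


def fuse_features(grid_shape, cell_indices, features):
    """Average all feature vectors that project into each map cell."""
    h, w = grid_shape
    dim = features.shape[1]
    feat_sum = np.zeros((h, w, dim))
    count = np.zeros((h, w, 1))
    for (r, c), f in zip(cell_indices, features):
        feat_sum[r, c] += f
        count[r, c] += 1
    # Cells with no observations keep an all-zero feature.
    return feat_sum / np.maximum(count, 1)


def locate(map_feats, text_embedding):
    """Return the (row, col) of the cell most similar to the text embedding."""
    h, w, dim = map_feats.shape
    flat = map_feats.reshape(-1, dim)
    norms = np.linalg.norm(flat, axis=1) * np.linalg.norm(text_embedding)
    scores = flat @ text_embedding / np.maximum(norms, 1e-8)
    idx = np.unravel_index(int(np.argmax(scores)), (h, w))
    return tuple(int(i) for i in idx)


# Toy embeddings standing in for a shared visual-language feature space.
sofa = np.array([1.0, 0.0, 0.0])
table = np.array([0.0, 1.0, 0.0])

# Three observations: two land in cell (0, 0), one in cell (2, 3).
cells = [(0, 0), (0, 0), (2, 3)]
feats = np.stack([sofa, np.array([0.9, 0.1, 0.0]), table])

vl_map = fuse_features((4, 5), cells, feats)
sofa_cell = locate(vl_map, sofa)
table_cell = locate(vl_map, table)
print(sofa_cell, table_cell)
```

A generated navigation program could then call something like `locate(...)` to turn "go to the sofa" into a map coordinate.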
1. Write code to do visual reasoning
2. Few-shot SoTA improvements on VQA
"Modular VQA via Code Generation" Sanjay Subramanian et al., 2023
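A rough sketch of "writing code to do visual reasoning": a generated program answers a compositional question by calling simple visual primitives and combining their outputs with ordinary Python logic. The primitives `query` and `get_pos` below are toy stand-ins backed by a dictionary, assumed for illustration; in a real system they would wrap vision-language models.

```python
def query(image, question):
    """Toy stand-in for a VQA model answering a simple question about an image."""
    return image["answers"].get(question, "unknown")


def get_pos(image, obj):
    """Toy stand-in for an object localizer returning an (x, y) pixel position."""
    return image["positions"][obj]


def is_mug_left_of_laptop(image):
    """Program a language model might generate for:
    'Is the mug to the left of the laptop?'"""
    if query(image, "Is there a mug?") != "yes":
        return "no"
    if query(image, "Is there a laptop?") != "yes":
        return "no"
    mug_x, _ = get_pos(image, "mug")
    laptop_x, _ = get_pos(image, "laptop")
    return "yes" if mug_x < laptop_x else "no"


# Hypothetical "image", represented by the answers the toy primitives return.
image = {
    "answers": {"Is there a mug?": "yes", "Is there a laptop?": "yes"},
    "positions": {"mug": (120, 300), "laptop": (410, 280)},
}

answer = is_mug_left_of_laptop(image)
print(answer)
```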