Slides forked from Russ Tedrake
Image credit: Boston Dynamics
Physics + optimization
DARPA Robotics Challenge
2015
Robots are dancing and starting to do parkour, but...
what about something more useful, like loading the dishwasher?
(for robotics; in a few slides)
Input
Neural Network
ImageNet: released in 2009
Something we couldn't have expected...
(Pre-)Training on ImageNet makes it easier to "learn" to recognize other objects
A sample annotated image from the COCO dataset
Example: Text completion
No extra "labeling" of the data required!
But it's trained on the entire internet...
And it's a really big network
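To make the "no extra labeling" point concrete, here is a minimal sketch of how raw text can supply its own training targets for text completion. The function name `next_token_pairs`, the whitespace tokenization, and the `context_size` parameter are illustrative assumptions, not how any particular model is built; real systems use learned subword tokenizers and much longer contexts.

```python
def next_token_pairs(text, context_size=3):
    """Turn raw text into (context, target) pairs for next-word prediction.

    Assumption: splitting on whitespace stands in for a real tokenizer.
    """
    tokens = text.split()
    pairs = []
    for i in range(1, len(tokens)):
        # The words just before position i are the input ...
        context = tokens[max(0, i - context_size):i]
        # ... and the word at position i is the "label", taken from the text itself.
        pairs.append((context, tokens[i]))
    return pairs


pairs = next_token_pairs("robots are starting to do parkour")
for context, target in pairs:
    print(context, "->", target)
```

Every target here comes straight from the sentence, so no human annotation step is needed; more text simply yields more training pairs.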
Humans have also put lots of captioned images on the web
...
"A painting of a professor giving a talk at a robotics competition kickoff"
Input:
Output:
"A painting of a cool MIT professor delivering a lecture on robotics and generative AI at the Harvard-MIT Mathematics Tournament (HMMT) inside the Stata Center."
Input:
Output:
Is DALL-E just next-pixel prediction?
Our engineering design process
Open source:
What can you do right now?
https://introml.mit.edu
https://slides.com/shensquared
http://manipulation.mit.edu
http://underactuated.mit.edu