Prelude to Reinforcement Learning
MIT 6.800/6.843:
Robotic Manipulation
Fall 2021, Lecture 17
Follow live at https://slides.com/d/yOA4Yeo/live
(or later at https://slides.com/russtedrake/fall21-lec17)
http://www.ai.mit.edu/projects/leglab/robots/robots.html
How do you represent your motions?
(continuous time/state/action)
Task-level: Planning
(discrete/symbolic)
the policy needs to know the
state of the robot × the state of the environment
Levine*, Finn*, Darrell, Abbeel, JMLR 2016
System
State-space
Auto-regressive (e.g., ARMAX)
input
output
state
noise/disturbances
parameters
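The two model classes above can be sketched side by side. The matrices, orders, and numbers below are illustrative assumptions, not from the lecture; the point is that a linear state-space model with input, output, state, and parameters admits an equivalent auto-regressive (ARX-style) description relating past inputs and outputs directly:

```python
import numpy as np

# State-space form:  x[n+1] = A x[n] + B u[n],   y[n] = C x[n]
# ARX form (ARMAX with the noise model dropped for clarity):
#   y[n] = a1 y[n-1] + a2 y[n-2] + b1 u[n-1] + b2 u[n-2]
# All parameter values here are made up for illustration.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
C = np.array([[1.0, 0.0]])

def simulate_state_space(u):
    """Roll out y[n] = C x[n] under x[n+1] = A x[n] + B u[n] (no noise)."""
    x, ys = np.zeros(2), []
    for un in u:
        ys.append(float(C @ x))
        x = A @ x + B[:, 0] * un
    return np.array(ys)

def fit_arx(u, y):
    """Least-squares fit of y[n] on (y[n-1], y[n-2], u[n-1], u[n-2])."""
    Phi = np.column_stack([y[1:-1], y[:-2], u[1:-1], u[:-2]])
    theta, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)
    return theta, Phi

rng = np.random.default_rng(0)
u = rng.standard_normal(200)
y = simulate_state_space(u)
theta, Phi = fit_arx(u, y)
# For this 2-state linear system the 2nd-order ARX model is exact,
# so the one-step predictions reproduce the state-space output.
err = float(np.max(np.abs(Phi @ theta - y[2:])))
```

With noise and disturbances added back in, the fit is no longer exact and the noise model (the "MA" part of ARMAX) starts to matter.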
from hand-coded policies in sim
and teleop on the real robot
Standard "behavior-cloning" objective + data augmentation
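A minimal numpy sketch of that objective, under assumed data: a linear "expert" generates demonstrations, the augmentation jitters observations while keeping the expert's actions, and the behavior-cloning loss is the mean-squared error between the policy's action and the demonstrated action. Everything here (the policy class, noise scale, learning rate) is a hypothetical stand-in for the real setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical expert: actions are a fixed linear function of observations.
W_true = np.array([[1.0, -0.5],
                   [0.3,  2.0]])
obs = rng.standard_normal((256, 2))
act = obs @ W_true.T

# Data augmentation: perturb the observations but keep the expert actions,
# so the cloned policy is robust to small perception errors.
obs_aug = np.concatenate([obs, obs + 0.01 * rng.standard_normal(obs.shape)])
act_aug = np.concatenate([act, act])

# Behavior cloning: minimize the MSE between pi(o) = W o and the
# demonstrated action, by plain gradient descent on W.
W = np.zeros((2, 2))
lr = 0.1
for _ in range(500):
    pred = obs_aug @ W.T
    grad = 2.0 * (pred - act_aug).T @ obs_aug / len(obs_aug)
    W -= lr * grad

bc_loss = float(np.mean((obs_aug @ W.T - act_aug) ** 2))
```

The same objective applies unchanged whether the demonstrations come from hand-coded policies in simulation or from teleoperation on the real robot; only the dataset differs.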
"push box"
"flip box"
Policy is a small LSTM network (~100 LSTM units)
https://learning-from-play.github.io/
https://roboturk.stanford.edu/
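To make "a small LSTM network" concrete, here is a single LSTM cell forward pass in numpy. The sizes are assumptions chosen to echo the slide's "~100 LSTM units" (hidden state of dimension 100); the real policies behind the projects above are trained networks, not this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class LSTMCell:
    """Minimal LSTM cell; sizes are hypothetical, not the projects' actual ones."""

    def __init__(self, n_in, n_hidden, rng):
        # One stacked weight matrix for the input, forget, candidate, and
        # output gates, applied to the concatenation [x; h].
        self.W = 0.1 * rng.standard_normal((4 * n_hidden, n_in + n_hidden))
        self.b = np.zeros(4 * n_hidden)

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # gates squashed to (0, 1)
        g = np.tanh(g)                                # candidate cell update
        c = f * c + i * g                             # forget old, write new
        h = o * np.tanh(c)                            # gated hidden output
        return h, c

rng = np.random.default_rng(0)
cell = LSTMCell(n_in=16, n_hidden=100, rng=rng)
h = c = np.zeros(100)
for t in range(5):  # roll the recurrent policy over a short observation history
    h, c = cell.step(rng.standard_normal(16), h, c)
```

The recurrence is what lets such a policy condition on history rather than only the current observation, which matters when the robot's state alone does not capture the state of the environment.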