Bayesian Optimization for Robotics
Roberto Calandra
Facebook AI Research
JSM - 01 July 2019
Goals of the talk
- Explain some of the challenges in Robotics
- Present multiple successful applications of BO in Robotics:
- Learning to walk with a bipedal robot
- Multi-objective BO for navigation with micro-robots
- Hierarchical BO for joint morphology/controller optimization
- Argue why BO is a powerful tool for Robotics
State-of-the-art in Robotics
From YouTube: https://www.youtube.com/watch?v=g0TaYhjpOfo
Why Learning?
Robotics still relies heavily on human expertise!
On the one hand, it is infeasible to hand-design general-purpose controllers
- Human design is time-consuming and relies on prior expertise
- Real-world experiments are expensive and stochastic
On the other hand, there is mistrust for automatic design of controllers
- Not verifiable
- Often finds qualitatively different solutions
- (Maybe a bit of human presumption)
Bayesian Optimization for Policy Search
Policy (i.e., parametrized controller)
Action executed
Learning a controller is equivalent to optimizing the parameters of the controller
Current state
Parameters of the policy
- 0-order
- Stochastic
- Expensive evaluation
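The three properties above (0-order, stochastic, expensive) are what make a model-based optimizer attractive: each trial is used to refine a surrogate of the objective, and the surrogate decides where to try next. Below is a minimal sketch of such a loop, assuming a Gaussian process surrogate with a squared-exponential kernel and an upper-confidence-bound acquisition. These choices, the hyperparameter values, and the function `rollout_return` (a hypothetical stand-in for one robot trial) are illustrative assumptions, not the setup used in the talk.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.2):
    """Unit-variance squared-exponential kernel between the rows of A and B."""
    sq_dists = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale ** 2)

def gp_posterior(X, y, X_query, noise=1e-2):
    """GP posterior mean and standard deviation at X_query (zero prior mean)."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    K_s = rbf_kernel(X, X_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, np.sqrt(var)

def rollout_return(theta, rng):
    """Hypothetical stand-in for one robot trial: a noisy, black-box return."""
    return -np.sum((theta - 0.6) ** 2) + 0.01 * rng.standard_normal()

def bayes_opt(n_trials=25, dim=2, n_candidates=2000, beta=2.0, seed=0):
    """Policy search by BO: fit the surrogate, maximize UCB, run one trial, repeat."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(3, dim))            # initial random trials
    y = np.array([rollout_return(x, rng) for x in X])
    for _ in range(n_trials - len(X)):
        # Random candidates stand in for a proper acquisition optimizer.
        candidates = rng.uniform(0.0, 1.0, size=(n_candidates, dim))
        mean, std = gp_posterior(X, y - y.mean(), candidates)
        theta_next = candidates[np.argmax(mean + beta * std)]
        X = np.vstack([X, theta_next])
        y = np.append(y, rollout_return(theta_next, rng))
    best = np.argmax(y)
    return X[best], y[best], X, y

theta_best, best_return, X, y = bayes_opt()
print("best parameters:", theta_best, "best observed return:", best_return)
```

Only function values are queried (no gradients), the noise term absorbs trial-to-trial variation, and the number of trials is fixed in advance, which matches a robot with a limited hardware budget.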
Learning to Walk with a Bipedal Robot


Bio-inspired Bipedal Robot "Fox":
- Quasi-passive dynamic walker
- 4 Degrees of freedom
- Springs in legs
- Walking in a circle
- Finite-state-machine controller (from biomechanics)
- 8 open parameters
- (Motor lifetime ~200 trials)
[Calandra, R.; Seyfarth, A.; Peters, J. & Deisenroth, M. P. Bayesian Optimization for Learning Gaits under Uncertainty Annals of Mathematics and Artificial Intelligence (AMAI), 2015, 76, 5-23]
Learning to Walk in 80 Trials
Learned model


Not Symmetrical (about 5° difference). Why?
Because it is walking in a circle!
Locomotion as Multi-objective Optimization

Micro-robots


Simulated hexapod:
- 12 Degrees of Freedom (2 per leg)
- No good physics models at that scale
- We use Central Pattern Generators (CPG) as controllers
Question: can we move beyond standard single-objective BO?
[Yang, B.; Wang, G.; Calandra, R.; Contreras, D.; Levine, S. & Pister, K. Learning Flexible and Reusable Locomotion Primitives for a Microrobot IEEE Robotics and Automation Letters (RA-L), 2018, 3, 1904-1911]
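With more than one objective there is generally no single best gait, only a set of trade-offs in which no gait can be improved on one objective without losing on another (the Pareto front). Below is a minimal sketch of the non-dominated filter that extracts this set from evaluated gaits. The two objectives and the numbers in `gaits` are hypothetical, chosen only for illustration, and every objective is assumed to be maximized.

```python
import numpy as np

def pareto_mask(objectives):
    """Boolean mask of non-dominated rows, assuming every objective is maximized."""
    objectives = np.asarray(objectives, dtype=float)
    mask = np.ones(len(objectives), dtype=bool)
    for i, point in enumerate(objectives):
        # Another row dominates `point` if it is at least as good on every
        # objective and strictly better on at least one.
        dominates_point = (np.all(objectives >= point, axis=1)
                           & np.any(objectives > point, axis=1))
        mask[i] = not dominates_point.any()
    return mask

# Hypothetical (speed, negative energy) values for five evaluated gaits.
gaits = np.array([[1.0, -5.0],
                  [2.0, -6.0],
                  [1.5, -7.0],
                  [0.5, -4.0],
                  [2.0, -8.0]])
print(pareto_mask(gaits))  # [ True  True False  True False]
```

The third and fifth gaits are dropped because the second one is at least as fast and uses less energy than both.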
Hard-coded CPG Gaits

Single-objective




Dual Tripod Gait
Multi-objective




Comparison Gaits

Discovering New Gaits

Contextual BO

Learning Locomotion Primitives


- With 50 trials for each of the 5 goal targets, we can learn a fairly accurate model
- The trick was to consider it a contextual BO at training time, and then convert to MOO
Combining Primitives for Navigation
Joint Morphology/Controller Optimization
- In Robotics, there is a tight relationship between morphologies and controllers
- Design of morphologies is a complex and time-consuming process
- Can we automate it?
- Same simulated hexapod as before:
- Each manufacturing round takes about 1 month in the real world...
- ...But we can fabricate multiple different morphology configurations at once (up to 5)
[Liao, T.; Wang, G.; Yang, B.; Lee, R.; Pister, K.; Levine, S. & Calandra, R. Data-efficient Learning of Morphology and Controller for a Microrobot IEEE International Conference on Robotics and Automation (ICRA), 2019]


Hierarchical Process Constrained Batch Bayesian Optimization (HPC-BBO)
Two levels of optimization
(instead of a single bigger optimization)
- Allows weighting the different costs of the two types of parameters
- Each of the two levels uses information from the other level:
- The morphology level considers the best policy achieved for each morphology design
- The controller level uses the morphology as context
- Batch evaluation to reduce fabrication time
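The two-level structure described above can be sketched as a nested loop: an outer level proposes a batch of morphologies, an inner level tunes a controller for each one, and each morphology is scored by the best return its controller achieved. The sketch below shows only this control flow. Random search stands in for BO at both levels, and `walking_return` is a hypothetical toy simulator, so this is not the HPC-BBO algorithm itself.

```python
import numpy as np

def walking_return(morphology, controller):
    """Hypothetical toy simulator: the best controller depends on the morphology."""
    coupling = -np.sum((controller - morphology) ** 2)
    design = -np.sum((morphology - 0.5) ** 2)
    return coupling + design

def optimize_controller(morphology, rng, n_trials=30):
    """Inner level: tune the controller with the morphology held fixed as context.

    Random search is a stand-in here; the talk uses BO at this level.
    """
    controllers = rng.uniform(0.0, 1.0, size=(n_trials, morphology.size))
    returns = np.array([walking_return(morphology, c) for c in controllers])
    best = np.argmax(returns)
    return controllers[best], returns[best]

def hierarchical_search(n_rounds=4, batch_size=5, dim=2, seed=0):
    """Outer level: propose a batch of morphologies per fabrication round."""
    rng = np.random.default_rng(seed)
    history = []  # one (morphology, best controller, best return) per design
    for _ in range(n_rounds):
        # Random proposals stand in for a batch acquisition function.
        batch = rng.uniform(0.0, 1.0, size=(batch_size, dim))
        for morphology in batch:
            controller, best_return = optimize_controller(morphology, rng)
            # The morphology is scored by the best policy found for it.
            history.append((morphology, controller, best_return))
    best_design = max(history, key=lambda entry: entry[2])
    return best_design, history

best_design, history = hierarchical_search()
print("best morphology:", best_design[0], "return:", best_design[2])
```

Keeping the levels separate lets the expensive one (a fabrication round, `n_rounds`) and the cheap one (a controller trial, `n_trials`) have independent budgets, and `batch_size=5` mirrors the limit of five morphologies per fabrication round.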

Results



Top 4 Morphologies


- Exchanging the morphology severely degrades the controller performance.
- This evidence supports the hypothesis that morphology and controller are tightly coupled
Collaborators








Summary
- Gave a glimpse into some challenges of Robotics
- Showed 3 successful applications of BO in Robotics:
- Learning to walk with the bipedal robot "Fox"
- Multi-objective BO for navigation with micro-robots
- Hierarchical BO for joint morphology/controller optimization
- BO is a powerful tool for automatic controller tuning
- Learned models provide useful insight!

Future challenges:
- Safe optimization
- Exploiting more structure from classic control
- Higher-dimensional parameter spaces
Thank you for your time


References
- Calandra, R.; Seyfarth, A.; Peters, J. & Deisenroth, M. P.
Bayesian Optimization for Learning Gaits under Uncertainty
Annals of Mathematics and Artificial Intelligence (AMAI), 2015, 76, 5-23
- Bansal, S.; Calandra, R.; Xiao, T.; Levine, S. & Tomlin, C. J.
Goal-Driven Dynamics Learning via Bayesian Optimization
IEEE Conference on Decision and Control (CDC), 2017, 5168-5173
- Yang, B.; Wang, G.; Calandra, R.; Contreras, D.; Levine, S. & Pister, K.
Learning Flexible and Reusable Locomotion Primitives for a Microrobot
IEEE Robotics and Automation Letters (RA-L), 2018, 3, 1904-1911
- Liao, T.; Wang, G.; Yang, B.; Lee, R.; Pister, K.; Levine, S. & Calandra, R.
Data-efficient Learning of Morphology and Controller for a Microrobot
IEEE International Conference on Robotics and Automation (ICRA), 2019
Learning Curve - Fox

Gait Learning
Bayesian Optimization for Robotics
By Roberto Calandra
Designing and tuning controllers for real-world robots is a daunting task that typically requires significant expertise and lengthy experimentation. Bayesian optimization has been shown to be a successful approach to automating these tasks with little human expertise required. In this talk, I will discuss the main challenges of robot learning and how BO helps to overcome some of them. Using real-world applications where BO proved effective as a showcase, I will also discuss how the challenges encountered in robotics applications can guide the development of new BO algorithms.