PRM-RL: Long-range Robotic Navigation Tasks by Combining Reinforcement Learning and Sampling-based Planning


Aleksandra Faust, Oscar Ramirez, Marek Fiser, Kenneth Oslund, Anthony Francis, James Davidson, and Lydia Tapia


Presented by Breandan Considine

Workspace

Configuration space

Probabilistic roadmaps

Advantages


  • Very efficient planning representation
  • Provably probabilistically complete
  • Finds a path, if one exists, with probability approaching one as the number of samples grows (a minimal construction sketch follows the disadvantages below)

Disadvantages

  • Produce only reference trajectories, which still have to be executed by a separate controller
  • Typically unaware of task constraints
  • Can suffer from noise in perception and motor control
  • Does not perform well under uncertainty
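As a rough illustration of how a roadmap is built (a minimal sketch, not the paper's implementation), the snippet below samples collision-free configurations and connects nearby ones with a local planner. The callbacks sample_free and is_edge_free and the k-nearest-neighbour connection rule are assumptions made for the example.

  import math

  def build_prm(sample_free, is_edge_free, n_samples=500, k=10):
      """Minimal PRM construction: sample collision-free configurations,
      then try to connect each one to its k nearest neighbours."""
      nodes = [sample_free() for _ in range(n_samples)]      # free-space samples
      edges = {i: set() for i in range(n_samples)}
      for i, q in enumerate(nodes):
          # k nearest neighbours by Euclidean distance in configuration space
          nearest = sorted((j for j in range(n_samples) if j != i),
                           key=lambda j: math.dist(q, nodes[j]))[:k]
          for j in nearest:
              if is_edge_free(q, nodes[j]):                   # local planner check
                  edges[i].add(j)
                  edges[j].add(i)
      return nodes, edges

A query then connects the start and goal to the roadmap and runs a graph search (e.g. Dijkstra or A*) over the edges.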

RL Planners

Advantages

  • Robust to noise and errors
  • Can obey robot dynamics and other task constraints
  • Handles moderate changes to the environment
  • Less computationally expensive to execute than alternatives such as MPC, action filtering, or hierarchical policy approximation


Disadvantages

  • Long-range navigation on complex maps has a sparse reward structure (see the reward sketch after this list)
  • Can be difficult to train
  • Prone to converge on poor local minima
  • Need to carefully select the control and action space
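To make the sparse-reward point concrete, here is one plausible shaped reward for a short-range point-to-point agent, with dense distance shaping plus collision and arrival terms. It is hypothetical, not the paper's exact reward; over a long, cluttered route the arrival bonus is almost never reached by exploration alone, which is what makes end-to-end long-range training hard.

  import numpy as np

  def point_to_point_reward(pos, goal, collided, goal_radius=0.3):
      """Hypothetical shaped reward for a short-range navigation policy."""
      if collided:
          return -1.0                                  # penalise hitting obstacles
      dist = np.linalg.norm(np.asarray(goal) - np.asarray(pos))
      if dist < goal_radius:
          return 1.0                                   # arrival bonus
      return -0.01 - 0.05 * dist                       # step cost + distance shaping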

How can we synthesize these two approaches?
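The paper's answer, in outline: use the learned point-to-point policy as the PRM's local planner, adding an edge between two sampled configurations only if the policy reliably drives the robot between them. The sketch below assumes hypothetical env.reset_to / env.step interfaces and an arbitrary success threshold; it shows the idea rather than reproducing the paper's code.

  def rl_connectable(policy, env, start, goal,
                     n_rollouts=20, min_success_rate=0.9, max_steps=200):
      """Add a roadmap edge only if Monte Carlo rollouts of the learned
      policy reach the goal from the start often enough."""
      successes = 0
      for _ in range(n_rollouts):
          obs = env.reset_to(start, goal)              # hypothetical interface
          for _ in range(max_steps):
              obs, done, reached_goal = env.step(policy(obs))
              if done:
                  successes += int(reached_goal)
                  break
      return successes / n_rollouts >= min_success_rate

At execution time the roadmap supplies a sequence of waypoints and the same RL policy navigates between consecutive waypoints, so the planner handles long-range structure while the policy absorbs noise and robot dynamics.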

How do we train the dynamics model?
