Reinforcement Learning

http://web.stanford.edu/class/cs234/index.html

Reinforcement Learning

What is the main difference between RL and other learning approaches?

By Yamaguchi先生 at the English language Wikipedia, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=57295504

Multi-Armed Bandit Problem

Exploration vs exploitation?
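
One common way to trade off exploration against exploitation is an epsilon-greedy agent. The sketch below is illustrative only: the function name `run_epsilon_greedy`, the Bernoulli reward model, and all parameter values are assumptions, not taken from the slides.

```python
import random

def run_epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy agent.

    true_means: hypothetical success probability of each arm.
    Returns (estimated means, pull counts, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # incremental update of the sample mean
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward
```

With `epsilon = 0` the agent never explores and can lock onto a poor arm; with `epsilon = 1` it never exploits what it has learned.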

Markov Decision Process

Why MDP and POMDP?

Source: Wikipedia

Grid World

Discrete vs Continuous states

https://mpatacchiola.github.io/blog/
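
A small grid world with discrete states can be solved exactly with a table of values. The sketch below assumes a hypothetical 3x4 grid, a single terminal goal cell, deterministic moves, and a reward of -1 per step; none of these details come from the slides. A continuous state space would not fit in such a table and would need function approximation instead.

```python
def value_iteration(n_rows=3, n_cols=4, goal=(0, 3), gamma=0.9,
                    step_reward=-1.0, tol=1e-8):
    """Value iteration on a deterministic grid world (illustrative setup)."""
    actions = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    V = {(r, c): 0.0 for r in range(n_rows) for c in range(n_cols)}
    while True:
        delta = 0.0
        for s in V:
            if s == goal:
                continue  # terminal state keeps value 0
            best = float("-inf")
            for dr, dc in actions:
                r, c = s[0] + dr, s[1] + dc
                if not (0 <= r < n_rows and 0 <= c < n_cols):
                    r, c = s  # moving into a wall leaves the agent in place
                best = max(best, step_reward + gamma * V[(r, c)])
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:  # stop once no value changed by more than tol
            return V
```

Cells closer to the goal end up with higher (less negative) values, so acting greedily with respect to `V` moves the agent along a shortest path.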

Bellman Equation

https://dnddnjs.gitbooks.io/
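
For reference, the Bellman equations in their standard textbook form. The notation is an assumption, since the slide shows the equation only as an image: $\pi$ is the policy, $P$ the transition probabilities, $R$ the reward, and $\gamma \in [0, 1)$ the discount factor.

```latex
% Bellman expectation equation for a fixed policy \pi
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)
             \left[ R(s, a, s') + \gamma V^{\pi}(s') \right]

% Bellman optimality equation
V^{*}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)
           \left[ R(s, a, s') + \gamma V^{*}(s') \right]
```

Both express the value of a state as the immediate reward plus the discounted value of the successor state.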

Credit Assignment?

Algorithms

  • TD-learning
  • Q-learning
  • SARSA
  • LSTD
  • LSPI
  • Actor-Critic (Policy Gradient)
  • Deep Reinforcement Learning
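
As an illustration of Q-learning, one of the listed algorithms, here is a minimal tabular sketch. The environment is a hypothetical 5-state corridor with reward 1 for reaching the right end; the function name and all hyperparameters are assumptions chosen for the example.

```python
import random

def q_learning_chain(n_states=5, episodes=500, alpha=0.1, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on a 1-D corridor (illustrative environment)."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            if rng.random() < epsilon:
                a = rng.randrange(2)  # explore
            else:
                # exploit, breaking ties at random
                best = max(Q[s])
                a = rng.choice([i for i in range(2) if Q[s][i] == best])
            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # off-policy TD target: bootstrap from the best next action
            # (Q[goal] is never updated, so it stays 0 at the terminal state)
            target = r + gamma * max(Q[s_next])
            Q[s][a] += alpha * (target - Q[s][a])
            s = s_next
    return Q
```

The reward arrives only at the final step, yet the bootstrapped target propagates it backwards so that earlier states learn to prefer moving right. SARSA would differ in one line: the target would use the next action actually taken rather than the maximum.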

Multi-Agent RL

Temporal Credit Assignment?
