Reinforcement Learning

http://web.stanford.edu/class/cs234/index.html

What is the main difference between RL and other learning approaches?

By Yamaguchi先生 at the English language Wikipedia, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=57295504

Multi-Armed Bandit Problem

Exploration vs exploitation?
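
One common way to make the exploration/exploitation trade-off concrete is an epsilon-greedy agent on a multi-armed bandit. The sketch below is illustrative only: the function name `run_epsilon_greedy`, the Gaussian reward noise, and the default `epsilon`, `steps`, and `seed` values are assumptions, not part of the original slides.

```python
import random


def run_epsilon_greedy(true_means, epsilon=0.1, steps=5000, seed=0):
    """Play a Gaussian bandit with an epsilon-greedy policy (hypothetical helper).

    true_means: assumed mean reward of each arm, unknown to the agent.
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0.0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the highest current estimate.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Assumed reward model: Gaussian noise with unit standard deviation.
        reward = rng.gauss(true_means[arm], 1.0)

        # Incremental update of the sample mean for the pulled arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    est, cnt, total = run_epsilon_greedy([0.1, 0.5, 0.9])
    print("estimated means:", est)
    print("pull counts:", cnt)
```

With `epsilon=0` the agent never explores and can lock onto a poor arm. With `epsilon=1` it never exploits what it has learned. Values in between trade one against the other.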

Markov Decision Process

Why MDP and POMDP?

Source: Wikipedia

Grid World

Discrete vs Continuous states

https://mpatacchiola.github.io/blog/
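
A grid world is a small MDP with discrete states, which makes it easy to write down in full. The sketch below is a minimal deterministic version. The class name `GridWorld`, the 3x4 layout, the goal position, and the step reward of -0.04 are all assumptions chosen for illustration and are not taken from the slides or the linked blog.

```python
from typing import List, Tuple

State = Tuple[int, int]  # (row, col): a discrete state

# Each action maps to a (row delta, col delta) move.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class GridWorld:
    """Deterministic grid world (hypothetical layout and rewards)."""

    def __init__(self, rows: int = 3, cols: int = 4,
                 goal: State = (0, 3), step_reward: float = -0.04):
        self.rows = rows
        self.cols = cols
        self.goal = goal
        self.step_reward = step_reward

    def states(self) -> List[State]:
        # The state space is finite: one state per cell.
        return [(r, c) for r in range(self.rows) for c in range(self.cols)]

    def step(self, state: State, action: str) -> Tuple[State, float, bool]:
        # Move one cell; moves that would leave the grid keep the agent in place.
        dr, dc = ACTIONS[action]
        r = min(max(state[0] + dr, 0), self.rows - 1)
        c = min(max(state[1] + dc, 0), self.cols - 1)
        next_state = (r, c)
        done = next_state == self.goal
        reward = 1.0 if done else self.step_reward
        return next_state, reward, done


if __name__ == "__main__":
    env = GridWorld()
    print(env.step((0, 2), "right"))
```

In a continuous-state problem the `states()` enumeration is no longer possible, which is why tabular methods give way to function approximation there.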

Bellman Equation

https://dnddnjs.gitbooks.io/
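
For reference, the Bellman equations are commonly written as follows for a finite MDP. The notation is an assumption, since the slides do not fix one: policy \pi, transition probabilities P, reward function R, discount factor \gamma, and state-value function V.

```latex
% Bellman expectation equation for a fixed policy \pi
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s'} P(s' \mid s, a)
             \left[ R(s, a, s') + \gamma \, V^{\pi}(s') \right]

% Bellman optimality equation
V^{*}(s) = \max_{a} \sum_{s'} P(s' \mid s, a)
           \left[ R(s, a, s') + \gamma \, V^{*}(s') \right]
```

Both equations express the value of a state recursively, as the immediate reward plus the discounted value of the successor state.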

Credit Assignment?
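
One common way to frame credit assignment over time is the discounted return, which passes a later reward back to the earlier steps that led to it. The sketch below computes it for a single episode. The function name `discounted_returns` and the default `gamma` are assumptions for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step t of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # A single reward at the end is credited, with discount, to earlier steps.
    print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
```

A smaller `gamma` gives less credit to actions taken long before the reward arrived.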

Algorithms

Multi-Agent RL

Temporal Credit Assignment?