PyRL : Reinforcement Learning
Antonin RAFFIN - Imad EL HANAFI
Understanding code



Tools
SublimeText
Git
Libraries :
Matplotlib
Tkinter

http://Gitlab.ensta.fr

Steps

Step 1 :
Stupid agents : To understand the code
V value agent : Incremental and Batch
Q value agent : Incremental and Batch
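
The difference between the incremental and batch variants can be sketched as follows. This is a minimal illustration, not the project's code: the function names and the sample returns are hypothetical. The incremental rule nudges the estimate after every observation, while the batch rule recomputes it from all stored observations. A Q value agent would apply the same rules to a table indexed by (state, action) instead of by state.

```python
def incremental_update(value, target, alpha):
    """Move the current estimate a fraction alpha toward the observed target."""
    return value + alpha * (target - value)


def batch_update(targets):
    """Estimate a value as the plain average of all targets collected so far."""
    return sum(targets) / len(targets)


# Hypothetical returns observed from one state over four episodes.
returns = [1.0, 0.0, 1.0, 1.0]

v_incremental = 0.0
for n, g in enumerate(returns, start=1):
    # With alpha = 1/n the incremental rule reproduces the running average.
    v_incremental = incremental_update(v_incremental, g, alpha=1.0 / n)

v_batch = batch_update(returns)
```

With a step size of 1/n both variants agree on the same data. A constant step size would instead weight recent observations more heavily.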

Step 2 :
2D environment
Step 3 :
Temporal Difference (TD) agent
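
A temporal-difference update can be sketched as below. The slides do not say which TD variant the agent uses, so this assumes the simplest one, TD(0) on state values. The function name, the step size `alpha`, the discount `gamma` and the 3-state chain are all hypothetical.

```python
def td0_update(values, state, reward, next_state,
               alpha=0.1, gamma=0.9, terminal=False):
    """Apply one TD(0) update to values[state] in place; return the TD error."""
    # A terminal next state contributes no future value.
    bootstrap = 0.0 if terminal else values[next_state]
    td_error = reward + gamma * bootstrap - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical chain 0 -> 1 -> 2 (terminal), reward 1.0 on reaching state 2.
values = [0.0, 0.0, 0.0]
for _ in range(200):
    td0_update(values, 0, 0.0, 1)
    td0_update(values, 1, 1.0, 2, terminal=True)
```

After enough episodes, `values[1]` approaches the final reward and `values[0]` approaches that reward discounted once by `gamma`.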

Step 4 :
GUI environments
Tetris simple
Difficulties and solutions


- How to represent an environment: as a matrix or as a list?
- Understanding the equations
- How to choose good rewards for walls, environment limits, etc.?
- Which decay rate to use in different environments?
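
The representation and reward questions above can be made concrete with a small sketch. It assumes the matrix option (a list of rows). The cell codes, the reward values and the `step` function are hypothetical, chosen only to show where the wall and limit rewards enter.

```python
# Cell codes for the matrix representation (hypothetical encoding).
FREE, WALL, GOAL = 0, 1, 2

GRID = [
    [FREE, FREE, FREE, FREE],
    [FREE, WALL, FREE, FREE],
    [FREE, FREE, FREE, GOAL],
]

# Moves as (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Hypothetical reward values; tuning them is the difficulty noted above.
REWARDS = {"step": -1.0, "wall": -5.0, "limit": -5.0, "goal": 10.0}


def step(grid, position, action):
    """Return (new_position, reward, done) for one move on the grid."""
    row, col = position
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col

    # Leaving the grid: the agent stays in place and is penalised.
    if not (0 <= new_row < len(grid) and 0 <= new_col < len(grid[0])):
        return position, REWARDS["limit"], False
    # Hitting a wall: same treatment, with its own reward entry.
    if grid[new_row][new_col] == WALL:
        return position, REWARDS["wall"], False
    if grid[new_row][new_col] == GOAL:
        return (new_row, new_col), REWARDS["goal"], True
    return (new_row, new_col), REWARDS["step"], False
```

Keeping the rewards in one dictionary makes it easy to compare reward choices without touching the transition logic.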
Detailed results and demonstrations :
2D with walls :

Tetris
Learning curves :


Q value : 2D - 20 cells
TD : 2D - 20 cells
TD : 2D - static walls
TD : 2D - moving walls


Learning curves on TETRIS simple :
TD : Tetris - 100 actions - 3 rows
TD : Tetris - 100 actions - 5 rows


TD : Tetris - 100 actions - bad rewards
TD : Tetris - 1000 actions - 5 rows


Conclusion
4-week project
Discovering