PyRL: Reinforcement Learning
Antonin RAFFIN - Imad EL HANAFI
Understanding code
Tools
SublimeText
Git
Libraries:
Matplotlib
Tkinter
http://Gitlab.ensta.fr
Steps
Step 1:
Stupid agents: to understand the code
V value agent: incremental and batch
Q value agent: incremental and batch
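One common reading of "incremental" versus "batch" is sketched below: a batch agent stores every sampled return and averages them at the end, while an incremental agent keeps a running mean that it updates after each sample. The function names, the sample format `(state, action, return)`, and the sample values are assumptions made for illustration; they are not taken from the PyRL code.

```python
from collections import defaultdict


def batch_q(samples):
    """Batch estimate: accumulate all returns, then average per (state, action)."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, action, ret in samples:
        totals[(state, action)] += ret
        counts[(state, action)] += 1
    return {key: totals[key] / counts[key] for key in totals}


def incremental_q(samples):
    """Incremental estimate: update a running mean after every sample."""
    q = defaultdict(float)
    counts = defaultdict(int)
    for state, action, ret in samples:
        key = (state, action)
        counts[key] += 1
        # Running mean: Q <- Q + (return - Q) / n
        q[key] += (ret - q[key]) / counts[key]
    return dict(q)


# Hypothetical samples: (state, action, observed return)
samples = [(0, "right", 1.0), (0, "right", 0.0), (0, "left", -1.0), (1, "right", 2.0)]
batch = batch_q(samples)
incremental = incremental_q(samples)
```

Both functions converge to the same averages on the same data; the incremental form simply avoids storing the full history.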
Step 2:
2D environment
Step 3:
Temporal Difference (TD) agent
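As a minimal sketch of what a TD agent does, the code below applies the standard TD(0) value update on a small one-dimensional chain with a random policy. The chain, the reward of 1 at the last cell, and the parameter values (`alpha`, `gamma`, `episodes`) are assumptions chosen for illustration, not the project's actual settings.

```python
import random


def td0_chain(n_cells=5, episodes=200, alpha=0.1, gamma=0.9, seed=0):
    """Estimate state values on a 1D chain with TD(0) under a random policy."""
    rng = random.Random(seed)
    v = [0.0] * n_cells
    goal = n_cells - 1  # terminal cell, its value stays at 0
    for _ in range(episodes):
        state = 0
        while state != goal:
            move = rng.choice((-1, 1))
            # Moves that would leave the chain keep the agent in place.
            next_state = min(max(state + move, 0), goal)
            reward = 1.0 if next_state == goal else 0.0
            # TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
            v[state] += alpha * (reward + gamma * v[next_state] - v[state])
            state = next_state
    return v


values = td0_chain()
```

Cells closer to the goal should end up with higher estimated values, since the discounted reward reaches them sooner.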
Step 4:
GUI environments
Simple Tetris
Difficulties and solutions
- How to represent an environment? Matrix, list, ...
- Understanding the equations
- How to choose good rewards (for walls, environment limits, ...)?
- Which decay rate for different environments?
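The representation and reward questions above can be made concrete with a small sketch: a 2D environment stored as a matrix (list of lists), with a `step` function that penalizes bumping into a wall or leaving the grid. The layout, the cell codes, and every reward value are hypothetical and chosen only for illustration.

```python
# Hypothetical layout: 0 = free cell, 1 = wall, 2 = goal.
GRID = [
    [0, 0, 0, 2],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

# Assumed reward values, not taken from the project.
STEP_REWARD = -0.1     # small cost per move, favors short paths
BLOCKED_REWARD = -1.0  # hitting a wall or the environment limit
GOAL_REWARD = 10.0


def step(position, action):
    """Return (next_position, reward, done) for one move on GRID."""
    row, col = position
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    outside = not (0 <= new_row < len(GRID) and 0 <= new_col < len(GRID[0]))
    if outside or GRID[new_row][new_col] == 1:
        # Blocked moves leave the agent where it was.
        return position, BLOCKED_REWARD, False
    if GRID[new_row][new_col] == 2:
        return (new_row, new_col), GOAL_REWARD, True
    return (new_row, new_col), STEP_REWARD, False
```

A matrix keeps neighbor lookups simple (`GRID[row][col]`); a flat list would work equally well if each position is mapped to a single index.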
Detailed results and demonstrations:
2D with walls:
Tetris
Learning curves:
Q value: 2D - 20 cells
TD: 2D - 20 cells
Learning curves:
TD: 2D - static walls
TD: 2D - moving walls
Learning curves on simple Tetris:
TD: Tetris - 100 actions - 3 rows
TD: Tetris - 100 actions - 5 rows
TD: Tetris - 100 actions - bad rewards
TD: Tetris - 1000 actions - 5 rows
Conclusion
4-week project
Discovering