PyRL : Reinforcement Learning

Antonin RAFFIN - Imad EL HANAFI

Understanding code

Tools

SublimeText

Git

Libraries :

Matplotlib

Tkinter

http://Gitlab.ensta.fr

Steps

Step 1 :

Stupid agents : To understand the code

V value agent : Incremental and Batch

Q value agent : Incremental and Batch
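
The difference between the incremental and batch variants can be illustrated with a running average of observed returns. This is a minimal sketch, not the PyRL code: the function names and the sample returns are hypothetical, and it assumes the value estimate is a plain mean of returns.

```python
def batch_mean(returns):
    """Batch estimate: average all stored returns at once."""
    return sum(returns) / len(returns)


def incremental_mean(returns):
    """Incremental estimate: refine the value after each return,
    without keeping the history of returns."""
    value = 0.0
    for n, g in enumerate(returns, start=1):
        value += (g - value) / n  # V <- V + (1/n) * (G - V)
    return value


# Hypothetical returns, for illustration only.
sample_returns = [1.0, 0.0, 2.0, 1.0]
```

Both functions give the same estimate on the same data; the incremental form only needs the current value and a counter.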

Step 2 :

2D environment
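
A 2D environment with walls could be stored as a matrix (a list of rows). The sketch below is an assumption about one possible layout, not the project's implementation: the class name `GridWorld`, the action names, and the reward values (-1 for hitting a wall or the environment limit, -0.1 per move, +10 at the goal) are all hypothetical.

```python
class GridWorld:
    """Grid stored as a matrix: 1 marks a wall, 0 a free cell."""

    ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

    def __init__(self, grid, start, goal):
        self.grid = grid
        self.start = start
        self.goal = goal
        self.state = start

    def reset(self):
        self.state = self.start
        return self.state

    def step(self, action):
        """Return (next_state, reward, done) for the given action."""
        d_row, d_col = self.ACTIONS[action]
        row, col = self.state[0] + d_row, self.state[1] + d_col
        inside = 0 <= row < len(self.grid) and 0 <= col < len(self.grid[0])
        if not inside or self.grid[row][col] == 1:
            # Blocked by a wall or the environment limit: stay in place.
            return self.state, -1.0, False
        self.state = (row, col)
        if self.state == self.goal:
            return self.state, 10.0, True
        return self.state, -0.1, False
```

Penalising walls and limits more than ordinary moves is one way to steer the agent away from them; the right magnitudes depend on the environment.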

Step 3 :

Temporal-difference (TD) agent
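
As an illustration of the temporal-difference idea, here is a sketch of the standard tabular TD(0) value update. It is not taken from the PyRL code: the function name `td0_update`, the state labels, and the default values of `alpha` (learning rate) and `gamma` (discount factor) are assumptions.

```python
from collections import defaultdict


def td0_update(values, state, reward, next_state, done, alpha=0.1, gamma=0.9):
    """One TD(0) step: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))."""
    # At a terminal transition there is no future value to bootstrap from.
    target = reward if done else reward + gamma * values[next_state]
    values[state] += alpha * (target - values[state])
    return values[state]


# Value table with unseen states defaulting to 0.
values = defaultdict(float)
```

Unlike the batch estimate, this update uses the current estimate of the next state, so learning can happen after every action instead of at the end of an episode.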

Step 4 :

GUI environments

Simple Tetris

Difficulties and solutions

How to represent an environment? As a matrix or as a list?
Understanding the equations
How to choose good rewards (for walls, environment limits, ...)?
Which decay rate to use for different environments?
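
The decay-rate question can be made concrete with a small schedule. The slides do not say which quantity is decayed, so this sketch assumes an exponentially decayed rate (for example an exploration or learning rate) with a floor; the function name and parameters are hypothetical.

```python
def decayed_rate(initial, decay, step, minimum=0.0):
    """Exponential decay: initial * decay**step, never below `minimum`."""
    return max(minimum, initial * decay ** step)
```

A larger environment typically needs a slower decay (a `decay` closer to 1) so that the rate does not reach its floor before enough states have been visited.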

Detailed results and demonstrations :

2D with walls :

Tetris

Learning curves :

Q value : 2D - 20 cells

TD : 2D - 20 cells

TD : 2D - static walls

TD : 2D - moving walls

Learning curves on simple Tetris :

TD : Tetris - 100 actions - 3 rows

TD : Tetris - 100 actions - 5 rows

TD : Tetris - 100 actions - bad rewards

TD : Tetris - 1000 actions - 5 rows

Conclusion

4-week project

Discovering
