PyRL : Reinforcement Learning with Python
Antonin RAFFIN - Imad EL HANAFI
Understanding code
SublimeText
Git
Libraries
Matplotlib
Tkinter
http://gitlab.ensta.fr
Step 1: Improving agents
Stupid agent : To understand the code
V value agent : Incremental and Batch
Q value agent : Incremental and Batch
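The slides do not define "incremental" and "batch". One plausible reading is that the incremental agent refines its estimate after every sample with a running mean, while the batch agent stores the samples and averages them in one pass. The sketch below illustrates that reading for Q values; the class and function names (IncrementalQ, batch_q) and the sample data are made up for illustration and are not taken from the PyRL code.

```python
from collections import defaultdict


class IncrementalQ:
    """Running-mean estimate of Q(s, a), updated after every sample."""

    def __init__(self):
        self.q = defaultdict(float)  # current estimate per (state, action)
        self.n = defaultdict(int)    # number of samples seen per (state, action)

    def update(self, state, action, ret):
        key = (state, action)
        self.n[key] += 1
        # Q <- Q + (G - Q) / n, which keeps Q equal to the mean of the returns
        self.q[key] += (ret - self.q[key]) / self.n[key]


def batch_q(samples):
    """Average all observed returns per (state, action) in a single pass."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for state, action, ret in samples:
        totals[(state, action)] += ret
        counts[(state, action)] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Hypothetical (state, action, return) samples
samples = [(0, "right", 1.0), (0, "right", 0.0), (0, "left", -1.0), (0, "right", 2.0)]

agent = IncrementalQ()
for s, a, g in samples:
    agent.update(s, a, g)

batch = batch_q(samples)
```

On the same samples both versions end with the same averages; the incremental one simply does not need to keep the history in memory.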
Step 2: new environment
Step 3: a better agent
2D environment
Temporal Difference agent
GUI for environments
Variations of 2D grid environment
Tetris simple
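The slides do not say which temporal-difference variant the agent uses. As an illustration only, the sketch below uses tabular Q-learning, one common TD method, on a small 2D grid. The grid size, the reward of 1.0 at the goal, and every name and hyperparameter (step, td_update, train, alpha, gamma, epsilon) are assumptions, not the project's actual code.

```python
import random

ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action, size, goal):
    """Move on a size x size grid; moves off the grid leave the state unchanged."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row = min(max(row + d_row, 0), size - 1)
    new_col = min(max(col + d_col, 0), size - 1)
    next_state = (new_row, new_col)
    reward = 1.0 if next_state == goal else 0.0
    return next_state, reward, next_state == goal


def td_update(q, state, action, reward, next_state, done, alpha, gamma):
    """Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    best_next = 0.0 if done else max(q[next_state].values())
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error


def train(size=4, episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run epsilon-greedy episodes from the top-left cell and return the Q table."""
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    q = {(r, c): {a: 0.0 for a in ACTIONS}
         for r in range(size) for c in range(size)}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(100):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore
            else:
                action = max(q[state], key=q[state].get)  # exploit
            next_state, reward, done = step(state, action, size, goal)
            td_update(q, state, action, reward, next_state, done, alpha, gamma)
            state = next_state
            if done:
                break
    return q
```

Calling train() returns a dictionary that maps each cell to its four action values; the greedy action in a cell is the key with the largest value.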
Step 4: new environments and GUI
Decay rate on different environments ?
How to represent an environment ? Matrix or list ...
Understand equations
How to choose good rewards ? on walls, environment limits ...
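To make the representation and reward questions concrete, here is a minimal sketch of one possible answer: the grid is stored as a matrix (a list of rows), and a move into a wall or past the environment limits leaves the agent in place and returns a penalty. The cell codes and all reward values are hypothetical, chosen only for illustration.

```python
# Hypothetical cell codes and reward values, for illustration only.
FREE, WALL, GOAL = 0, 1, 2

GRID = [
    [FREE, FREE, FREE, FREE],
    [FREE, WALL, WALL, FREE],
    [FREE, FREE, FREE, GOAL],
]

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

REWARD_GOAL = 10.0
REWARD_STEP = -0.1      # small cost per move, to favour short paths
REWARD_BLOCKED = -1.0   # hitting a wall or the border of the grid


def move(grid, position, action):
    """Return (new_position, reward) for one action on the matrix grid."""
    row, col = position
    d_row, d_col = MOVES[action]
    new_row, new_col = row + d_row, col + d_col
    inside = 0 <= new_row < len(grid) and 0 <= new_col < len(grid[0])
    if not inside or grid[new_row][new_col] == WALL:
        return position, REWARD_BLOCKED  # the agent stays where it is
    if grid[new_row][new_col] == GOAL:
        return (new_row, new_col), REWARD_GOAL
    return (new_row, new_col), REWARD_STEP
```

A matrix makes neighbour lookups and wall tests direct index operations; a flat list of cells would work as well but needs an index conversion for every move.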
2D with walls
Tetris
GUI
Comparing agents
Qvalue : 2D - 20 cells
TD : 2D - 20 cells
Comparing environments
TD : 2D - static walls
TD : 2D - moving walls
Learning curves on TETRIS simple
TD : Tetris - 100 actions - 3 columns
TD : Tetris - 100 actions - 5 columns
More information taken into account
TD : Tetris - state based on 3 rows
TD : Tetris - state based on 4 rows
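The slides do not spell out how a state is built from 3 or 4 rows. One possible encoding, sketched below under that caveat, takes a window of n_rows rows starting at the highest row that contains a block and turns it into a hashable tuple usable as a key in a value table. The function name encode_state and the board layout (list of rows, top row first, cells 0 or 1) are assumptions.

```python
def encode_state(board, n_rows):
    """Return a hashable state built from n_rows rows of the board.

    board is a list of rows (top row first), each a list of 0/1 cells,
    with at least n_rows rows. The window starts at the highest row that
    contains a block, or covers the bottom rows when fewer rows remain.
    """
    # Index of the first non-empty row, or len(board) if the board is empty
    top = next((i for i, row in enumerate(board) if any(row)), len(board))
    start = min(top, len(board) - n_rows)
    window = board[start:start + n_rows]
    return tuple(tuple(row) for row in window)


# Hypothetical 3-column board, top row first
board = [
    [0, 0, 0],
    [0, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
]

state_3 = encode_state(board, 3)
state_4 = encode_state(board, 4)
```

With c columns, a window of k rows allows at most 2^(k*c) distinct states, so moving from 3 to 4 rows gives the agent more information but also a larger table to fill.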
Different rewards
TD : Tetris - constant Bad Reward
TD : Tetris - reward based on the action (good/bad choice)
Discovering RL
First medium-sized project with another person
Deepening our knowledge of Python
Working in English