PyRL: Reinforcement Learning with Python

Antonin RAFFIN - Imad EL HANAFI

Understanding code

Tools

Sublime Text

Git

Libraries

Matplotlib

Tkinter

http://gitlab.ensta.fr

Steps

Step 1: Improving agents

Stupid agent: to understand the code

V-value agent: incremental and batch

Q-value agent: incremental and batch
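The difference between the incremental and batch variants can be sketched as follows. This is a minimal illustration, not the project's code: it assumes that "incremental" means updating a running mean after every sample and that "batch" means storing the samples and averaging them on demand. The class names and the use of the immediate reward as the sample are hypothetical.

```python
from collections import defaultdict


class IncrementalValueEstimator:
    """Keeps a running mean of the rewards observed in each state."""

    def __init__(self):
        self.values = defaultdict(float)
        self.counts = defaultdict(int)

    def update(self, state, reward):
        self.counts[state] += 1
        # Move the estimate toward the new sample by 1/n of the error.
        self.values[state] += (reward - self.values[state]) / self.counts[state]


class BatchValueEstimator:
    """Stores every sample and recomputes the mean when asked."""

    def __init__(self):
        self.samples = defaultdict(list)

    def update(self, state, reward):
        self.samples[state].append(reward)

    def value(self, state):
        rewards = self.samples[state]
        return sum(rewards) / len(rewards) if rewards else 0.0
```

Both estimators reach the same mean. The incremental one needs constant memory per state, while the batch one keeps the full history.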

Steps

Step 2: new environment

2D environment

Step 3: a better agent

Temporal-difference (TD) agent
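A temporal-difference agent updates its estimate after every transition instead of waiting for the end of an episode. The sketch below shows the standard tabular TD(0) rule on state values. The slides do not say which TD variant the project uses, so the function name, the parameters `alpha` and `gamma`, and the toy corridor are assumptions made for illustration.

```python
from collections import defaultdict


def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9, done=False):
    """Apply one TD(0) update in place and return the TD error."""
    # Bootstrapped target: reward plus discounted value of the next state.
    target = reward if done else reward + gamma * values[next_state]
    td_error = target - values[state]
    values[state] += alpha * td_error
    return td_error


values = defaultdict(float)
# Toy corridor 0 -> 1 -> 2 (goal), with reward 1 on reaching the goal.
for _ in range(200):
    td0_update(values, 0, 0.0, 1)
    td0_update(values, 1, 1.0, 2, done=True)
```

After enough sweeps, the value of cell 1 approaches the goal reward and the value of cell 0 approaches that reward discounted by `gamma`.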

Steps

Step 4: new environments and GUI

GUI for environments

Variations of 2D grid environment

Simple Tetris

Difficulties and solutions

Which decay rate to use on different environments?

How to represent an environment? Matrix or list...

Understanding the equations

How to choose good rewards (for walls, environment limits...)?

2D with walls

Tetris
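The questions about representation and rewards can be made concrete with a small sketch of a 2D grid with walls stored as a matrix (a list of rows). Everything here is hypothetical: the cell codes, the class name `GridEnvironment`, and the reward values are illustrative choices, not the settings used in the project.

```python
# Cell codes used in the grid matrix (hypothetical).
FREE, WALL, GOAL = 0, 1, 2

# Reward values are illustrative assumptions, not the project's settings.
STEP_REWARD, WALL_REWARD, GOAL_REWARD = -1.0, -5.0, 10.0

MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


class GridEnvironment:
    def __init__(self, grid, start):
        self.grid = grid        # list of rows, each a list of cell codes
        self.position = start   # (row, column)

    def step(self, action):
        """Move the agent and return (new_position, reward, done)."""
        d_row, d_col = MOVES[action]
        row, col = self.position[0] + d_row, self.position[1] + d_col
        inside = 0 <= row < len(self.grid) and 0 <= col < len(self.grid[0])
        if not inside or self.grid[row][col] == WALL:
            # Hitting a wall or the border: stay in place and pay a penalty.
            return self.position, WALL_REWARD, False
        self.position = (row, col)
        if self.grid[row][col] == GOAL:
            return self.position, GOAL_REWARD, True
        return self.position, STEP_REWARD, False


grid = [
    [FREE, WALL, GOAL],
    [FREE, FREE, FREE],
]
env = GridEnvironment(grid, start=(0, 0))
```

In this sketch walls and environment limits are handled the same way: the move is refused and penalised, which is only one of several possible reward designs.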

Results

GUI

Comparing agents

Q-value: 2D - 20 cells

TD: 2D - 20 cells

Results

Comparing environments

TD: 2D - static walls

TD: 2D - moving walls

Results

Learning curves on simple Tetris

TD: Tetris - 100 actions - 3 columns

TD: Tetris - 100 actions - 5 columns

Results

More information taken into account

TD: Tetris - state based on 3 rows

TD: Tetris - state based on 4 rows
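One plausible reading of "state based on N rows" is that the agent only sees the N topmost occupied rows of the board. The sketch below follows that reading. The function `encode_state`, the 0/1 board layout, and the windowing rule are assumptions, since the slides do not describe the actual encoding.

```python
def encode_state(board, n_rows):
    """Return a hashable state built from n_rows rows around the top of the stack."""
    height = len(board)
    # Index of the first row (from the top) that contains a block.
    top = next((i for i, row in enumerate(board) if any(row)), height)
    # Clamp the window so it always stays inside the board.
    start = max(0, min(top, height - n_rows))
    return tuple(tuple(row) for row in board[start:start + n_rows])


board = [
    [0, 0, 0],
    [0, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [1, 1, 1],
]
state_3 = encode_state(board, 3)
state_4 = encode_state(board, 4)
```

With 3 columns, a 3-row window has at most 2**9 = 512 binary configurations and a 4-row window at most 2**12 = 4096, so a richer state also means a larger table to learn.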

Results

Different rewards

TD: Tetris - constant bad reward

TD: Tetris - reward based on the action (good/bad choice)

Conclusion

Discovering RL

First medium-sized project with another person

Deepening our knowledge of Python

Working in English

IN104: PyRL- Reinforcement Learning with Python

By Antonin Raffin
