Antonin Raffin
I. Reinforcement Learning 101
II. Learning to drive in minutes
III. Learning to race in hours
Controls
on-board camera
Goal: maximize sum of rewards
Is the car still on the road?
yes: +1
no: -10
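A minimal sketch of the reward rule above and of the quantity being maximized, assuming a hypothetical boolean `on_road` flag supplied by the simulator. The function names are illustrative and not taken from the talk.

```python
def step_reward(on_road: bool) -> float:
    """+1 while the car is still on the road, -10 once it has left it."""
    return 1.0 if on_road else -10.0


def episode_return(on_road_flags) -> float:
    """Sum of rewards over one episode: the quantity the agent maximizes."""
    return sum(step_reward(flag) for flag in on_road_flags)


# Hypothetical episode: three steps on the road, then the car leaves it.
print(episode_return([True, True, True, False]))
```

Staying on the road longer is the only way to raise the return here, which is why this reward alone says nothing about how smoothly the car drives.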
RL Tips and Tricks (RLVS21): https://rl-vs.github.io/rlvs2021/
Primary
Secondary
stay on the track
smooth driving
+1 for every timestep without crash
minimize steering diff
minimal throttle
no steering / constant steering
maximize distance travelled
zig-zag driving
reward = +1 + const × throttle (for every step without crash)
reward = -10 - const × throttle (on crash)
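The throttle-dependent reward can be sketched as a single function. The name `throttle_reward`, the `crashed` flag, and the default value of `const` are assumptions made for illustration; the slides do not give a value for the constant.

```python
def throttle_reward(crashed: bool, throttle: float, const: float = 0.1) -> float:
    """Reward with a throttle term.

    `const` weights the throttle term; 0.1 is an assumed placeholder value.
    """
    if crashed:
        # Crashing at high throttle is penalized more than crashing slowly.
        return -10.0 - const * throttle
    # Each surviving step earns +1 plus a bonus that grows with throttle.
    return 1.0 + const * throttle


# Hypothetical values: half throttle, first while driving, then on a crash.
print(throttle_reward(crashed=False, throttle=0.5))
print(throttle_reward(crashed=True, throttle=0.5))
```

The same constant appears with opposite signs in the two cases, so going faster pays off only as long as the car stays on the track.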
Be careful with the Markov assumption!
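One place this can bite is the "minimize steering diff" term: it depends on the previous steering command, which a single camera frame does not contain. A common remedy is to append the most recent commands to the observation. The sketch below assumes observations are flat lists of floats and actions are (steering, throttle) pairs. The class name `ActionHistory` and the history length are hypothetical, not taken from the talk.

```python
from collections import deque


class ActionHistory:
    """Keeps the last n actions so they can be appended to each observation."""

    def __init__(self, n_actions: int = 2, action_dim: int = 2):
        self.n_actions = n_actions
        self.action_dim = action_dim
        self.reset()

    def reset(self) -> None:
        # Start each episode with zeroed commands.
        self.history = deque(
            [[0.0] * self.action_dim for _ in range(self.n_actions)],
            maxlen=self.n_actions,
        )

    def record(self, action) -> None:
        # The oldest action is dropped automatically because of maxlen.
        self.history.append(list(action))

    def augment(self, observation):
        # Flatten the stored actions (oldest first) and append them.
        flat = [value for action in self.history for value in action]
        return list(observation) + flat


# Hypothetical usage: one (steering, throttle) command, then a 2-value observation.
history = ActionHistory()
history.record((0.3, 0.5))
print(history.augment([0.1, 0.2]))
```

With the previous commands in the observation, a reward that penalizes steering changes can be computed from the current state and action alone.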
live demo