Demystifying AlphaGo

David Leonard

May 3, 2016

State Space

State space of Chess

\Omega(10^{120})

208,168,199,381,979,984,699,478,633,344,862,770,286,522,453,884,530,548,425,639,456,820,927,419,612,738,015,378,525,648,451,698,519,643,907,259,916,015,628,128,546,089,888,314,427,129,715,319,317,557,736,620,397,247,064,840,935.

Number of legal Go moves

250^{150} \approx 4.9 \times 10^{359}
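
The size of 250^{150} can be checked directly with arbitrary-precision integers. The sketch below assumes that 250 is the approximate number of legal moves per position and 150 the approximate number of moves per game; the variable names are illustrative and do not come from the slides.

```python
import math

# Assumption: 250 ~ legal moves per position, 150 ~ moves per game.
BRANCHING_FACTOR = 250
GAME_LENGTH = 150

# Python integers are arbitrary precision, so this value is exact.
sequences = BRANCHING_FACTOR ** GAME_LENGTH

# Number of decimal digits in the exact value.
digits = len(str(sequences))

# Base-10 exponent, computed without building the big integer.
exponent = GAME_LENGTH * math.log10(BRANCHING_FACTOR)

print(digits)              # 360
print(round(exponent, 2))  # 359.69
```

A 360-digit number is roughly 10^{240} times larger than the 10^{120} bound quoted for chess above.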

Terminology

Monte Carlo Tree Search

Convolutional Neural Networks

Policy Network

p(a|s)

[0.1, 0.25, 0.50, 0.05, 0.10]

[1, 4, 90, 42, 18]
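
A policy output such as the probability vector above can be read as a distribution over candidate moves. The following is a minimal sketch of two ways to turn it into a move, greedy selection and sampling. The move labels "A" to "E" and both function names are hypothetical placeholders, not part of the slides or of AlphaGo itself.

```python
import random

# Move probabilities p(a|s), copied from the slide.
probs = [0.1, 0.25, 0.50, 0.05, 0.10]

# Hypothetical labels for the five candidate moves (assumption).
moves = ["A", "B", "C", "D", "E"]

# A valid distribution sums to 1 (up to floating-point error).
assert abs(sum(probs) - 1.0) < 1e-9


def greedy_move(moves, probs):
    """Return the move with the highest probability."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return moves[best]


def sample_move(moves, probs, rng):
    """Draw one move at random, weighted by its probability."""
    return rng.choices(moves, weights=probs, k=1)[0]


print(greedy_move(moves, probs))  # C
print(sample_move(moves, probs, random.Random(0)))
```

Greedy selection always returns the same move for a given position, while sampling keeps some variety in play, which matters when a policy generates its own training games.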

Value Network

v_{\theta}(s')

Training

Supervised Learning of the Policy Network

Reinforcement Learning of the Policy Network

Reinforcement Learning of the Value Network

Results

Questions

Artificial Intelligence II Final Presentation

Exploring Google DeepMind's AlphaGo