Demystifying AlphaGo

David Leonard

May 3, 2016

State Space

State space of Chess

\Omega 10^{120}

208,168,199,381,979,984,699,478,633,344,862,770,286,522,453,884,530,548,425,639,456,820,927,419,612,738,015,378,525,648,451,698,519,643,907,259,916,015,628,128,546,089,888,314,427,129,715,319,317,557,736,620,397,247,064,840,935.

Number of legal Go moves

250^{150} \approx
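The size of 250^{150} can be checked by working in logarithms instead of building the number itself. A minimal sketch, assuming 250 is the typical number of legal moves per position and 150 the typical game length (the slide does not label the two numbers); `order_of_magnitude` is a hypothetical helper name.

```python
import math

def order_of_magnitude(base: float, exponent: float) -> float:
    """Return log10(base ** exponent) without computing the huge power."""
    return exponent * math.log10(base)

# Assumed reading: branching factor 250, game length 150.
go_log10 = order_of_magnitude(250, 150)

# Exponent quoted on the chess slide (10^120).
chess_log10 = 120

print(f"250^150 is roughly 10^{go_log10:.1f}")
print(f"That is roughly 10^{go_log10 - chess_log10:.0f} times the chess figure")
```

Since log10(250) is about 2.398, the exponent lands just under 360.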

Terminology

Monte Carlo Tree Search
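The four phases usually associated with Monte Carlo Tree Search (selection, expansion, simulation, backpropagation) can be sketched on a toy game. This is a minimal sketch, not AlphaGo's search: it uses plain UCB1 selection and uniformly random rollouts on a hypothetical Nim variant (take 1 to 3 stones, whoever takes the last stone wins). AlphaGo's search additionally guides selection with policy-network priors and evaluates leaves with the value network. The names `Node`, `ucb1`, `rollout`, `mcts` and the exploration constant 1.4 are illustrative assumptions.

```python
import math
import random

class Node:
    def __init__(self, stones, player, parent=None, move=None):
        self.stones = stones      # stones left in the pile
        self.player = player      # player to move at this node (0 or 1)
        self.parent = parent
        self.move = move          # move that led to this node
        self.children = []
        self.visits = 0
        self.wins = 0.0           # wins for the player who moved INTO this node

    def untried_moves(self):
        tried = {child.move for child in self.children}
        return [m for m in (1, 2, 3) if m <= self.stones and m not in tried]

def ucb1(child, parent_visits, c=1.4):
    """Mean win rate plus an exploration bonus for rarely visited children."""
    exploit = child.wins / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def rollout(stones, player, rng):
    """Play uniformly random moves; return the player who takes the last stone."""
    while True:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return player
        player = 1 - player

def mcts(root_stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones, player=0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while node.stones > 0 and not node.untried_moves():
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one untried child.
        if node.stones > 0:
            move = rng.choice(node.untried_moves())
            child = Node(node.stones - move, 1 - node.player, node, move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new node.
        if node.stones == 0:
            winner = 1 - node.player  # the previous mover took the last stone
        else:
            winner = rollout(node.stones, node.player, rng)
        # 4. Backpropagation: credit each node from its mover's perspective.
        while node is not None:
            node.visits += 1
            if node.parent is not None and winner == node.parent.player:
                node.wins += 1
            node = node.parent
    # Recommend the most visited move at the root.
    return max(root.children, key=lambda ch: ch.visits).move

print(mcts(5))
```

In this toy game a pile that is a multiple of 4 is lost for the player to move, so from 5 stones the search should settle on taking 1.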

Convolutional Neural Networks
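The core operation of a convolutional layer, sliding a small filter over a board-shaped input, can be sketched in plain Python. This is an illustration under stated assumptions: a 5 x 5 board instead of 19 x 19, a single input plane, zero padding, and a hand-set hypothetical filter that counts orthogonally adjacent stones. A real network learns its filters and stacks many such layers.

```python
def conv2d_same(plane, kernel):
    """Cross-correlate a square plane with a square odd-sized kernel,
    zero-padding the border so the output has the same shape."""
    n, k = len(plane), len(kernel)
    pad = k // 2
    out = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            acc = 0.0
            for dr in range(k):
                for dc in range(k):
                    rr, cc = r + dr - pad, c + dc - pad
                    if 0 <= rr < n and 0 <= cc < n:  # outside the board counts as 0
                        acc += plane[rr][cc] * kernel[dr][dc]
            out[r][c] = acc
    return out

# One stone in the center of an otherwise empty 5 x 5 board.
board = [[0] * 5 for _ in range(5)]
board[2][2] = 1

# Hypothetical filter: responds to stones directly above, below, left, right.
neighbours = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]

feature_map = conv2d_same(board, neighbours)
for row in feature_map:
    print(row)
```

Each output cell holds the number of stones orthogonally adjacent to that point, which is the kind of local pattern a learned filter can pick up.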

Policy Network

p(a|s)

[0.1, 0.25, 0.50, 0.05, 0.10]

[1, 4, 90, 42, 18]
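A policy output such as the probability vector above can be used in two ways: pick the most likely move, or sample a move in proportion to p(a|s). A minimal sketch, assuming each entry is the probability of one candidate move; the helper names `greedy_move` and `sample_move` are hypothetical.

```python
import random

# p(a|s) over five candidate moves, taken from the slide.
probs = [0.1, 0.25, 0.50, 0.05, 0.10]

def greedy_move(probs):
    """Index of the move with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)

def sample_move(probs, rng):
    """Index of a move drawn in proportion to its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

rng = random.Random(0)
print("greedy:", greedy_move(probs))
print("sampled:", [sample_move(probs, rng) for _ in range(10)])
```

Greedy selection always returns the same move for a given position, while sampling keeps some variety, which matters when the policy plays games against itself.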

Value Network

v_{\theta}(s')

Training

Supervised Learning of the Policy Network
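Supervised learning pushes p(a|s) toward the move an expert actually played, by gradient ascent on log p(a|s). A minimal sketch, assuming a bare softmax over one logit per move stands in for the network; the real policy network is a deep convolutional model, and `sl_step` and the learning rate are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sl_step(logits, expert_move, lr=0.5):
    """One gradient-ascent step on log p(expert_move | s).
    For a softmax, d log p(a) / d logit_i = 1[i == a] - p_i."""
    probs = softmax(logits)
    return [x + lr * ((1.0 if i == expert_move else 0.0) - p)
            for i, (x, p) in enumerate(zip(logits, probs))]

logits = [0.0, 0.0, 0.0, 0.0, 0.0]   # untrained: uniform over five moves
before = softmax(logits)[2]
after = softmax(sl_step(logits, expert_move=2))[2]
print(f"p(expert move) before: {before:.3f}, after: {after:.3f}")
```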

Reinforcement Learning of the Policy Network
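Reinforcement learning reuses the same gradient but weights it by the game outcome z: moves from won games become more likely, moves from lost games less likely. A minimal REINFORCE-style sketch, assuming z is +1 for a win and -1 for a loss and, again, a bare softmax in place of the network; `reinforce_step` is a hypothetical name.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits, action, outcome, lr=0.5):
    """Scale the log-likelihood gradient of the played action by the outcome z."""
    probs = softmax(logits)
    return [x + lr * outcome * ((1.0 if i == action else 0.0) - p)
            for i, (x, p) in enumerate(zip(logits, probs))]

logits = [0.0, 0.0, 0.0]
p_start = softmax(logits)[1]
p_after_win = softmax(reinforce_step(logits, action=1, outcome=+1))[1]
p_after_loss = softmax(reinforce_step(logits, action=1, outcome=-1))[1]
print(f"start {p_start:.3f}, after win {p_after_win:.3f}, after loss {p_after_loss:.3f}")
```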

Reinforcement Learning of the Value Network
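The value network is trained by regression: its prediction v_{\theta}(s) is pushed toward the outcome z of the game the position came from, by reducing the squared error. A minimal sketch, assuming a linear model with a tanh output over three made-up position features; the features, weights, and `value_step` are hypothetical stand-ins for the real network.

```python
import math

def value(weights, features):
    """Predicted outcome in (-1, 1) for a position described by `features`."""
    return math.tanh(sum(w * x for w, x in zip(weights, features)))

def value_step(weights, features, outcome, lr=0.1):
    """One gradient-descent step on 0.5 * (outcome - v)^2.
    d v / d w_i = (1 - v^2) * x_i for a tanh output."""
    v = value(weights, features)
    scale = lr * (outcome - v) * (1.0 - v * v)
    return [w + scale * x for w, x in zip(weights, features)]

features = [1.0, 0.5, -1.0]   # made-up description of a position
weights = [0.0, 0.0, 0.0]
z = 1.0                       # the game from this position was won

v_before = value(weights, features)
weights_after = value_step(weights, features, z)
v_after = value(weights_after, features)
print(f"v before: {v_before:.3f}, v after: {v_after:.3f}, target: {z}")
```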

Results

Questions

Artificial Intelligence II Final Presentation

Exploring Google DeepMind's AlphaGo
