Go
Chess
Starcraft
Complexity
"Math", "Machine Learning"
Fucking DeepMind White Papers
Decision Trees
[Figure: decision tree with node scores -0.8, 4.3, 1.4, -2.6, -2.1, -0.8]
State of the art: Alpha-beta pruning & MiniMax*
*These techniques only work if you can properly evaluate the board position
[Figure: game tree with position scores 5.0, 3.2, -1.3, -1.9, -4.8, -3.2, 1.2, -0.9, -1.3]
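A minimal sketch of MiniMax with alpha-beta pruning, to make the idea concrete. The leaf scores are borrowed from the numbers on this slide, but grouping them into three branches of three is an assumption, and the `alphabeta` function and nested-list tree are made up for illustration. Each leaf stands in for "the evaluation of that board position", which is exactly the part that has to be good for this to work.

```python
import math

def alphabeta(node, maximizing=True, alpha=-math.inf, beta=math.inf):
    """Return the minimax value of `node`, skipping branches that cannot matter."""
    # A leaf is just a number: the evaluation of that board position.
    if not isinstance(node, list):
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never let us get here
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # we already have a better option elsewhere
    return best

# Assumed two-ply tree: we pick a branch, then the opponent picks a leaf.
tree = [[5.0, 3.2, -1.3], [-1.9, -4.8, -3.2], [1.2, -0.9, -1.3]]
best_value = alphabeta(tree)
print(best_value)
```

In this tree the second branch is abandoned after its first leaf: -1.9 is already worse than the -1.3 guaranteed by the first branch, so the remaining two leaves are never evaluated.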
How does AlphaZero Work?
Each board state tracks two numbers: the number of times won from that position (exploit) and the number of times visited (explore).
0. Start at the top.
1. Pick the next move that has the highest score, calculated from the explore and exploit numbers.
2. Keep going until we reach a state we haven't seen before.
3. Add each legal move to the tree, and play out X random games from that position.
4. Update the explore and exploit numbers back up the tree.
5. Go to (0).
The explore and exploit numbers are combined into a score. A low explore count (rarely visited) is very good; a high exploit count (many wins) is good.
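The loop above can be sketched in a few dozen lines. Everything concrete here is an assumption: the toy game (a pile of stones, take 1 to 3, whoever takes the last stone wins), the names `Node`, `ucb1` and `search`, and the scoring formula (UCB1, a common choice; the slides do not say which formula is used). Note that this is tree search with random playouts as described on the slide; the published AlphaZero guides the search with a neural network instead of random games.

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones   # stones left; the player to move takes 1-3
        self.parent = parent
        self.move = move       # move that led here from the parent
        self.children = []
        self.wins = 0.0        # "exploit": wins for the player who moved into this node
        self.visits = 0        # "explore": playouts that passed through this node

def legal_moves(stones):
    return [m for m in (1, 2, 3) if m <= stones]

def ucb1(child, c=1.4):
    # Assumed scoring rule: win rate plus a bonus that shrinks with visits.
    if child.visits == 0:
        return math.inf        # never visited: low explore is very good
    bonus = c * math.sqrt(math.log(child.parent.visits) / child.visits)
    return child.wins / child.visits + bonus

def random_playout(stones, rng):
    """Return True if the player to move at `stones` wins a random game."""
    start_player_to_move = True
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return start_player_to_move  # whoever took the last stone wins
        start_player_to_move = not start_player_to_move
    return False  # no stones left: the previous player already won

def search(stones, iterations=2000, playouts=5, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):                # 5. go to (0)
        node = root                            # 0. start at the top
        while node.children:                   # 1-2. follow the best score down
            node = max(node.children, key=ucb1)
        for m in legal_moves(node.stones):     # 3. add each legal move...
            node.children.append(Node(node.stones - m, node, m))
        # ...and play out X random games from this position
        mover_wins = sum(random_playout(node.stones, rng) for _ in range(playouts))
        wins = playouts - mover_wins           # from the view of who moved into `node`
        while node is not None:                # 4. update back up the tree
            node.visits += playouts
            node.wins += wins
            wins = playouts - wins             # flip perspective at each level
            node = node.parent
    return root

root = search(stones=5)
best = max(root.children, key=lambda child: child.visits)
print("take", best.move, "visits", best.visits)
```

Picking the most-visited child at the end, rather than the one with the highest raw win rate, is a common design choice: a move only gets visited a lot if it kept scoring well.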
right?!
a crash course
Good at "fuzzy" recognition.
Needs to be supervised.
It's just a lot of multiplying and adding TBH
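To show the "multiplying and adding" part, here is a rough sketch of a tiny two-layer network's forward pass in plain Python. The weights, biases, layer sizes and the `dense` and `relu` helper names are all made up for illustration; a real network has far more of these numbers, and they are learned from labeled examples rather than typed in by hand.

```python
def dense(inputs, weights, biases):
    """One layer: each output is a weighted sum of the inputs plus a bias."""
    return [
        sum(x * w for x, w in zip(inputs, row)) + b
        for row, b in zip(weights, biases)
    ]

def relu(values):
    """Clamp negatives to zero; this is the only non-arithmetic step."""
    return [max(0.0, v) for v in values]

# Made-up parameters: 2 inputs -> 2 hidden units -> 1 output.
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, -0.5]
out_w = [[1.0, 2.0]]
out_b = [0.5]

x = [2.0, 1.0]
hidden = relu(dense(x, hidden_w, hidden_b))
output = dense(hidden, out_w, out_b)
print(hidden, output)  # [1.0, 1.0] [3.5]
```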
It's good. Like, scary good.
Hidden information, real-time
No defined rules, sloppy inputs
Positive-sum games, co-operation, communication
✅ Beat world champions (Dec 2018)
✅ Outperformed state-of-the-art algos (Mar 2020)
❓ Work started in June 2020
Sorry, I put this together super fast...