FAI · Adversarial Search and Games

Adversarial Search & Games

What kinds of games are we playing?

No randomness: no dice rolls or random shuffles.

Two players and turn-based.

Zero sum.

Perfect information.

Ingredients of a game

Board and pieces (the setup).

The rules describing possible moves.

A win condition.

We want algorithms that compute a strategy:
a recommended move from each state.

Example: Reach 100

Board and pieces (the setup): a number \(N\), initially 0.

The rules describing possible moves: on your turn, add any whole number from 1 to 10 to the current number.

A win condition: the player who reaches 100 wins.
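To make the idea of a strategy for Reach 100 concrete, here is a minimal sketch that works out which totals are winning for the player about to move. It assumes moves are whole numbers from 1 to 10 and that the total may not exceed 100; the names `can_win` and `best_move` are hypothetical helpers, not part of the original slides.

```python
from functools import lru_cache

TARGET = 100    # the winning total
MAX_STEP = 10   # assumed: a move adds a whole number from 1 to MAX_STEP


@lru_cache(maxsize=None)
def can_win(n):
    """True if the player about to move at total n can force a win."""
    for step in range(1, MAX_STEP + 1):
        nxt = n + step
        if nxt == TARGET:
            return True                      # reach the target directly
        if nxt < TARGET and not can_win(nxt):
            return True                      # leave the opponent in a losing total
    return False


def best_move(n):
    """A winning step from total n, or None if every move loses."""
    for step in range(1, MAX_STEP + 1):
        nxt = n + step
        if nxt == TARGET or (nxt < TARGET and not can_win(nxt)):
            return step
    return None


losing_totals = [n for n in range(TARGET) if not can_win(n)]
print(losing_totals)   # totals from which the player to move cannot force a win
print(best_move(0))    # a winning opening move, if one exists
```

Under these assumptions the losing totals are spaced 11 apart, so a player who can move onto one of them can keep doing so until reaching 100.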

Ingredients of a game

Let's look at some other examples.

Boolean Formula Game

Binary Geography

What does it mean to solve a game?

Fully (strongly) solve a game.
No matter what position you are in,
you know whether you can force a win or a draw, and how.

Weakly solve a game.
Both the result and a strategy for achieving it
from the start of the game are known.

Ultra-weakly solve a game.
The perfect-play result is known, but not a strategy for achieving it,
as in Hex or Chomp.

Ingredients of a game

\(S_0\): The initial state, which specifies how the game is set up at the start.

\(\operatorname{Player}(s)\): Defines which player has the move in state \(s\).

\(\operatorname{Actions}(s)\): Returns the set of legal moves in state \(s\).

\(\operatorname{Result}(s, a)\): The transition model, which defines the state that results from playing move \(a\) in state \(s\).

\(\operatorname{Terminal-Test}(s)\): A terminal test, which is true when the game is over and false otherwise. States where the game has ended are called terminal states.

\(\operatorname{Utility}(s, p)\): A utility function, which defines the final numeric value to player \(p\) when the game ends in terminal state \(s\).
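As an illustration, the six ingredients can be written down for the Reach 100 game. This is only a sketch under stated assumptions: moves add a whole number from 1 to 10, the total may not exceed 100, players are numbered 0 and 1, and a win is worth \(+1\) and a loss \(-1\). The `State` class and the lower-case function names are hypothetical.

```python
from dataclasses import dataclass

TARGET = 100    # winning total
MAX_STEP = 10   # assumed: a move adds a whole number from 1 to MAX_STEP


@dataclass(frozen=True)
class State:
    total: int     # the current number N
    to_move: int   # 0 or 1: whose turn it is


S0 = State(total=0, to_move=0)           # initial state


def player(s):
    """Player(s): which player has the move."""
    return s.to_move


def actions(s):
    """Actions(s): legal moves, never overshooting TARGET (an assumption)."""
    return [a for a in range(1, MAX_STEP + 1) if s.total + a <= TARGET]


def result(s, a):
    """Result(s, a): add a to the total and pass the turn."""
    return State(total=s.total + a, to_move=1 - s.to_move)


def terminal_test(s):
    """Terminal-Test(s): the game ends when the total reaches TARGET."""
    return s.total == TARGET


def utility(s, p):
    """Utility(s, p): +1 for the player who reached TARGET, -1 for the other."""
    winner = 1 - s.to_move               # the player who just moved
    return 1 if p == winner else -1


end = result(State(total=95, to_move=0), 5)
print(terminal_test(end), utility(end, 0), utility(end, 1))
```

Making `State` immutable means `result` always builds a new state, which keeps a tree search from accidentally modifying positions it has already visited.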

Ingredients of a game

Traditionally, the two players are called MAX and MIN.

The MIN player is trying to minimize the utility.

The MAX player is trying to maximize the utility.

Zero-sum: the two players' utilities sum to zero, so one player's gain is the other's loss.

A Game Tree

[Figure: game tree. Nodes are labelled with states (numbers, for example 4 at the root down to 0 at the leaves); leaves are labelled with utilities \(+1\) or \(-1\).]

The Min-Max Algorithm

Source: Artificial Intelligence: A Modern Approach, Fourth Edition. Russell and Norvig

This definition of optimal play for MAX assumes that MIN also plays optimally.

What if MIN does not play optimally?

Then MAX will do at least as well as against an optimal player, possibly better.

However, that does not mean that it is always best to play the minimax optimal move when facing a suboptimal opponent.
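A tiny example may help with the last point. The numbers below are invented for illustration, and the "suboptimal" MIN is modelled as choosing uniformly at random, which is an assumption rather than anything stated in the slides.

```python
# MAX picks a move; MIN then picks one of the listed leaf values.
# Leaf values are made up for illustration.
tree = {"a": [3, 4], "b": [2, 100]}

# Against an optimal MIN, each move is worth the smallest leaf below it.
minimax_value = {move: min(leaves) for move, leaves in tree.items()}

# Against a MIN that picks uniformly at random (assumed model),
# each move is worth the average of the leaves below it.
random_value = {move: sum(leaves) / len(leaves) for move, leaves in tree.items()}

best_vs_optimal = max(minimax_value, key=minimax_value.get)
best_vs_random = max(random_value, key=random_value.get)

print(minimax_value, best_vs_optimal)
print(random_value, best_vs_random)
```

The minimax move `a` guarantees at least 3 against any opponent, yet against this particular random opponent move `b` has the higher average payoff.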

The minimax algorithm performs a complete depth-first exploration of the game tree.

If the maximum depth of the tree is \(m\) and
there are \(b\) legal moves at each point,
then the time complexity of the minimax algorithm is \(O\left(b^m\right)\).

The space complexity is \(O(m)\), proportional to the depth of the tree.
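The depth-first exploration described above can be sketched in a few lines. This is a minimal version over an explicit tree rather than a full game: a nested list stands for a node with children, a number stands for a terminal utility, and the leaf values are illustrative. The `counter` argument is a hypothetical addition used only to count evaluated leaves.

```python
def minimax(node, maximizing, counter):
    """Return the minimax value of node, visiting every leaf depth-first."""
    if not isinstance(node, list):       # leaf: a terminal utility
        counter[0] += 1
        return node
    values = [minimax(child, not maximizing, counter) for child in node]
    return max(values) if maximizing else min(values)


# MAX moves at the root, MIN at the three nodes below it (b = 3, m = 2).
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
leaves_seen = [0]
value = minimax(tree, True, leaves_seen)
print(value, leaves_seen[0])
```

With \(b = 3\) and \(m = 2\) the search evaluates all \(3^2 = 9\) leaves, while the recursion never holds more than one root-to-leaf path at a time.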

The exponential time complexity makes minimax impractical for complex games;
for example, chess has a branching factor of about 35
and the average game has a depth of about 80 plies (moves by either player),
so it is not feasible to search \(35^{80} \approx 10^{123}\) states.
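As a rough check of that figure (a back-of-the-envelope calculation, using \(\log_{10} 35 \approx 1.544\)):

\(35^{80} = 10^{80 \log_{10} 35} \approx 10^{80 \times 1.544} \approx 10^{123.5}\)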

[Animation: bounds on node values as the search explores, left to right, a game tree with a MAX root and three MIN nodes below it.]

Start: the root is \([-\infty,+\infty]\).

First MIN node: leaf \(3\) gives \([-\infty,3]\); leaves \(12\) and \(8\) do not lower it, so the node is \([3,3]\) and the root becomes \([3,+\infty]\).

Second MIN node: leaf \(2\) gives \([-\infty,2]\). The root is already guaranteed 3, so no further leaves of this node are examined.

Third MIN node: leaf \(14\) gives \([-\infty,14]\) and the root becomes \([3,14]\); leaf \(5\) gives \([-\infty,5]\) and the root becomes \([3,5]\); leaf \(2\) gives \([2,2]\).

End: the root is \([3,3]\).

Does the order of discovery matter?

Source: Artificial Intelligence: A Modern Approach, Fourth Edition. Russell and Norvig
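One way to probe this question is to count how many leaves get evaluated under different child orderings. The sketch below is a plain alpha-beta search over a nested-list tree. The function name `alphabeta`, the list encoding, and the leaf values 4 and 6 in the second branch (which the search never reaches in the original order) are assumptions made for illustration.

```python
import math


def alphabeta(node, alpha, beta, maximizing, visited):
    """Alpha-beta search; records each evaluated leaf in `visited`."""
    if not isinstance(node, list):       # leaf: a terminal utility
        visited.append(node)
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:                # MIN above will never allow this
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v <= alpha:                   # MAX above already has something better
            return v
        beta = min(beta, v)
    return v


def run(tree):
    """Search `tree` with MAX at the root; return (value, leaves evaluated)."""
    visited = []
    value = alphabeta(tree, -math.inf, math.inf, True, visited)
    return value, visited


original = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
reordered = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]   # third branch, smallest leaf first

print(run(original))
print(run(reordered))
```

Here the original order evaluates 7 of the 9 leaves, while listing the third branch's smallest leaf first cuts that to 5. The value at the root is 3 in both cases, so ordering affects the work done, not the answer.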

[Animation: the same tree traced through the algorithm's variables: the window \([\alpha,\beta]\), the current node's value \(v\), and the value and move returned by a child, \(v2\) and \(a2\).]

Root (MAX): starts with \([\alpha,\beta] = [-\infty,+\infty]\) and \(v \longleftarrow -\infty\).

First MIN node: starts with \(v \longleftarrow +\infty\). Leaf \(3\) sets \(v \longleftarrow 3\) and narrows the node's window to \([-\infty,3]\); leaves \(12\) and \(8\) do not change it. The node returns \(v2 \longleftarrow 3\), so the root sets \(v \longleftarrow 3\) and its window becomes \([3,+\infty]\).

Second MIN node: starts with window \([3,+\infty]\) and \(v \longleftarrow +\infty\). Leaf \(2\) sets \(v \longleftarrow 2\), which would give the empty window \([3,2]\), so the node returns \(v2 \longleftarrow 2\) without examining further leaves. The root keeps \(v \longleftarrow 3\).

Exercise: complete the last branch.

Initialization

Other Considerations

Multiplayer Games

Food for thought: collusion and alliances, non-zero-sum games.

Payoff vectors: \(\langle u_A, u_B, u_C \rangle\)

Source: Artificial Intelligence: A Modern Approach, Fourth Edition. Russell and Norvig
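With payoff vectors, each player can back up the child vector that is best in their own component. The sketch below shows that idea for three players who move in turn (A, then B, then C). The function name `maxn`, the tree encoding (lists for internal nodes, tuples for leaves), and all leaf values are invented for illustration.

```python
def maxn(node, player, num_players=3):
    """Back up the payoff vector that the player to move prefers."""
    if isinstance(node, tuple):          # leaf: payoff vector (u_A, u_B, u_C)
        return node
    next_player = (player + 1) % num_players
    child_vectors = [maxn(child, next_player, num_players) for child in node]
    # The player to move compares vectors by their own component only.
    return max(child_vectors, key=lambda vec: vec[player])


# A moves at the root, B one level down, C just above the leaves.
tree = [
    [[(1, 2, 6), (4, 2, 3)], [(6, 1, 2), (7, 4, 1)]],
    [[(5, 1, 1), (2, 5, 2)], [(7, 7, 1), (5, 4, 5)]],
]

print(maxn(tree, player=0))
```

Note that the leaf \((7, 7, 1)\) would suit both A and B, but C never chooses it. This is the kind of situation where alliances could change the outcome, and the sketch does not model them.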

Heuristic Alpha-Beta Tree Search

A heuristic evaluation function \(\operatorname{EVAL}(s, p)\) returns an estimate of the expected utility of state \(s\) to player \(p\).

For terminal states, it must be that

\(\operatorname{EVAL}(s, p)=\operatorname{UTILITY}(s, p)\).

For nonterminal states, the evaluation must be somewhere between a loss and a win:
\(\operatorname{UTILITY}(\text{loss}, p) \leq \operatorname{EVAL}(s, p) \leq \operatorname{UTILITY}(\text{win}, p)\).
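These two conditions can be seen in a small depth-limited search. The sketch below uses the Reach 100 game, assuming moves of 1 to 10 that may not overshoot 100. It is written in negamax style, so values are always from the point of view of the player about to move rather than taking an explicit player argument. The heuristic inside `evaluate` is a made-up guess, and both function names are hypothetical.

```python
import math

TARGET, MAX_STEP = 100, 10   # assumed rules of Reach 100
WIN, LOSS = 1.0, -1.0        # assumed utilities


def evaluate(n):
    """EVAL for the player about to move at total n."""
    if n == TARGET:
        return LOSS                      # terminal: the opponent just reached TARGET
    # Hypothetical heuristic: mildly optimistic when a win is within reach.
    guess = 0.5 if TARGET - n <= MAX_STEP else 0.0
    return max(LOSS, min(WIN, guess))    # keep the estimate between loss and win


def h_alphabeta(n, depth, alpha=-math.inf, beta=math.inf):
    """Depth-limited alpha-beta: call evaluate at the cutoff or at a terminal state."""
    if n == TARGET or depth == 0:
        return evaluate(n)
    v = -math.inf
    for step in range(1, MAX_STEP + 1):
        if n + step > TARGET:
            break
        # The opponent's value, negated, is our value (zero-sum).
        v = max(v, -h_alphabeta(n + step, depth - 1, -beta, -alpha))
        alpha = max(alpha, v)
        if alpha >= beta:
            break                        # cutoff: this line cannot matter
    return v


print(h_alphabeta(89, 1))    # shallow search relies on the heuristic
print(h_alphabeta(89, 2))    # one ply deeper reaches terminal states
```

At depth 1 the search from 89 sees only heuristic estimates and reports a mildly bad position; at depth 2 it reaches terminal states and reports a certain loss, which shows how the cutoff depth and the quality of the evaluation function trade off.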


Other Considerations

Monte-Carlo Tree Search

Games of Chance

Partially Observable Games


More from Neeldhara Misra