MinMax trees - when Min can win in two steps

2019-05-06 03:35发布

问题:

So, I have been playing around with minmax trees to create a simple computer player in a two person board game. I understand the basics of the algorithm, but there is a case that eludes my turkey infused brain ... what happens when MIN can win in two steps?

Example, assume a connect4/tic-tac-toe type game where only one of the two players can own a square. How do I get MAX to occupy a square solely for the purpose of preventing MIN from getting the square?

Let's try a simplified example (shown in nifty ASCII art) where the options are Left and Right. Assume he tree is too big to traverse all the way to the terminal states, so intermediate values are calculated based on a heuristics function (marked with * below). -INF is a terminal state where MIN wins.

                MAX (a)             
                /   \
               A     B
              /       \
           MIN (b)    MIN (c)     
           /  \       /  \
          A    B     A    B
         /     |     |     \
      -INF    *5    *22    *20

MIN is going to pick action A in state (b) for a score of -INF
MIN is going to pick action B in state (c) for a score of +20
MAX is going to pick action B in state (a) for a score of +20

The problem - of course - is that if MAX picks B then MIN will perform action A (since that square is still available) and thus MIN will win. I need to get MAX to realize the value of picking action A in state (a) to prevent MIN from getting -INF in the next move.

I would put a bunch of tests into the code to check if MIN can win, but it seems to me that the algorithm should take care of this. I think I am missing a piece in the determination of the value with regards to MAX which is causing this.

(Edited to clarify)

回答1:

Each node in the minimax tree is a complete game state. When a player picks an action, the game moves to that state, restricting the actions of both players (there is no way to pick another action from another branch). So in your example, if in state (a) player Max picks action B, the game is now in state C. The only two choices for the min player at this point are A(22) and B(20). The depth of the tree does not matter; the max and min players will always pick their best action from the current state of the game.

For the game of tic-tac-toe, each state needs to be a complete board (feasible, of course). For example, the first level will be each possible place X could place their marker. Then each child of those states will be each possible place O can place, given the parent state (where X placed), etc...

Heuristics are helpful when you cannot represent the entire game tree (eg, chess), but don't change how the minimax tree is used.



回答2:

If think the problem is with the heuristics function. As you say, if MAX chooses B in state (a),

MIN will perform action A (since that square is still available) and thus MIN will win

but on the tree you mark this with *22 not -Inf as it should be (MIN wins).