Return bestMove in minimax algorithm for tictactoe

2019-02-02 21:23发布

I have tried to code the minimax algorithm for tic-tac-toe given in Russel Norvig's book on Artificial Intelligence. It had everything except that the way to return the bestMove to the user. I am trying hard to return the bestMove, but cannot decide when to choose the bestMove. Help, anyone?

moveT MiniMax(stateT state)
{
    moveT bestMove;

    max_move(state,bestMove);

    return bestMove;

}

int max_move(stateT state,int & bestMove)
{
    int v = -10000;
    if(GameIsOver(state))
    {
        return EvaluateStaticPosition(state);

    }

    vector<moveT> moveList;
    GenerateMoveList(state, moveList);
    int nMoves = moveList.size();

    for(int i = 0 ; i < nMoves ; i++)
    {
        moveT move = moveList[i];
        MakeMove(state, move);

        int curValue = min_move(state,bestMove);

            if(curValue > v)
            {
              v = curValue;
              bestMove = move;
            }
        RetractMove(state, move);

    }

    return v;

}

int min_move(stateT state, int &bestMove)
{
    int v = 10000;
    if(GameIsOver(state))
    {
      return EvaluateStaticPosition(state);

    }
    vector<moveT> moveList;
    GenerateMoveList(state, moveList);

    int nMoves = moveList.size();

    for(int i = 0 ; i < nMoves; i++)
    {
        moveT move = moveList[i];
        MakeMove(state, move);

        int curValue = max_move(state,depth+1,bestMove);

            if(curValue < v)
            {
              curValue = v;
            }
        RetractMove(state, move);

    }
    return v;
}

P.S.: There are other pseudocode for finding the minmax value. However, they are focused on tic-tac-toe only, I am trying to extend it to other games. Thanks.

Update : The whole code can be found here : http://ideone.com/XPswCl

3条回答
男人必须洒脱
2楼-- · 2019-02-02 21:27

Well, it looks like MiniMax correctly chooses it for you, just call it with an initial state and a depth. (Unless the first player according to the state is the second player, then you should call min_move in MiniMax.)

EDIT: yes, I overlooked something, bestMove currently does not make much sense. In the program within max_move you change the loop like this:

for(int i = 0 ; i < nMoves ; i++)
{
    moveT move = moveList[i];
    MakeMove(state, move);

    int new_value = min_move(state, depth+1);
    if(new_value > v)
    {
      v=new_value;
    }
    RetractMove(state, move);

}

After that you can think about you what bestMove means? My idea is that you are interested in finding one of the "best possible" series of moves for tic-tac-toe. For that you need a vector or even better a stack. But that also means having std::stack<int>* best_moves as the last parameter.

For the stack implementation, in min_move you return the next moves and if their value is the best, you will push your move on the top of the best_moves stack. Of course at the end of the game you just return the empty stack. It takes an OOP approach to pull it off properly, I'll do it when I have some time.

If all you need is merely the best next move then I suggest you change the return types of min_move and max_moe to some struct like this:

struct Value_move{
  int value;
  moveT best_move;
};

Then the new implementation of max_move looks like the following:

const int MOVE_INVALID = -12345;
const int MOVE_NOTHING = -12346;

Value_move max_move(stateT state, int depth)
{
    Value_move best;
    best.value = -10000; best.best_move = MOVE_INVALID;

    if(GameIsOver(state))
    {
        best.value = EvaluateStaticPosition(state);
        best.best_move = MOVE_NOTHING;
        return best;
    }

    vector<moveT> moveList;
    GenerateMoveList(state, moveList);
    int nMoves = moveList.size();

    for(int i = 0 ; i < nMoves ; i++)
    {
        moveT move = moveList[i];
        MakeMove(state, move);
        Value_move curr = min_move(state, depth+1);
        if(curr.value > best.value)
        {
            best.value = curr.value;
            best.best_move = move;
        }
        RetractMove(state, move);

    }

    return v;

}

You only need to pick up the best_move field in the returned struct in the MiniMax function.

REMARK:
You have to admit though this does not resemble a c++ program in many aspects but rather a c program. Otherwise, all the functions in CapitalCamelCase should be class methods, you should pass states by (const) ref instead of value -- but this whole code makes sense only if the status is really a pointer behind a typedef.

查看更多
Ridiculous、
3楼-- · 2019-02-02 21:39

In the simplest version of minimax, the first player wishes to maximize his score, and the second player wishes to minimize the first player's score. Since both first and second player only care about the first player's score, EvaluateStaticPosition should return a value indicating how good the board state is for the first player. Whose turn it is is not relevant.

int EvaluateStaticPosition(stateT state)
{
        if(CheckForWin(state, FIRST_PLAYER))
        {
                return WINNING_POSITION;
        } 
        if(CheckForWin(state, Opponent(FIRST_PLAYER)))
        {
                return LOSING_POSITION;
        } 
        return NEUTRAL_POSITION;
}

Now, when you want the move that's best for the first player, call MaxMove. When you want the move that's best for the second player, call MinMove.

moveT MiniMax(stateT state)
{
    moveT bestMove;
    int i = 0;
    if (state.whoseTurn == FIRST_PLAYER){
        i = MaxMove(state, bestMove);
    }
    else{
        i = MinMove(state,bestMove);
    }
    cout<<"i is "<<i<<endl;
    return bestMove;
}

Finally, you have some problems inside of MinMove and MaxMove. when you assign curRating in either one, you shouldn't pass in bestMove as the second argument to MaxMove or MinMove. It will then put the opponent's best move into bestMove, which doesn't make sense. Instead, declare an opponentsBestMove object and pass that as the second argument. (You won't actually be using the object or even looking at its value afterwards, but that's ok). With that change, you never assign anything to bestMove within MinMove, so you should do so inside the if(curRating < v) block.

int MaxMove(stateT state, moveT &bestMove)
{
        if(GameIsOver(state))
        {
            return EvaluateStaticPosition(state);
        }
        vector<moveT> moveList;
        GenerateMoveList(state, moveList);
        int nMoves = moveList.size();
        int v = -1000;
        for(int i = 0 ;i<nMoves; i++)
        {
                moveT move = moveList[i];
                MakeMove(state, move);
                moveT opponentsBestMove;
                int curRating = MinMove(state, opponentsBestMove);
                if (curRating > v)
                {
                        v = curRating;
                        bestMove = move;
                }
                RetractMove(state, move);
        }
        return v;

}
int MinMove(stateT state,  moveT &bestMove)
{
        if(GameIsOver(state))
        {
                return EvaluateStaticPosition(state);
        }
        vector<moveT>moveList;
        GenerateMoveList(state, moveList);
        int nMoves = moveList.size();
        int v = 1000;
        for(int i = 0 ; i<nMoves; i++)
        {
                moveT move = moveList[i];
                MakeMove(state , move);
                moveT opponentsBestMove;
                int curRating = MaxMove(state,opponentsBestMove);
                if(curRating < v)
                {
                        v = curRating;
                        bestMove = move;
                }
                RetractMove(state, move);
        }
        return v;
}

At this point you should have an unbeatable AI!

The final position looks like this:

 O | O | X
---+---+---
 X | X | O
---+---+---
 O | X | X

Cat's game.

An alternative method takes advantage of the fact that tic-tac-toe is a zero-sum game. In other words, at the end of the game, the sum of the scores of the players will equal zero. For a two player game, this means that one player's score will always be the negative of the other player's. This is convenient for us, since minimizing the other player's score is then identical to maximizing one's own score. So instead of one player maximizing his score and one player minimizing the other player's score, we can just have both players attempt to maximize their own score.

Change EvaluateStaticPosition back to its original form, so that it gives a score based on how good the board state is for the current player.

int EvaluateStaticPosition(stateT state)
{
        if(CheckForWin(state, state.whoseTurn))
        {
                return WINNING_POSITION;
        }
        if(CheckForWin(state, Opponent(state.whoseTurn)))
        {
                return LOSING_POSITION;
        }
        return NEUTRAL_POSITION;
}

Delete MinMove, since we only care about maximizing. Rewrite MaxMove so that it chooses the move that gives the opponent the worst possible score. The score for the best move is the negative of the other player's worst score.

int MaxMove(stateT state, moveT &bestMove)
{
        if(GameIsOver(state))
        {
                return EvaluateStaticPosition(state);
        }
        vector<moveT> moveList;
        GenerateMoveList(state, moveList);
        int nMoves = moveList.size();
        int v = -1000;
        for(int i = 0 ;i<nMoves; i++)
        {
                moveT move = moveList[i];
                MakeMove(state, move);
                moveT opponentsBestMove;
                int curRating = -MaxMove(state, opponentsBestMove);
                if (curRating > v)
                {
                        v = curRating;
                        bestMove = move;
                }
                RetractMove(state, move);
        }
        return v;

}

Since MaxMove is used for both players, we no longer need to distinguish among players in the MiniMax function.

moveT MiniMax(stateT state)
{
    moveT bestMove;
    int i = 0;
    i = MaxMove(state, bestMove);
    cout<<"i is "<<i<<endl;
    return bestMove;
}
查看更多
萌系小妹纸
4楼-- · 2019-02-02 21:42

Your code finds the correct value but then overwrites it by passing the same reference down.

int curValue = min_move(state,bestMove);

should become

moveT nextMove; // No need to actually do anything with this value
int curValue = min_move(state,nextMove);

You also need to make the same kind of change in your min_move function.

NB: in min_move your code calls max_move with more arguments than you've defined for the function.

查看更多
登录 后发表回答