How to implement efficient Alpha-Beta pruning Game

2019-04-02 18:32发布

问题:

I'm trying to learn about artificial intelligence and how to implement it in a program. The easiest place to start is probably with simple games (in this case Tic-Tac-Toe) and Game Search Trees (recursive calls; not an actual data structure). I found this very useful video on a lecture about the topic.

The problem I'm having is that the first call to the algorithm is taking an extremely long amount of time (about 15 seconds) to execute. I've placed debugging log outputs throughout the code and it seems like it is calling parts of the algorithm an excessive amount of times.

Here's the method for choosing the best move for the computer:

    public Best chooseMove(boolean side, int prevScore, int alpha, int beta){
    Best myBest = new Best(); 
    Best reply;

    if (prevScore == COMPUTER_WIN || prevScore == HUMAN_WIN || prevScore == DRAW){
        myBest.score = prevScore;
        return myBest;
    }

    if (side == COMPUTER){
        myBest.score = alpha;
    }else{
        myBest.score = beta;
    }
    Log.d(TAG, "Alpha: " + alpha + " Beta: " + beta + " prevScore: " + prevScore);
    Move[] moveList = myBest.move.getAllLegalMoves(board);
    for (Move m : moveList){
        String choice;
        if (side == HUMAN){
            choice = playerChoice;
        }else if (side == COMPUTER && playerChoice.equals("X")){
            choice = "O";
        }else{
            choice = "X";
        }
        Log.d(TAG, "Current Move: column- " + m.getColumn() + " row- " + m.getRow());
        int p = makeMove(m, choice, side);
        reply = chooseMove(!side, p, alpha, beta);
        undoMove(m);
        if ((side == COMPUTER) && (reply.score > myBest.score)){
            myBest.move = m;
            myBest.score = reply.score;
            alpha = reply.score;
        }else if((side == HUMAN) && (reply.score < myBest.score)){
            myBest.move = m;
            myBest.score = reply.score;
            beta = reply.score;
        }//end of if-else statement
        if (alpha >= beta) return myBest;
    }//end of for loop
    return myBest;
}

Where the makeMove method makes the move if the spot is empty and returns a value (-1 - human win, 0 - draw, 1 - computer win, -2 or 2 - otherwise). Though I believe the error may be in the getAllLegalMoves method:

    public Move[] getAllLegalMoves(String[][] grid){
    //I'm unsure whether this method really belongs in this class or in the grid class, though, either way it shouldn't matter.
    items = 0;
    moveList = null;
    Move move = new Move();

    for (int i = 0; i < 3; i++){
        for(int j = 0; j < 3; j++){
            Log.d(TAG, "At Column: " + i + " At Row: " + j);
            if(grid[i][j] == null || grid[i][j].equals("")){
                Log.d(TAG, "Is Empty");
                items++;
                if(moveList == null || moveList.length < items){
                    resize();
                }//end of second if statement
                move.setRow(j);
                move.setColumn(i);
                moveList[items - 1] = move;
            }//end of first if statement
        }//end of second loop
    }//end of first loop
    for (int k = 0; k < moveList.length; k++){
        Log.d(TAG, "Count: " + k + " Column: " + moveList[k].getColumn() + " Row: " + moveList[k].getRow());
    }
    return moveList;
}

private void resize(){
    Move[] b = new Move[items];
    for (int i = 0; i < items - 1; i++){
        b[i] = moveList[i];
    }
    moveList = b;
}

To sum it all up: What's causing my call, to choose the best move, to take so long? What am I missing? Is there an easier way to implement this algorithm? Any help or suggestions will be greatly appreciated, thanks!

回答1:

A minimax tree with alpha beta pruning should be visualized as a tree, each node of the tree being a possible move that many turns into the future, and its children being all the moves that can be taken from it.

To be as fast as possible and guarantee you'll only need space linear on number of moves you're looking ahead, you do a depth first search and 'sweep' from one side to another. As in, if you imagine the whole tree being constructed, your program would actually only construct a single strand from lead to root one at a time, and discard any parts of it it is done with.

I'm just going to copy the wikipedia pseudo code at this point because it's really, really succinct and clear:

function alphabeta(node, depth, α, β, Player)         
    if  depth = 0 or node is a terminal node
        return score
    if  Player = MaxPlayer
        for each child of node
            α := max(α, alphabeta(child, depth-1, α, β, not(Player) ))     
            if β ≤ α
                break                             (* Beta cut-off *)
        return α
    else
        for each child of node
            β := min(β, alphabeta(child, depth-1, α, β, not(Player) ))     
            if β ≤ α
                break                             (* Alpha cut-off *)
        return β

Notes:

-'for each child of node' - Rather than editing the state of the current board, create an entirely new board that is the result of applying the move. By using immutable objects, your code will be less prone to bugs and quicker to reason about in general.

-To use this method, call it for every possible move you can make from the current state, giving it depth -1, -Infinity for alpha and +Infinity for beta, and it should start by being the non-moving player's turn in each of these calls - the one that returns the highest value is the best one to take.

It's very very conceptually simple. If you code it right then you never instantiate more than (depth) boards at once, you never consider pointless branches and so on.



回答2:

I am not going to profile your code for you, but since this is such a nice coding kata I wrote a small ai for tic tac toe:

import java.math.BigDecimal;

public class Board {

    /**
     * -1: opponent
     * 0: empty
     * 1: player
     */
    int[][] cells = new int[3][3];

    /**
     * the best move calculated by eval(), or -1 if no more moves are possible
     */
    int bestX, bestY;

    int winner() {
        // row
        for (int y = 0; y < 3; y++) {
            if (cells[0][y] == cells[1][y] && cells[1][y] == cells[2][y]) {
                if (cells[0][y] != 0) {
                    return cells[0][y];
                }
            }
        }

        // column
        for (int x = 0; x < 3; x++) {
            if (cells[x][0] == cells[x][1] && cells[x][1] == cells[x][2]) {
                if (cells[x][0] != 0) {
                    return cells[x][0];
                }
            }
        }

        // 1st diagonal
        if (cells[0][0] == cells[1][1] && cells[1][1] == cells[2][2]) {
            if (cells[0][0] != 0) {
                return cells[0][0];
            }
        }

        // 2nd diagonal
        if (cells[2][0] == cells[1][1] && cells[1][1] == cells[0][2]) {
            if (cells[2][0] != 0) {
                return cells[2][0];
            }
        }

        return 0; // nobody has won
    }

    /**
     * @return 1 if side wins, 0 for a draw, -1 if opponent wins
     */
    int eval(int side) {
        int winner = winner();
        if (winner != 0) {
            return side * winner;
        } else {
            int bestX = -1;
            int bestY = -1;
            int bestValue = Integer.MIN_VALUE;
        loop:
            for (int y = 0; y < 3; y++) {
                for (int x = 0; x < 3; x++) {
                    if (cells[x][y] == 0) {
                        cells[x][y] = side;
                        int value = -eval(-side);
                        cells[x][y] = 0;

                        if (value > bestValue) {
                            bestValue = value;
                            bestX = x;
                            bestY = y;
                            if (bestValue == 1) {
                                // it won't get any better, we might as well stop thinking
                                break loop;
                            }
                        }
                    }
                }
            }
            this.bestX = bestX;
            this.bestY = bestY;
            if (bestValue == Integer.MIN_VALUE) {
                // there were no moves left, it must be a draw!
                return 0;
            } else {
                return bestValue;
            }
        }
    }

    void move(int side) {
        eval(side);
        if (bestX == -1) {
            return;
        }
        cells[bestX][bestY] = side;
        System.out.println(this);

        int w = winner();
        if (w != 0) {
            System.out.println("Game over!");
        } else {
            move(-side);
        }
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder();
        char[] c = {'O', ' ', 'X'};
        for (int y = 0; y < 3; y++) {
            for (int x = 0; x < 3; x++) {
                sb.append(c[cells[x][y] + 1]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        Board b = new Board();
        b.move(1);
        long end = System.nanoTime();
        System.out.println(new BigDecimal(end - start).movePointLeft(9));
    }
}

The astute reader will have noticed I don't use alpha/beta cut-off. Still, on my somewhat dated notebook, this plays through a game in 0.015 seconds ...

Not having profiled your code, I can't say for certain what the problem is. However, you logging each possible move at every node in the search tree might have something to do with it.