0

有人可以帮我理解如何使用 alpha-beta 修剪算法吗?我正在制作一个类似于连接四的游戏。唯一的区别是没有对角线获胜,玩家可以在任何给定时间标记一个正方形(当然,除非它已经被占领)。我想我理解如何编写算法,我只是觉得我用错了。我一直在做的是有一个看起来像这样的 for 循环

for(i=0; i<size; i++)
    for(j=0; j<size; j++)
        val = alphabeta();
            if(val > max)
                max = val;
                move = set(i,j);
setBoard(move); //sets the to the returned value from alphabeta()

我遇到的问题是alphabet的第一次运行返回最大值,因此下一个值都不是更大的,并且板将设置为板[0] [0]。有谁知道我做错了什么?

public int alphabeta(Placement place, int depth, int alpha, int beta, boolean maxPlayer)
{

    Placement p = null;
    if(depth==0 || board.isWinner())
    {
        return evaluate(place, maxPlayer);
    }
    if(maxPlayer)
    {
        int i=0, j=0;
        for(i=0; i<board.size; i++)
        {
            for(j=0; j<board.size; j++)
            {
                if(board.validMove(i,j)&&(board.canGetFour(i,j, opponent)&&board.canGetFour(i,j,player)))
                {
                    board.board[i][j] = opponent;
                    p = new Placement(i, j);
                    alpha = Math.max(alpha, alphabeta(p, depth-1, alpha, beta, false));
                    board.board[i][j] = 0;
                }

                if(beta<=alpha)
                    break;
            }
            if(beta<=alpha)
                break;
        }
        return alpha;
    }
    else
    {
        int i=0, j=0;
        for(i=0; i<board.size; i++)
        {
            for(j=0; j<board.size; j++)
            {
                if(board.validMove(i,j)&&(board.canGetFour(i,j,opponent)&&board.canGetFour(i,j,player)))
                {
                    board.board[i][j] = player;
                    p = new Placement(i, j);
                    beta = Math.min(beta, alphabeta(p, depth-1, alpha, beta, true));
                    System.out.println(board);
                    board.board[i][j] = 0;
                }
                if(beta<=alpha)
                    break;
            }
            if(beta<=alpha)
                break;
        }
        return beta;
    }
}

这是使移动的功能

public void makeMove()
{
    int max = -1;
    Placement p = null;
    int val = -1;
    for(int i=0; i<size; i++)
        for(int j=0; j<size; j++)
        {   
            if(board.validMove(i, j))
            {
                if(board.canGetFour(i, j, opponent)||(board.canGetFour(i,j,player)&&board.canGetFour(i,j,opponent)))
                {
                    board.board[i][j] = player;
                    val = alphabeta(new Placement(i,j), 5, -5000, 5000, true);
                    board.board[i][j] = 0;
                    if(val > max)
                    {
                        max = val;
                        p = new Placement(i, j);
                    }
                }
            }
        }
    board.board[p.row][p.col] = player;
    board.moves++;
}

所以这是我更新的代码,仍然无法正常工作

public Placement alphabeta(Placement p)
{
    int v = max(p,6,-500000, 500000);
    return successors(v);
}
public int max(Placement p, int depth, int alpha, int beta)
{
    if(depth == 0 || board.isWinner())
    {
        return evaluateMax(p,player);
    }
    int v = -500000;
    for(int i=0; i<successors.size(); i++)
    {
        Placement place = new Placement(successors.get(i));
        board.board[place.row][place.col] = player;
        v = Math.max(v, min(place, depth-1, alpha,beta));
        board.board[place.row][place.col] = 0;
        if(v>= beta)
            return v;
        alpha = Math.max(alpha, v);
    }
    return v;
}
public int min(Placement p, int depth, int alpha, int beta)
{
    if(depth == 0||board.isWinner())
    {
        return evaluateMax(p,opponent);
    }
    int v = 500000;
    for(int i=0; i<successors.size(); i++)
    {
        Placement place = new Placement(successors.get(i));
        board.board[place.row][place.col] = opponent;
        v = Math.min(v, max(place,depth-1, alpha,beta));
        board.board[place.row][place.col] = 0;
        if(v<= alpha)
            return v;
        beta = Math.min(alpha, v);
    }
    return v;
}
public void makeMove()
{
    Placement p = null;
    for(int i=0; i<successors.size(); i++)
    {
        Placement temp = successors.get(i);
        //board.board[temp.row][temp.col] = player;
        p = alphabeta(temp);
        //board.board[temp.row][temp.col] = 0;
    }
    System.out.println("My move is "+p.row + p.col);
    board.board[p.row][p.col] = player;
    successors.remove(p);
}

我稍微改变了算法,这样我就可以清楚地看到 min 和 max 发生了什么,但是它仍然不能正确播放

4

1 回答 1

0

好的,花了一些时间,但我想我有。

在您的评估函数中,您应该返回状态对实际玩家有多好。如果放置是canGetFour针对“otherPlayer”的,那是一个不好的状态(最坏的状态)。所以你返回一个小数字。但是,如果放置是canGetFour“actualPlayer”的一个,您会返回一个大数字(它的状态很好)。

然后在您的 makeMove 中,您只是检查状态是否是可能的最佳状态。请注意,为此使用 2d 数组几乎是存储“子节点”的效率最低的方法。拥有一个placement.getPossibleMoves() 会更有意义,它返回一个包含所有空方格(真实的和临时的)的数组,并对其进行迭代。否则,您的算法将按电路板大小的顺序呈指数级增长。

private Placement bestNext;
private List<Placement> tempMoves = new ArrayList<>();
private int alpha;
private int beta;

public int alphabeta(Placement place, int depth, boolean maxPlayer)
{
    Placement p = null;
    if(depth == maxDepth){/* (unnasigned squares in actual board) */
       return evaluate(place, maxPlayer)
    }
    int i=0, j=0;
    for(i=0; i<board.size; i++)
    {
        for(j=0; j<board.size; j++)
        {
            if(board.validMove(i,j)){
                p = new Placement(i, j);
                tempMoves.add(placement);
                int tmp = Math.max(alpha, alphabeta(p, depth += 1, actualPlayer.getOpponent()));
                if(maxPlayer){
                  alpha = tmp
                }
                else{
                  beta = tmp
                }
                tempMoves.remove(placement);
            }

            if(beta<=alpha)
                break;
        }
        if(beta<=alpha)
            break;
    }
    return maxPlayer ? alpha : beta;
}
于 2014-12-01T02:55:59.627 回答