有人可以帮我理解如何使用 alpha-beta 修剪算法吗?我正在制作一个类似于连接四的游戏。唯一的区别是没有对角线获胜,玩家可以在任何给定时间标记一个正方形(当然,除非它已经被占领)。我想我理解如何编写算法,我只是觉得我用错了。我一直在做的是有一个看起来像这样的 for 循环
for(i=0; i<size; i++)
for(j=0; j<size; j++)
val = alphabeta();
if(val > max)
max = val;
move = set(i,j);
setBoard(move); //sets the to the returned value from alphabeta()
我遇到的问题是alphabet的第一次运行返回最大值,因此下一个值都不是更大的,并且板将设置为板[0] [0]。有谁知道我做错了什么?
public int alphabeta(Placement place, int depth, int alpha, int beta, boolean maxPlayer)
{
Placement p = null;
if(depth==0 || board.isWinner())
{
return evaluate(place, maxPlayer);
}
if(maxPlayer)
{
int i=0, j=0;
for(i=0; i<board.size; i++)
{
for(j=0; j<board.size; j++)
{
if(board.validMove(i,j)&&(board.canGetFour(i,j, opponent)&&board.canGetFour(i,j,player)))
{
board.board[i][j] = opponent;
p = new Placement(i, j);
alpha = Math.max(alpha, alphabeta(p, depth-1, alpha, beta, false));
board.board[i][j] = 0;
}
if(beta<=alpha)
break;
}
if(beta<=alpha)
break;
}
return alpha;
}
else
{
int i=0, j=0;
for(i=0; i<board.size; i++)
{
for(j=0; j<board.size; j++)
{
if(board.validMove(i,j)&&(board.canGetFour(i,j,opponent)&&board.canGetFour(i,j,player)))
{
board.board[i][j] = player;
p = new Placement(i, j);
beta = Math.min(beta, alphabeta(p, depth-1, alpha, beta, true));
System.out.println(board);
board.board[i][j] = 0;
}
if(beta<=alpha)
break;
}
if(beta<=alpha)
break;
}
return beta;
}
}
这是使移动的功能
public void makeMove()
{
int max = -1;
Placement p = null;
int val = -1;
for(int i=0; i<size; i++)
for(int j=0; j<size; j++)
{
if(board.validMove(i, j))
{
if(board.canGetFour(i, j, opponent)||(board.canGetFour(i,j,player)&&board.canGetFour(i,j,opponent)))
{
board.board[i][j] = player;
val = alphabeta(new Placement(i,j), 5, -5000, 5000, true);
board.board[i][j] = 0;
if(val > max)
{
max = val;
p = new Placement(i, j);
}
}
}
}
board.board[p.row][p.col] = player;
board.moves++;
}
所以这是我更新的代码,仍然无法正常工作
public Placement alphabeta(Placement p)
{
int v = max(p,6,-500000, 500000);
return successors(v);
}
public int max(Placement p, int depth, int alpha, int beta)
{
if(depth == 0 || board.isWinner())
{
return evaluateMax(p,player);
}
int v = -500000;
for(int i=0; i<successors.size(); i++)
{
Placement place = new Placement(successors.get(i));
board.board[place.row][place.col] = player;
v = Math.max(v, min(place, depth-1, alpha,beta));
board.board[place.row][place.col] = 0;
if(v>= beta)
return v;
alpha = Math.max(alpha, v);
}
return v;
}
public int min(Placement p, int depth, int alpha, int beta)
{
if(depth == 0||board.isWinner())
{
return evaluateMax(p,opponent);
}
int v = 500000;
for(int i=0; i<successors.size(); i++)
{
Placement place = new Placement(successors.get(i));
board.board[place.row][place.col] = opponent;
v = Math.min(v, max(place,depth-1, alpha,beta));
board.board[place.row][place.col] = 0;
if(v<= alpha)
return v;
beta = Math.min(alpha, v);
}
return v;
}
public void makeMove()
{
Placement p = null;
for(int i=0; i<successors.size(); i++)
{
Placement temp = successors.get(i);
//board.board[temp.row][temp.col] = player;
p = alphabeta(temp);
//board.board[temp.row][temp.col] = 0;
}
System.out.println("My move is "+p.row + p.col);
board.board[p.row][p.col] = player;
successors.remove(p);
}
我稍微改变了算法,这样我就可以清楚地看到 min 和 max 发生了什么,但是它仍然不能正确播放