0

我正在 Reversi 游戏中为 AI 实现 MiniMax 算法,并且我正在尝试添加 Alpha-Beta 修剪方法。如果我理解正确,人工智能应该选择与标准 MiniMax 中完全相同的动作,只是在更短的时间内,这要归功于切断了分支。但是,AI玩的很傻,我试着检查了所有的分支,但是找不到原因。记录在案,没有 alpha-beta 的实现工作正常。这是功能:

public Field alphaBeta (GameBoard gb, int depth, boolean player, int alpha, int beta)
{
    /** maximum depth of search reached, we stop */
    if(depth >= max_depth) return null;

    //player = (depth+1)%2 + 1;

    /** getting a list of moves to chose from */
    ArrayList <Field> moves = findAllPossibleMoves(gb, player); 

    Field best_move = null;

    /** iterating over all possible moves, to find the best one */      
    for (int i=moves.size();--i>=0;)
    {
        /** board to simulate moves */
        GameBoard temp_board = new GameBoard(gb);
        /** getting the current move */
        Field move = moves.get(i);      
        /** simulating the move for the current node */
        game.move(move, temp_board, player);
        //Log.i("board", "Depth:"+depth+" Player:"+player+" Move:"+i+" Rating:"+evaluate(temp_board));
        Log.i("board", ""+moves.get(i).getX()+","+moves.get(i).getY());         
        temp_board.printBoard();
        Field best_deep_move = alphaBeta (gb, depth + 1, !player, alpha, beta);

        if(best_deep_move != null)
        {
            move.setRating(best_deep_move.getRating());
        }       

        /** if the maximum depth is reached, we have a null, so we evaluate */
        else 
        {
            move.setRating(evaluate (temp_board, game.getActive_player()));
        }
  if (depth%2==0) // max
        {
            alpha = Math.max(alpha, move.getRating());
        }
        else
        {
            beta = Math.min(beta, move.getRating()); 
        }   


        /** if we are not the deepest possible, we get the rating from the lower node */

        if(best_move == null) 
        {
            best_move = move;           
        }           
        else
        {   
            Log.i("update", "Current move rating:"+move.getRating());
            Log.i("update", "New move rating:"+best_move.getRating());
            if (depth%2==0)
            {
                Log.i("update", "MAX player");
                /** for us, we look for the maximum */
                if (best_move.getRating() < move.getRating()) 
                {
                    best_move = move;   
                    best_move.alpha = alpha;
                }
                if( beta <= alpha)
                {
                    break;
                }

            }
            else
            {
                Log.i("update", "MIN player");
                /** for the opponent, we look for the minimum */
                if (best_move.getRating() > move.getRating()) 
                {
                    best_move = move;   
                    best_move.beta = beta;
                }
                if( beta <= alpha)
                {
                    break;
                }
            }
            Log.i("update", "Updated move rating"+best_move.getRating());
        }
    }


    return best_move;
}

整个代码非常庞大,所以我不会在这里全部粘贴。只需询问您认为应该查看的一些函数/类,我就会立即粘贴它们。我将不胜感激任何提示。

4

0 回答 0