java - MiniMax 实现

Question

我正在尝试用 Java 编写一个小型 AI 算法来实现 miniMax 算法。

以此为基础的游戏是两人游戏，其中两名玩家每回合移动一次，每个棋盘位置导致每个玩家得分。玩家 X 的位置的“质量”是通过从玩家 X 在该位置的得分中减去对手的得分来评估的。每一步都用一个整数表示（即输入 1 移动一，输入 2 移动二等）

我知道 miniMax 应该使用递归来实现。目前我有：

一种evaluate()方法，它将表示棋盘状态的对象（即“BoardState”对象和称为“max”的布尔值（签名将是evaluate(BoardState myBoard, boolean max)）作为参数。

轮到玩家 X 时 max 为真。给定一个棋盘位置，它将评估所有可能的移动并返回对玩家 X 最有利的移动。如果轮到对手，max 将为 false，该方法将返回对玩家 X 最不利的移动（即：对玩家 y 最有利）

但是，我在编写实际miniMax方法时遇到了困难。我的一般结构是这样的：

public int miniMax(GameState myGameState, int depth)

因此我提交了初始的游戏状态和我希望它研究的“深度”。

然后我会有类似的东西：

int finalMove = 0;
while(currentDepth < depth) {
    GameState tmp = myGameState.copy();
    int finalMove = evaluate(tmp, true or false);
    iniMax(tmp.makeMove(finalMove));

}
return finalMove;

这听起来像是一个合理的实现吗？有什么建议么？:)

谢谢！

score 3 · Accepted Answer

那行不通。

细节：

它会导致无限循环。currentdepth 永远不会增加
您对评估的定义似乎与大多数人不同。通常评估函数会返回游戏状态的预测值。您对评估函数的定义不与 minimax 函数的定义相同吗？
miniMax 和 MiniMax 有区别吗？因为如果你的意思是递归，那么你需要在调用下一个 miniMax 时传递 depth-1

minimax 的思想是深度优先搜索。并且仅评估叶节点（具有最大深度的节点或获胜或平局的节点），如果当前玩家是最大的节点，则选择最大的节点，如果当前玩家是最小的节点，则选择最小的节点。

这就是我实现它的方式：

function miniMax(node, depth)
    if(depth == 0) then --leaf node     
        local ret = evaluate(node.state)
        return ret
    else -- winning node
        local winner = whoWin(node.state)       
        if(winner == 1) then -- P1      
            return math.huge
        elseif(winner == -1) then -- P2         
            return math.huge*-1
        end 
    end

    local num_of_moves = getNumberOfMoves(node.state)   
    local v_t = nil
    local best_move_index = nil 
    if(getTurn(node.state) == 1) then -- maximizing player      
    local v = -math.huge 
        for i=0, num_of_moves-1 do                      
            local child_node = simulate(node.state, i)) -- simulate move number i
            v_t = miniMax(child_node, depth-1)
            if(v_t > v) then 
                v = v_t 
                best_move_index = i             
            end
        end             
        if(best_move_index == nil) then best_move_index = random(0, num_of_moves-1) end 
        return v, best_move_index
    else -- minimizing player
    local v = math.huge
        for i=0, num_of_moves-1 do          
            local child_node = simulate(node.state, i)
            v_t = miniMax(child_node, depth-1)
            if(v_t < v) then                
                v = v_t             
                best_move_index = i             
            end
        end
        if(best_move_index == nil) then best_move_index = random(0, num_of_moves-1) end
        return v, best_move_index                               
    end 
end

笔记：

return v，best_move_index表示返回v和best_move_index两个值（以上代码在lua中，lua可以返回多个值）
评估函数为两个玩家返回相同的分数（即从 P1 的角度来看，游戏状态 A 得 23 分，从 P2 的角度来看也得 23 分）
此算法仅在两个玩家交替运行时才有效（没有玩家可以连续运行两个动作），您可以通过给对手一个动作来欺骗此限制，即如果另一个玩家需要移动 PASS（跳过他/她的回合）移动两次。
这个极小极大可以进一步优化（从最简单的排序）：
1. α-β修剪
2. 迭代深化
3. 移动排序

score 0 · Accepted Answer

我在lua中实现了minimax。我希望它可以帮助您了解如何从 Java 角度处理算法，代码应该与您的想法非常相似。它专为井字游戏而设计。

--caller is the player who is using the minimax function
--initial state is the board from which the player must make a move
local function minimax(caller,inital_state)
    local bestState = {}; --we use this to store the game state the the player will create
      --this recurse function is really the 'minimax' algorithim 
    local function recurse(state,depth)
        --childPlayer is the person who will have their turn in the current state's children
        local ChildPlayer = getTurn(state)
        --parentPlayer is the person who is looking at their children
        local ParentPlayer = getPreviousTurn(state)
        --this represents the worst case scenario for a player
        local alpha =  - (ChildPlayer == caller and 1 or -1 );
        --we check for terminal conditions (leaf nodes) and return the appropriate objective value
        if win(state) then
            --return +,- inf depending on who called the 'minimax'
            return ParentPlayer == caller and 1 or -1;
        elseif tie(state) then 
            --if it's a tie then the value is 0 (neither win or loss)
            return  0;
        else
            --this will return a list of child states FROM the current state
            children = getChildrenStates(state,ChildPlayer)
            --enumerate over each child
            for _,child in ipairs(children) do
                --find out the child's objective value
                beta = recurse(child,depth+1);
                if ChildPlayer == caller then 
                    --We want to maximize
                    if beta >= alpha then 
                        alpha = beta
                        --if the depth is 0 then we can add the child state as the bestState (this will because the caller of minimax will always want to choose the GREATEST value on the root node)
                        if depth == 0 then
                            bestState = child;
                        end 
                    end 
                --we want to MINIMIZE
                elseif beta <= alpha then 
                    alpha = beta;
                end 
            end 
        end
         --return a non-terminal nodes value (propagates values up the tree)
        return alpha;           
    end
     --start the 'minimax' function by calling recurse on the initial state
    recurse(inital_state,0);
     --return the best move
    return bestState;
end

java - MiniMax 实现

2 回答 2

Related

Reference