3

我正在尝试在 Haskell 中实现Negamax算法。

为此,我代表游戏在玫瑰树中的未来可能性(Data.Tree.Forest (depth, move, position))。但是,通常可以通过两种不同的移动顺序到达某些位置。重新评估重复位置(的子树)是一种浪费(并且很快变得非常慢)。

这是我到目前为止所尝试的:

  • 实现打结的变体以共享常见的子结果。但是,我只能找到为(可能是无限的)列表打结的解释,而没有找到关于重用子树的解释。

  • 我考虑过的另一种方法是在Statemonad 内构建一棵树,其中要保留的状态将是Map (depth, position) (Forest (depth, move, position))执行显式记忆,但到目前为止我也无法正确设置它。

我认为这两种方法都可能存在只能以核心递归方式构建博弈树的问题:我们不是从叶子到根构建树,而是从根向下懒惰地构建一个(可能是无限的)树。


编辑:给你一个我目前正在使用的代码的例子(太慢了):

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
module ZeroSumGame where

import qualified Control.Arrow
import Data.Tree

import Numeric.Natural (Natural)

(|>) :: a -> (a -> b) -> b
x |> f = f x
infixl 0 |>
{-# INLINE (|>) #-}

class Ord s => Game s where
  data Move s
  initial :: s -- | Beginning of the game
  applyMove :: Natural -> s -> Move s -> s -- | Moving from one game state to the next
  possibleMoves :: Natural -> s -> [Move s] -- | Lists moves the current player is able to do.
  isGameOver :: s -> Bool -- | True if the game has ended. TODO: Maybe write default implementation using `possibleMoves state == []`?
  scorePosition :: Natural -> Move s -> s -> Int -- | Turns a position in an integer, for the Negamax algorithm to decide which position is the best.

type Trimove state = (Natural, Move state, state) -- | Depth since start of game, move to next position, new position

gameforest :: Game s => Natural -> s -> Forest (Trimove s)
gameforest start_depth start_state = unfoldForest buildNode (nextpositions start_depth start_state)
  where
    buildNode (depth, move, current_state) =
      if
        isGameOver current_state
      then
        ((depth, move, current_state), [])
      else
        ((depth, move, current_state), nextpositions depth current_state)
    nextpositions depth current_state =
      current_state
      |> possibleMoves depth
      |> fmap (\move -> (succ depth, move, applyMove depth current_state move))

scoreTree :: Game s => Ord (Move s) => Natural -> Tree (Trimove s) -> (Move s, Int)
scoreTree depth node =
  case (depth, subForest node) of
    (0, _) ->
      node |> rootLabel |> uncurry3dropFirst scorePosition
    (_, []) ->
      node |> rootLabel |> uncurry3dropFirst scorePosition
    (_, children) ->
      children
      |> scoreForest (pred depth)
      |> map (Control.Arrow.second negate)
      |> maximum

uncurry3dropFirst :: (a -> b -> c -> d) -> (a, b, c) -> (b, d)
uncurry3dropFirst fun (a, b, c) = (b, fun a b c)

scoreForest :: Game s => Ord (Move s) => Natural -> Forest (Trimove s) -> [(Move s, Int)]
scoreForest depth forest =
  forest
  |> fmap (scoreTree depth)

{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
module TicTacToe where

import qualified Control.Monad.State
import Control.Monad.State (State)
import qualified Data.Map
import Data.Map (Map)
import qualified Control.Arrow
import Data.Tree

import Data.Array (Array)
import qualified Data.Array
import qualified Data.Maybe
import qualified Data.Foldable

import Numeric.Natural (Natural)


import ZeroSumGame

data CurrentPlayer = First | Second
  deriving (Eq, Ord, Show)


instance Enum CurrentPlayer where
  fromEnum First = 1
  fromEnum Second = -1
  toEnum 1 = First
  toEnum (-1) = Second
  toEnum _ = error "Improper player"

newtype TicTacToe = TicTacToe (Array (Int, Int) (Maybe CurrentPlayer))
  deriving (Eq, Ord)

instance Game TicTacToe where
  data Move TicTacToe = TicTacToeMove (Int, Int)
    deriving (Eq, Ord, Show, Bounded)

  initial = TicTacToe initialTicTacToeBoard

  possibleMoves _depth = possibleTicTacToeMoves

  applyMove depth (TicTacToe board) (TicTacToeMove (x, y)) =
    TicTacToe newboard
    where
      newboard = board Data.Array.// [((x, y), Just player)]
      player = case depth `mod` 2 of
        0 -> First
        _ -> Second

  isGameOver state = Data.Maybe.isJust (findFilledLines state)

  scorePosition _ _ state =
          state
          |> findFilledLines
          |> fmap fromEnum
          |> Data.Maybe.fromMaybe 0
          |> (* (-10000))



findFilledLines :: TicTacToe -> Maybe CurrentPlayer
findFilledLines (TicTacToe board) =
  (rows ++ columns ++ diagonals)
  |> map winner
  |> Data.Foldable.asum
  where
    rows = vals rows_indexes
    columns = vals columns_indexes
    diagonals = vals diagonals_indexes
    rows_indexes = [[(i, j) | i <- [0..2]]| j <- [0..2]]
    columns_indexes = [[(i, j) | j <- [0..2]]| i <- [0..2]]
    diagonals_indexes = [[(i, i) ]| i <- [0..2]] ++ [[(i, 2 - i) ]| i <- [0..2]]
    vals = map (map (\index -> board Data.Array.! index))

winner :: Eq a => [Maybe a] -> Maybe a
winner [x,y,z] =
  if x == y && x == z then x else Nothing
winner _ = Nothing


initialTicTacToeBoard :: (Array (Int, Int) (Maybe CurrentPlayer))
initialTicTacToeBoard =
  Data.Array.array ((0, 0), (2, 2)) [((i, j), Nothing) | i <- [0..2], j <- [0..2]]

possibleTicTacToeMoves :: TicTacToe -> [Move TicTacToe]
possibleTicTacToeMoves (TicTacToe board) = foldr checkSquareForMove [] (Data.Array.assocs board)
    where
      checkSquareForMove (index, val) acc = case val of
        Nothing -> TicTacToeMove index : acc
        Just _ -> acc

printBoard :: TicTacToe -> String
printBoard (TicTacToe board) =
  unlines [unwords [showTile (board Data.Array.! (y, x)) | x <- [0..2]] |  y <- [0..2]]
  where
    showTile loc =
      case loc of
        Nothing -> " "
        Just Second -> "X"
        Just First -> "O"

(TypeFamilies 用于允许每个Game实现都有自己的 a 概念,Move然后需要 FlexibleContexts 来强制Move s实现Ord.

4

2 回答 2

2

问题重构

如果我正确理解了这个问题,那么您有一个函数可以返回游戏中可能的下一步动作,以及一个可以采取该动作的函数:

start :: Position
moves :: Position -> [Move]
act :: Position -> Move -> Position

以及您希望如何构建无限状态树(Depth为简单起见,请允许我忽略该字段。如果您将深度计数器视为Position类型的一部分,您会发现这里没有失去一般性):

states :: Forest (Position, Move)
states = forest start

forest :: Position -> Forest (Position, Move)
forest p = [ Node (m, p') (states p') | m <- moves p, let p' = act p m ]

但是您希望以forest共享相同子树的方式实现这一目标。

走向记忆

这里的一般技术是我们想要记忆forest:这样,对于相同的Positions,我们得到共享子树。所以配方是:

forest :: Position -> Forest (Position, Move)
forest = memo forest'

forest' :: Position -> Forest (Position, Move)
forest' p = [ Node (m, p') (states p') | m <- moves p, let p' = act p m ]

我们需要一个合适的备忘录功能:

memo :: (Position -> a) -> (Position -> a)

在这一点上,我们需要了解更多Position才能知道如何使用“惰性列表”技巧来实现它……但是您会看到您不需要记忆涉及玫瑰树的函数。

于 2019-07-22T20:59:45.523 回答
0

我会尝试通过基于一些“规范”移动序列来规范棋盘位置来达到该位置。然后为每个孩子分配一个值,即在树中遍历其各自的标准化序列。(没有代码,因为我在手机上,这是一项艰巨的任务。)

其效果如何取决于在您正在玩的游戏中计算标准化移动序列的难易程度。但这是一种通过打结来引入共享的方法,利用对博弈树根的共享引用。也许它会成为适合您特定情况的其他想法的灵感。

于 2019-07-22T21:04:05.557 回答