haskell - 在固定时间内计算尽可能多的列表

Question

我想编写一个函数，它需要一个时间限制（以秒为单位）和一个列表，并在时间限制内计算尽可能多的列表元素。

我的第一次尝试是首先编写以下函数，该函数对纯计算进行计时并返回经过的时间以及结果：

import Control.DeepSeq
import System.CPUTime

type Time = Double

timed :: (NFData a) => a -> IO (a, Time)
timed x = do t1 <- getCPUTime
             r  <- return $!! x
             t2 <- getCPUTime
             let diff = fromIntegral (t2 - t1) / 10^12
             return (r, diff)

然后我可以定义我想要的功能：

timeLimited :: (NFData a) => Time -> [a] -> IO [a]
timeLimited remaining []     = return []
timeLimited remaining (x:xs) = if remaining < 0
    then return []
    else do
        (y,t) <- timed x
        ys    <- timeLimited (remaining - t) xs
        return (y:ys)

但这并不完全正确。即使忽略计时错误和浮点错误，这种方法一旦开始就不会停止列表元素的计算，这意味着它可能（实际上通常会）超出其时间限制。

相反，如果我有一个函数，如果它花费了太长时间，它可能会使评估短路：

timeOut :: Time -> a -> IO (Maybe (a,t))
timeOut = undefined

然后我可以编写我真正想要的函数：

timeLimited' :: Time -> [a] -> IO [a]
timeLimited' remaining []     = return []
timeLimited' remaining (x:xs) = do
    result <- timeOut remaining x
    case result of
        Nothing    -> return []
        Just (y,t) -> do
            ys <- timeLimited' (remaining - t) xs
            return (y:ys)

我的问题是：

我该怎么写timeOut？
是否有更好的方法来编写函数timeLimited，例如，不会因多次累加时间差而遭受累积浮点误差的影响？

score 13 · Accepted Answer

这是一个我可以使用上面的一些建议来制作的例子。我没有进行大量测试以确保在计时器用完时准确地切断工作，但根据文档timeout，这应该适用于所有不使用 FFI 的东西。

import Control.Concurrent.STM
import Control.DeepSeq
import System.Timeout

type Time = Int

-- | Compute as many items of a list in given timeframe (microseconds)
--   This is done by running a function that computes (with `force`)
--   list items and pushed them onto a `TVar [a]`.  When the requested time
--   expires, ghc will terminate the execution of `forceIntoTVar`, and we'll
--   return what has been pushed onto the tvar.
timeLimited :: (NFData a) => Time -> [a] -> IO [a]
timeLimited t xs = do
    v <- newTVarIO []
    _ <- timeout t (forceIntoTVar xs v)
    readTVarIO v 

-- | Force computed values into given tvar
forceIntoTVar :: (NFData a) => [a] -> TVar [a] -> IO [()]
forceIntoTVar xs v = mapM (atomically . modifyTVar v . forceCons) xs

-- | Returns function that does actual computation and cons' to tvar value
forceCons :: (NFData a) => a -> [a] -> [a]
forceCons x = (force x:)

现在让我们尝试一些昂贵的东西：

main = do
    xs <- timeLimited 100000 expensiveThing   -- run for 100 milliseconds
    print $ length $ xs  -- how many did we get?

-- | Some high-cost computation
expensiveThing :: [Integer]
expensiveThing = sieve [2..]
  where
      sieve (p:xs) = p : sieve [x|x <- xs, x `mod` p > 0]

用编译和运行time，它似乎可以工作（显然在定时部分之外有一些开销，但我大约是 100 毫秒：

$ time ./timeLimited
1234
./timeLimited  0.10s user 0.01s system 97% cpu 0.112 total

另外，关于这种方法的一些注意事项；因为我在一次调用中将运行计算并将它们推送到 tvar 的整个操作封闭起来，所以timeout在创建返回结构时可能会丢失一些时间，尽管我假设（如果你的计算成本很高）它不会t 占或大部分时间。

更新

现在我已经有时间考虑了，由于 Haskell 的懒惰，我不是 100% 肯定上面的注释（关于创建回报结构所花费的时间）是正确的；无论哪种方式，如果这对于您要完成的工作来说不够精确，请告诉我。

score 4 · Accepted Answer

您可以timeOut使用您提供的类型来实现timeoutand evaluate。它看起来像这样（我省略了计算剩余时间的部分——使用getCurrentTime或类似的部分）：

timeoutPure :: Int -> a -> IO (Maybe a)
timeoutPure t a = timeout t (evaluate a)

如果你想要更多的强制，而不仅仅是弱头范式，你可以用一个已经序列化的参数来调用它，例如，timeoutPure (deepseq v)而不是timeoutPure v.

score 2 · Accepted Answer

当达到超时时，我将与 TVar 一起使用两个线程并在计算线程中引发异常（导致每个正在进行的事务回滚）：

forceIntoTVar :: (NFData a) => [a] -> TVar [a] -> IO [()]
forceIntoTVar xs v = mapM (atomically . modifyTVar v . forceCons) xs

-- | Returns function that does actual computation and cons' to tvar value
forceCons :: (NFData a) => a -> [a] -> [a]
forceCons x = (force x:)

main = do

  v <- newTVarIO []
  tID <- forkIO $ forceIntoTVar args v
  threadDelay 200
  killThread tID
  readTVarIO v

在此示例中，您（可能）需要稍微调整forceIntoTVar以便例如列表节点不在原子事务内部计算，而是首先计算，然后启动原子事务以将它们添加到列表中。

在任何情况下，当引发异常时，正在进行的事务被回滚或正在进行的计算在结果被提供给列表之前停止，这就是你想要的。

您需要考虑的是，当准备节点的单个计算以高频率运行时，与不使用 STM 相比，此示例可能非常昂贵。

haskell - 在固定时间内计算尽可能多的列表

3 回答 3

Related

Reference