6

假设我们有一个 IO 动作,例如

lookupStuff :: InputType -> IO OutputType

这可能是一些简单的事情,例如 DNS 查找,或者针对时不变数据的一些 Web 服务调用。

让我们假设:

  1. 该操作从不抛出任何异常和/或从不发散

  2. 如果不是IOmonad,则该函数将是纯函数,即相同输入参数的结果始终相同

  3. 该操作是可重入的,即可以安全地同时从多个线程调用它。

  4. lookupStuff操作非常(时间)昂贵。

我面临的问题是如何正确(并且不使用任何unsafe*IO*作弊)实现可重入缓存,该缓存可以从多个线程调用,并将针对相同输入参数的多个查询合并到一个请求中。

我想我在追求与 GHC 的纯计算黑洞概念类似的东西,但在 IO“计算”上下文中。

对于所述问题,惯用的 Haskell/GHC 解决方案是什么?

4

2 回答 2

4

Yeah, basically reimplement the logic. Although it seems similar to what GHC is already doing, that's GHC's choice. Haskell can be implemented on VMs that work very differently, so in that sense it isn't already done for you.

But yeah, just use an MVar (Map InputType OutputType) or even an IORef (Map InputType OutputType) (make sure to modify with atomicModifyIORef), and just store the cache in there. If this imperative solution seems wrong, it's the "if not for the IO, this function would be pure" constraint. If it were just an arbitrary IO action, then the idea that you would have to keep state in order to know what to execute or not seems perfectly natural. The problem is that Haskell does not have a type for "pure IO" (which, if it depends on a database, it is just behaving pure under certain conditions, which is not the same as being a hereditarily pure).

import qualified Data.Map as Map
import Control.Concurrent.MVar

-- takes an IO function and returns a cached version
cache :: (Ord a) => (a -> IO b) -> IO (a -> IO b)
cache f = do
    r <- newMVar Map.empty
    return $ \x -> do
        cacheMap <- takeMVar r
        case Map.lookup x cacheMap of
            Just y -> do 
                putMVar r cacheMap
                return y
            Nothing -> do
                y <- f x
                putMVar (Map.insert x y cacheMap)
                return y

Yeah it's ugly on the inside. But on the outside, look at that! It's just like the type of a pure memoization function, except for it has IO stained all over it.

于 2011-03-30T15:44:45.243 回答
2

这是一些代码或多或少地实现了我在原始问题中所追求的:

import           Control.Concurrent
import           Control.Exception
import           Data.Either
import           Data.Map           (Map)
import qualified Data.Map           as Map
import           Prelude            hiding (catch)

-- |Memoizing wrapper for 'IO' actions
memoizeIO :: Ord a => (a -> IO b) -> IO (a -> IO b)
memoizeIO action = do
  cache <- newMVar Map.empty
  return $ memolup cache action

  where
    -- Lookup helper
    memolup :: Ord a => MVar (Map a (Async b)) -> (a -> IO b) -> a -> IO b
    memolup cache action' args = wait' =<< modifyMVar cache lup
      where
        lup tab = case Map.lookup args tab of
          Just ares' ->
            return (tab, ares')
          Nothing    -> do
            ares' <- async $ action' args
            return (Map.insert args ares' tab, ares')

上面的代码建立在 Simon Marlow 的Async抽象之上,如教程:Haskell 中的并行和并发编程中所述:

-- |Opaque type representing asynchronous results.
data Async a = Async ThreadId (MVar (Either SomeException a))

-- |Construct 'Async' result. Can be waited on with 'wait'.
async :: IO a -> IO (Async a)
async io = do
  var <- newEmptyMVar
  tid <- forkIO ((do r <- io; putMVar var (Right r))
                 `catch` \e -> putMVar var (Left e))
  return $ Async tid var

-- |Extract value from asynchronous result. May block if result is not
-- available yet. Exceptions are returned as 'Left' values.
wait :: Async a -> IO (Either SomeException a)
wait (Async _ m) = readMVar m

-- |Version of 'wait' that raises exception.
wait' :: Async a -> IO a
wait' a = either throw return =<< wait a

-- |Cancels asynchronous computation if not yet completed (non-blocking).
cancel :: Async a -> IO ()
cancel (Async t _) = throwTo t ThreadKilled
于 2011-06-21T11:01:58.200 回答