3

在下面的程序中,我只希望cycle3以恒定的内存运行(它明确地打结)。然而,cycle2由于我无法理解的原因,它也在不断的记忆中运行。我希望做与因为cycle2完全相同的工作cycle1

xs' = xs ++ xs'
xs' = xs ++ xs ++ xs' -- substitute value of xs'
xs' = xs ++ xs ++ xs ++ xs' -- substitute value of xs' again
xs' = xs ++ xs ++ xs ++ ... -- and so on

有人可以解释我在这里缺少什么吗?


module Main where

import System.Environment (getArgs)

cycle1 :: [a] -> [a]
cycle1 [] = error "empty list"
cycle1 xs = xs ++ cycle1 xs

cycle2 :: [a] -> [a]
cycle2 [] = error "empty list"
cycle2 xs = xs' where xs' = xs ++ xs'

cycle3 :: [a] -> [a]
cycle3 [] = error "empty list"
cycle3 xs = let
  xs' = go xs' xs
  in xs'
  where
    go :: [a] -> [a] -> [a]
    go start [last] = last : start
    go start (x:xs) = x : go start xs

testMem :: (Show a) => ([a] -> [a]) -> [a] -> IO ()
testMem f xs = print (xs', xs') -- prints only first val, need second val holds onto reference
  where
    xs' = f xs

main :: IO ()
main = do
  args <- getArgs
  let mCycleFunc = case args of
        ["1"] -> Just cycle1
        ["2"] -> Just cycle2
        ["3"] -> Just cycle3
        _ -> Nothing
  case mCycleFunc of
    Just cycleFunc -> testMem cycleFunc [0..8]
    Nothing -> putStrLn "Valid args are one of {1, 2, 3}."
4

2 回答 2

6

cycle1每次消耗一个周期时都会创建一个新列表。原因应该很明显。

cycle2,但是,不这样做。它创建一个变量xs',在它自己的定义中使用。每次你消费时cycle1都必须重新评估函数,但它没有任何递归函数。它只是引用相同的变量,该变量已经具有已知值。cycle1xscycle2

于 2012-07-09T21:58:13.583 回答
3

它归结为共享或不共享相等的thunk。两个相等的 thunk 是保证产生相同结果的 thunk。在 的情况下cycle1,您cycle1 xs每次[]xs. 需要为该 thunk 分配新内存,并且需要从头开始计算其值,这会在您遍历它时分配新的列表对。

我认为如果您重命名为(并且我删除了“错误”案例) ,cycle2避免这种情况的方法会变得更容易理解:xs'result[]

cycle2 :: [a] -> [a]
cycle2 xs = result 
    where result = xs ++ result

这个定义在语义上等同于cycle1(对相同的参数产生相同的结果),但理解内存使用的关键是根据创建的 thunk 来看待它。当您执行此函数的编译代码时,它会立即为result. 您可以将 thunk 视为或多或少像这样的可变类型(完全虚构的伪代码):

type Thunk a = union { NotDone (ThunkData a), Done a }
type ThunkData a = struct { force :: t0 -> ... -> tn -> a
                          , subthunk0 :: t0
                          , ...
                          , subthunkn :: tn }

这要么是一条记录,其中包含指向所需值的 thunk 的指针以及指向强制这些 thunk 的代码的指针,或者只是计算的结果。在 的情况下cycle2,thunk forresult指向目标代码 for和(++)thunk 。最后一位意味着 thunk for有一个指向自身的指针,这解释了常量空间行为;强制的最后一步是让它指向自身。xsresultresultresult

cycle1另一方面,在 thunk的情况下,thunk 具有 的代码、用于(++)的 thunkxs和要从头开始计算的新thunk。cycle1 xs原则上,编译器可能会认识到对后一个 thunk 的引用可以替换为对“父”块的引用,但编译器不会这样做;而在其中,cycle2它不得不这样做(一个变量的一个实例化绑定=一个块)。

请注意,这种自引用 thunk 行为可以分解为以下的合适实现fix

-- | Find the least fixed point of @f@.  This implementation should produce
-- self-referential thunks, and thus run in constant space.
fix :: (a -> a) -> a
fix f = result
    where result = f result

cycle4 :: [a] -> [a]
cycle4 xs = fix (xs++)
于 2012-07-09T23:57:08.680 回答