performance - Haskell foldl'（++）性能不佳

Question

我有这个代码：

import Data.List

newList_bad  lst = foldl' (\acc x -> acc ++ [x*2]) [] lst
newList_good lst = foldl' (\acc x -> x*2 : acc) [] lst

这些函数返回每个元素乘以 2 的列表：

*Main> newList_bad [1..10]
[2,4,6,8,10,12,14,16,18,20]
*Main> newList_good [1..10]
[20,18,16,14,12,10,8,6,4,2]

在 ghci 中：

*Main> sum $ newList_bad [1..15000]
225015000
(5.24 secs, 4767099960 bytes)
*Main> sum $ newList_good [1..15000]
225015000
(0.03 secs, 3190716 bytes)

为什么newList_bad函数的工作速度比慢 200 倍newList_good？我知道这不是该任务的好解决方案。但是为什么这个无辜的代码工作得这么慢呢？

这是什么“4767099960字节”？对于那个简单的操作，Haskell 使用了 4 GiB？

编译后：

C:\1>ghc -O --make test.hs
C:\1>test.exe
225015000
Time for sum (newList_bad [1..15000]) is 4.445889s
225015000
Time for sum (newList_good [1..15000]) is 0.0025005s

score 19 · Accepted Answer

关于这个问题有很多困惑。给出的通常原因是“在列表末尾重复附加需要重复遍历列表，因此是O(n^2)”。但在严格的评估下，它只会如此简单。在惰性评估下，一切都应该被延迟，所以它引出了一个问题，是否真的存在这些重复的遍历和附加。最后的添加是由前面的消费触发的，由于我们在前面消费，所以列表越来越短，所以这些动作的确切时间是不清楚的。所以真正的答案更微妙，并处理惰性求值下的具体归约步骤。

直接的罪魁祸首是foldl'仅将其累加器参数强制为弱头范式 - 即直到暴露非严格构造函数。这里涉及的功能是

(a:b)++c = a:(b++c)    -- does nothing with 'b', only pulls 'a' up
[]++c = c              -- so '++' only forces 1st elt from its left arg

foldl' f z [] = z
foldl' f z (x:xs) = let w=f z x in w `seq` foldl' f w xs

sum xs = sum_ xs 0     -- forces elts fom its arg one by one
sum_ [] a = a
sum_ (x:xs) a = sum_ xs (a+x)

所以实际的归约序列是 (with g = foldl' f)

sum $ foldl' (\acc x-> acc++[x^2]) []          [a,b,c,d,e]
sum $ g  []                                    [a,b,c,d,e]
      g  [a^2]                                   [b,c,d,e]
      g  (a^2:([]++[b^2]))                         [c,d,e]
      g  (a^2:(([]++[b^2])++[c^2]))                  [d,e]
      g  (a^2:((([]++[b^2])++[c^2])++[d^2]))           [e]
      g  (a^2:(((([]++[b^2])++[c^2])++[d^2])++[e^2]))   []
sum $ (a^2:(((([]++[b^2])++[c^2])++[d^2])++[e^2]))

O(n)请注意，到目前为止，我们只执行了步骤。a^2可在此处立即用于sum的消费，但b^2不是。我们留下了++表达式的左嵌套结构。其余的最好由 Daniel Fischer 在这个答案中解释。它的要点是，要b^2退出，O(n-1)必须执行步骤 - 在此访问之后留下的结构仍将是左嵌套的，因此下一次访问将采取O(n-2)步骤，依此类推 - 经典O(n^2)行为。所以真正的原因是它++ 没有强迫或重新安排它的论点以达到有效的程度。

这实际上是违反直觉的。我们可以期待惰性评估在这里神奇地为我们“做”。毕竟我们只是表达了将来[x^2]添加到列表末尾的意图，我们实际上并没有立即这样做。所以这里的时间是关闭的，但它可以正确 - 当我们访问列表时，如果时间正确，新元素将被添加到其中并立即使用：如果将在（空间方面）之后添加到列表中)，也就是说，就在（及时）被消耗之前，遍历/访问将始终为.c^2b^2 b^2O(1)

这是通过所谓的“差异列表”技术实现的：

newlist_dl lst = foldl' (\z x-> (z . (x^2 :)) ) id lst

++[x^2]如果您想一想，它看起来与您的版本完全相同。它表达了相同的意图，并且也留下了左嵌套结构。

正如 Daniel Fischer 在同一个答案中所解释的那样，不同之处在于chain(.)在第一次被强制时，会逐步将自身重新排列成一个右嵌套($)结构¹，O(n)之后每次访问都是O(1)并且附加的时间是最佳的，正如在上面的段落，所以我们留下了整体O(n)行为。

¹这有点神奇，但它确实发生了。:)

score 15 · Accepted Answer

经典列表行为。

记起：

(:)  -- O(1) complexity
(++) -- O(n) complexity

所以你正在创建一个 O(n^2) 算法，而不是一个 O(n) 算法。

对于这种以增量方式附加到列表的常见情况，请尝试使用dlist，或者只是在最后反转。

score 3 · Accepted Answer

用更大的视角补充其他答案：对于惰性列表，foldl'在返回列表的函数中使用通常是一个坏主意。 foldl'当您将列表缩减为严格（非惰性）标量值（例如，对列表求和）时，它通常很有用。但是当你建立一个列表作为结果时，foldr通常会更好，因为懒惰；构造函数是惰性的:，因此在实际需要之前不会计算列表的尾部。

在你的情况下：

newList_foldr lst = foldr (\x acc -> x*2 : acc) [] lst

这实际上与map (*2)：

newList_foldr lst = map (*2) lst
map f lst = foldr (\x acc -> f x : acc) [] lst

评估（使用第一个map-less 定义）：

newList_foldr [1..10] 
  = foldr (\x acc -> x*2 : acc) [] [1..10]
  = foldr (\x acc -> x*2 : acc) [] (1:[2..10])
  = 1*2 : foldr (\x rest -> f x : acc) [] [2..10]

这大约是 Haskell 将评估何时newList [1..10]被强制。如果这个结果的消费者需要它，它只会进一步评估 - 并且只需要满足消费者的需求。例如：

firstElem [] = Nothing
firstElem (x:_) = Just x

firstElem (newList_foldr [1..10])
  -- firstElem only needs to evaluate newList [1..10] enough to determine 
  -- which of its subcases applies—empty list or pair.
  = firstElem (foldr (\x acc -> x*2 : acc) [] [1..10])
  = firstElem (foldr (\x acc -> x*2 : acc) [] (1:[2..10]))
  = firstElem (1*2 : foldr (\x rest -> f x : acc) [] [2..10])
  -- firstElem doesn't need the tail, so it's never computed!
  = Just (1*2)

这也意味着foldr-basednewList也可以与无限列表一起使用：

newList_foldr [1..] = [2,4..]
firstElem (newList_foldr [1..]) = 2

foldl'另一方面，如果使用，则必须始终计算整个列表，这也意味着您不能处理无限列表：

firstElem (newList_good [1..])    -- doesn't terminate

firstElem (newList_good [1..10])
  = firstElem (foldl' (\acc x -> x*2 : acc) [] [1..10])
  = firstElem (foldl' (\acc x -> x*2 : acc) [] (1:[2..10]))
  = firstElem (foldl' (\acc x -> x*2 : acc) [2] [2..10])
  -- we can't short circuit here because the [2] is "inside" the foldl', so 
  -- firstElem can't see it
  = firstElem (foldl' (\acc x -> x*2 : acc) [2] (2:[3..10]))
  = firstElem (foldl' (\acc x -> x*2 : acc) [4,2] [3..10])
    ...
  = firstElem (foldl' (\acc x -> x*2 : acc) [18,16,14,12,10,8,6,4,2] (10:[]))
  = firstElem (foldl' (\acc x -> x*2 : acc) [20,18,16,14,12,10,8,6,4,2] [])
  = firstElem [20,18,16,14,12,10,8,6,4,2]
  = firstElem (20:[18,16,14,12,10,8,6,4,2])
  = Just 20

foldr基于 - 的算法需要 4 个步骤来计算firstElem_foldr (newList [1..10])，而基于-的算法需要foldl'大约 21 个步骤。更糟糕的是，这 4 个步骤是固定成本，而 21 与输入列表的长度成正比——<code>firstElem (newList_good [1..150000]) 需要 300,001 步，而firstElem (newList_foldr [1..150000]需要 5 步，firstElem (newList_foldr [1..]就像那件事。

另请注意，firstElem (newList_foldr [1.10])在恒定空间和恒定时间中运行（它必须；您需要比恒定时间更多的时间来分配比恒定空间更多的空间）。严格foldl语言的突出性——“foldl是尾递归的并且在恒定空间中运行，foldr不是尾递归并且在线性空间中运行或更糟”——在 Haskell 中是不正确的。

performance - Haskell foldl'（++）性能不佳

3 回答 3

Related

Reference