3

我想知道我在此页面上读到的空间泄漏示例(遗憾的是那里没有解释): https ://en.wikibooks.org/wiki/Haskell/Graph_reduction

棘手的空间泄漏示例:

(\xs -> 头 xs + 最后 xs) [1..n]

(\xs -> 最后一个 xs + 头 xs) [1..n]

第一个版本在 O(1) 空间上运行。O(n) 中的第二个。

我不确定,我是否理解正确(希望你能提供帮助)。据我所知,懒惰的评价是指最左边最外层的评价。因此,在这些示例中,我们将减少这些 redexes,如下所示:

(\xs -> 头 xs + 最后 xs) [1..200]

=> ([1..200] -> 头部 xs + 最后 xs)

=> 头部 [1..200] + 最后 [1..200]

=> 1 + 最后 [1..200]

=> 1 + 最后 [2..200]

=> ... ...

=> 1 + 最后一个 [199..200]

=> 1 + 最后 [200]

=> 1 + 200

=> 201

(\xs -> 最后一个 xs + 头 xs) [1..200]

([1..200] -> 最后 xs + 头 xs)

最后 [1..200] + 头部 [1..200]

最后 [2..200] + 头部 [1..200]

……

最后 [199..200] + 头部 [1..200]

最后 [200] + 头部 [1..200]

200 + 头 [1..200]

200 + 1

201

也许我以错误的方式减少了它(请纠正我,如果我错了),但在这里我看不到可能的空间泄漏。所以,我首先用 ghci 测试了运行时(不是空间):

1>> (\xs -> head xs + last xs) [1..50000000]
50000001
(10.05 secs, 4,003,086,632 bytes)
2>> (\xs -> last xs + head xs) [1..50000000]
50000001
(2.26 secs, 3,927,748,176 bytes)

根据 wikibooks,第二个版本应该存在空间泄漏,但运行时要快得多(这可能是可能的,这里没什么奇怪的)。

我有以下源代码:

module Main where

main = do
--  let a = (\xs -> head xs + last xs) [1..20000000]    -- space leak
let b = (\xs -> last xs + head xs) [1..20000000]    -- no space leak
--  putStrLn ("result for a - (\\xs -> head xs + last xs): " ++ show a)
putStrLn ("result for b - (\\xs -> last xs + head xs): " ++ show b)

...我编译它没有优化,然后我调用了程序:

$ ghc -O0 --make -rtsopts -fforce-recomp Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...

$ ./Main +RTS -sstderr
result for b - (\xs -> last xs + head xs): 20000001
   1,600,057,352 bytes allocated in the heap
         211,000 bytes copied during GC
          44,312 bytes maximum residency (2 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3062 colls,     0 par    0.012s   0.012s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.000s   0.000s     0.0002s    0.0002s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.518s  (  0.519s elapsed)
  GC      time    0.012s  (  0.012s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.534s  (  0.532s elapsed)

  %GC     time       2.3%  (2.3% elapsed)

  Alloc rate    3,088,101,743 bytes per MUT second

  Productivity  97.6% of total user, 98.0% of total elapsed

这是一个很好的结果,我们有 2.3% 的垃圾收集,使用的内存大约是 1 MB。然后我编译了另一个没有优化的案例,得到了以下结果:

module Main where

main = do
  let a = (\xs -> head xs + last xs) [1..20000000]    -- space leak
--  let b = (\xs -> last xs + head xs) [1..20000000]    -- no space leak
  putStrLn ("result for a - (\\xs -> head xs + last xs): " ++ show a)
--  putStrLn ("result for b - (\\xs -> last xs + head xs): " ++ show b)

$ ghc -O0 --make -rtsopts -fforce-recomp Main.hs
[1 of 1] Compiling Main             ( Main.hs, Main.o )
Linking Main ...

$ ./Main +RTS -sstderr
result for a - (\xs -> head xs + last xs): 20000001
   1,600,057,352 bytes allocated in the heap
   2,088,615,552 bytes copied during GC
     540,017,504 bytes maximum residency (13 sample(s))
     135,620,768 bytes maximum slop
            1225 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3051 colls,     0 par    0.911s   0.915s     0.0003s    0.0016s
  Gen  1        13 colls,     0 par    2.357s   2.375s     0.1827s    0.9545s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.434s  (  0.430s elapsed)
  GC      time    3.268s  (  3.290s elapsed)
  EXIT    time    0.094s  (  0.099s elapsed)
  Total   time    3.799s  (  3.820s elapsed)

  %GC     time      86.0%  (86.1% elapsed)

  Alloc rate    3,687,222,801 bytes per MUT second

  Productivity  14.0% of total user, 13.9% of total elapsed

这更糟糕,有很多垃圾收集正在进行,内存使用率要高得多。

这里到底发生了什么?我不明白为什么会有空间泄漏。


PS 如果你用完全优化 (O2) 编译这两种情况,那么这两个程序几乎同样有效。

4

1 回答 1

4

xs是一样的。当您写出[1..200]两次时,您会感到困惑。最好在表达式求值过程中显式命名所有临时实体:

(\xs -> head xs + last xs) [1,2,3]
head xs + last xs        { xs = [1,2,3] }
(+) (head xs) (last xs)
(+) 1         (last xs)  { xs = 1:xs1 ; xs1 = [2,3] }
              (last xs1) { xs1 = 2:xs2 ; xs2 = [3] }
              (last xs2)
              3
4

在这里,我们看到xs(and xs1) 的绑定可以在我们继续进行时安全地忘记,因为没有更多对xs任何地方的引用。

所以你确实正确地减少了它,除了第二个[1..200]与第一个相同,所以我们必须在首先找到它的时候抓住它last(在第二个变体中),因此泄漏。

当然,允许编译器对此进行优化,由于引用透明原则,用 equals 替换 equals ,并执行[1..200] 两次枚举,因此第二个变体也在O(1)空间中运行。

所以最后,它是一个编译器的事情。空间泄漏可能会发生(不应该)。

于 2016-06-09T13:51:58.683 回答