
I have a computation inside ST that allocates memory via Data.Vector.Unboxed.Mutable. The vector is never read or written, nor is any reference to it retained outside of runST (to the best of my knowledge). The problem I have is that when I run my ST computation multiple times, I sometimes seem to keep the memory for the vector around.
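The shape of the computation is roughly the following (a minimal hypothetical sketch, not the real program; `compute` and the Word8 element type are placeholders):

    import Control.Monad.ST (ST, runST)
    import qualified Data.Vector.Unboxed.Mutable as VUM
    import Data.Word (Word8)

    -- Allocate a ~128MB unboxed vector inside ST, never touch it, and
    -- return only a small Int. The vector should be dead as soon as
    -- runST returns.
    compute :: Int -> Int
    compute x = runST $ do
        _vec <- VUM.new (128 * 1024 * 1024) :: ST s (VUM.MVector s Word8)
        pure (x + 1)

    main :: IO ()
    main = mapM_ (print . compute) [1 .. 20]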

Allocation statistics:

5,435,386,768 bytes allocated in the heap
    5,313,968 bytes copied during GC
  134,364,780 bytes maximum residency (14 sample(s))
    3,160,340 bytes maximum slop
          518 MB total memory in use (0 MB lost due to fragmentation)

Here I call runST 20 times with different values for my computation and a 128MB vector (again: unused, not returned or referenced outside of ST). The maximum residency looks good, basically just my vector plus a few MB of other stuff. But the total memory in use indicates that I have four copies of the vector active at the same time. This scales perfectly with the size of the vector: for a 256MB vector we get 1030MB, as expected (4 × 256MB plus overhead).

Using a 1GB vector runs out of memory (4 × 1GB plus overhead exceeds the 32-bit address space). I don't understand why the RTS keeps this seemingly unused, unreferenced memory around instead of just GC'ing it, at least at the point where an allocation would otherwise fail.

Running with +RTS -S reveals the following:

    Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
    bytes     bytes     bytes  user  elap    user    elap
134940616     13056 134353540  0.00  0.00    0.09    0.19    0    0  (Gen:  1)
   583416      6756 134347504  0.00  0.00    0.09    0.19    0    0  (Gen:  0)
   518020     17396 134349640  0.00  0.00    0.09    0.19    0    0  (Gen:  1)
   521104     13032 134359988  0.00  0.00    0.09    0.19    0    0  (Gen:  0)
   520972      1344 134360752  0.00  0.00    0.09    0.19    0    0  (Gen:  0)
   521100       828 134360684  0.00  0.00    0.10    0.19    0    0  (Gen:  0)
   520812       592 134360528  0.00  0.00    0.10    0.19    0    0  (Gen:  0)
   520936      1344 134361324  0.00  0.00    0.10    0.19    0    0  (Gen:  0)
   520788      1480 134361476  0.00  0.00    0.10    0.20    0    0  (Gen:  0)
134438548      5964 268673908  0.00  0.00    0.19    0.38    0    0  (Gen:  0)
   586300      3084 268667168  0.00  0.00    0.19    0.38    0    0  (Gen:  0)
   517840       952 268666340  0.00  0.00    0.19    0.38    0    0  (Gen:  0)
   520920       544 268666164  0.00  0.00    0.19    0.38    0    0  (Gen:  0)
   520780       428 268666048  0.00  0.00    0.19    0.38    0    0  (Gen:  0)
   520820      2908 268668524  0.00  0.00    0.19    0.38    0    0  (Gen:  0)
   520732      1788 268668636  0.00  0.00    0.19    0.39    0    0  (Gen:  0)
   521076       564 268668492  0.00  0.00    0.19    0.39    0    0  (Gen:  0)
   520532       712 268668640  0.00  0.00    0.19    0.39    0    0  (Gen:  0)
   520764       956 268668884  0.00  0.00    0.19    0.39    0    0  (Gen:  0)
   520816       420 268668348  0.00  0.00    0.20    0.39    0    0  (Gen:  0)
   520948      1332 268669260  0.00  0.00    0.20    0.39    0    0  (Gen:  0)
   520784       616 268668544  0.00  0.00    0.20    0.39    0    0  (Gen:  0)
   521416       836 268668764  0.00  0.00    0.20    0.39    0    0  (Gen:  0)
   520488      1240 268669168  0.00  0.00    0.20    0.40    0    0  (Gen:  0)
   520824      1608 268669536  0.00  0.00    0.20    0.40    0    0  (Gen:  0)
   520688      1276 268669204  0.00  0.00    0.20    0.40    0    0  (Gen:  0)
   520252      1332 268669260  0.00  0.00    0.20    0.40    0    0  (Gen:  0)
   520672      1000 268668928  0.00  0.00    0.20    0.40    0    0  (Gen:  0)
134553500      5640 402973292  0.00  0.00    0.29    0.58    0    0  (Gen:  0)
   586776      2644 402966160  0.00  0.00    0.29    0.58    0    0  (Gen:  0)
   518064     26784 134342772  0.00  0.00    0.29    0.58    0    0  (Gen:  1)
   520828      3120 134343528  0.00  0.00    0.29    0.59    0    0  (Gen:  0)
   521108       756 134342668  0.00  0.00    0.30    0.59    0    0  (Gen:  0)

Note how the live bytes climb from ~134MB to ~268MB and then to ~403MB (one, two, then three copies of the vector counted as live at once), and only fall back to ~134MB after the Gen 1 collection.

The +RTS -hy profile basically just says we allocate 128MB:

http://imageshack.us/a/img69/7765/45q8.png

I tried reproducing this behavior in a simpler program, but even after replicating the exact setup (ST, a Reader containing the vector, the same monad/program structure, etc.) the simple test program doesn't show it; a sketch of the kind of setup I tried follows below. Simplifying my big program instead, the behavior also eventually stops when I remove apparently completely unrelated code.
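Roughly this kind of structure (a hypothetical sketch; `go` and `runOnce` are made-up names, using mtl's ReaderT over ST):

    import Control.Monad.Reader (ReaderT, ask, runReaderT)
    import Control.Monad.ST (ST, runST)
    import qualified Data.Vector.Unboxed.Mutable as VUM
    import Data.Word (Word8)

    -- The ReaderT environment carries the big vector, mirroring the
    -- structure of the real program; the vector itself is never touched.
    go :: Int -> ReaderT (VUM.MVector s Word8) (ST s) Int
    go x = do
        _vec <- ask        -- in scope, but unused
        pure (x + 1)

    runOnce :: Int -> Int
    runOnce x = runST $ do
        vec <- VUM.new (128 * 1024 * 1024)
        runReaderT (go x) vec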

Questions:

  • Am I really keeping this vector around 4 times out of the 20 runs?
  • If yes, how do I actually tell, given that +RTS -hy and the maximum residency claim I'm not, and what can I do to stop this behavior?
  • If no, why is Haskell not GC'ing it instead of running out of address space / memory, and what can I do to stop this behavior?

Thanks!


1 Answer


I suspect this is a bug in GHC and/or the RTS.

First of all, I'm confident there is no actual space leak or anything of the kind.

Reasons:

  • The vector is never used anywhere. It is not read, not written, not referenced. It should be collected once runST finishes. Even when the ST computation returns a single Int that is printed immediately to force its evaluation, the memory problem persists. There is no reference to the data.
  • Every profiling mode the RTS offers agrees emphatically that I never actually allocate or reference more than one vector's worth of memory. Every statistic and every pretty graph says so.

Now, here's the interesting bit. If I manually force a GC by calling System.Mem.performGC after every run of my function, the problem disappears completely.
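A sketch of the workaround (reusing the hypothetical `compute` from the question):

    import Control.Monad (forM_)
    import System.Mem (performGC)

    -- Driver loop with the workaround: force a major collection after
    -- every run, so the dead vector is reclaimed before the next 128MB
    -- allocation. 'compute' is the runST computation sketched in the
    -- question above.
    main :: IO ()
    main = forM_ [1 .. 20] $ \x -> do
        print (compute x)
        performGC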

So we have a case where the runtime sits on gigabytes of memory that (demonstrably!) can be reclaimed by GC, and that even by its own statistics nobody holds on to any more. When its memory pools run out, the runtime does not collect; it asks the OS for more memory instead. And even when that finally fails, the runtime still does not collect (which would obviously reclaim those gigabytes) but chooses to terminate the program with an out-of-memory error.

I'm no expert on Haskell, GHC, or garbage collection, but this does look badly broken to me. I'm going to report it as a bug.

answered 2013-08-17T15:18:25