Here is the full repository. It is a very simple test that uses the postgresql-simple database bindings to insert 50,000 random Things into a database. It uses MonadRandom and can generate the Things lazily.
Here is case 1, the specific snippet that uses the Thing generator:
insertThings c = do
  ts <- genThings
  withTransaction c $ do
    executeMany c "insert into things (a, b, c) values (?, ?, ?)" $
      map (\(Thing ta tb tc) -> (ta, tb, tc)) $ take 50000 ts
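If executeMany has to materialize every row before it can render the query (which would match the ~92 MB maximum residency reported below), a common mitigation is to insert in fixed-size batches, so that only one chunk of the lazy list is live at a time. A minimal sketch, not taken from the repository: it assumes the Thing fields are ToField instances and uses chunksOf from the split package.

```haskell
import Data.List.Split (chunksOf)  -- from the "split" package
import Database.PostgreSQL.Simple

-- Insert in batches of 1000 rows; each executeMany call only needs to
-- force one chunk of the lazy list, so earlier chunks can be collected
-- while later ones have not yet been generated.
insertThingsChunked :: Connection -> [Thing] -> IO ()
insertThingsChunked c ts =
  withTransaction c $
    mapM_ insertChunk (chunksOf 1000 (take 50000 ts))
  where
    insertChunk chunk =
      executeMany c "insert into things (a, b, c) values (?, ?, ?)"
                  (map (\(Thing ta tb tc) -> (ta, tb, tc)) chunk)
```

Whether this actually bounds residency depends on whether the chunked spine of the list is the retainer, which heap profiling would have to confirm.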
Here is case 2, which simply dumps the Things to stdout:
main = do
  ts <- genThings
  mapM print $ take 50000 ts
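For contrast, here is a tiny self-contained illustration (not from the repository) of the difference between streaming a lazy list and retaining it: GHC can collect each cons cell as soon as its last consumer is done with it, but a second traversal of the same list keeps the whole spine live between the two passes, which is the shape of behaviour the statistics below suggest for case 1.

```haskell
main :: IO ()
main = do
  -- Streaming: each element becomes garbage right after it is printed,
  -- so residency stays constant (this is what case 2 does).
  mapM_ print (take 5 [1 :: Int ..])

  -- Retention: xs is demanded twice, so the whole list stays live
  -- between the two traversals (analogous to building all 50,000 rows
  -- before rendering them into one query).
  let xs = take 5 [100 :: Int ..]
  print (length xs)
  mapM_ print xs
```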
In the first case, my GC time is terrible:
cabal-dev/bin/posttest +RTS -s       
   1,750,661,104 bytes allocated in the heap
     619,896,664 bytes copied during GC
      92,560,976 bytes maximum residency (10 sample(s))
         990,512 bytes maximum slop
             239 MB total memory in use (0 MB lost due to fragmentation)
                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      3323 colls,     0 par   11.01s   11.46s     0.0034s    0.0076s
  Gen  1        10 colls,     0 par    0.74s    0.77s     0.0769s    0.2920s
  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.97s  (  3.86s elapsed)
  GC      time   11.75s  ( 12.23s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   14.72s  ( 16.09s elapsed)
  %GC     time      79.8%  (76.0% elapsed)
  Alloc rate    588,550,530 bytes per MUT second
  Productivity  20.2% of total user, 18.5% of total elapsed
In the second case, the times are fine:
cabal-dev/bin/dumptest +RTS -s > out
   1,492,068,768 bytes allocated in the heap
       7,941,456 bytes copied during GC
       2,054,008 bytes maximum residency (3 sample(s))
          70,656 bytes maximum slop
               6 MB total memory in use (0 MB lost due to fragmentation)
                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      2888 colls,     0 par    0.13s    0.16s     0.0001s    0.0089s
  Gen  1         3 colls,     0 par    0.01s    0.01s     0.0020s    0.0043s
  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    2.00s  (  2.37s elapsed)
  GC      time    0.14s  (  0.16s elapsed)
  RP      time    0.00s  (  0.00s elapsed)
  PROF    time    0.00s  (  0.00s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    2.14s  (  2.53s elapsed)
  %GC     time       6.5%  (6.4% elapsed)
  Alloc rate    744,750,084 bytes per MUT second
  Productivity  93.5% of total user, 79.0% of total elapsed
I have tried heap profiling, but I can't make sense of the results. It looks as though all 50,000 Things are built in memory first, then converted into ByteStrings by the query, and those strings are sent to the database. But why does this happen, and how do I track down the guilty code?
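As a starting point for finding the retainer, the usual GHC 7.4-era workflow is to build with profiling enabled and break the live heap down by producer, type, and retainer. These are standard GHC profiling commands, not specific to this repository; the binary name is taken from the output above, and it is assumed cabal-dev forwards --ghc-options the way cabal does.

```shell
# Build with profiling and automatic cost centres (GHC 7.4 flag names);
# the libraries need library profiling enabled as well.
cabal-dev install --enable-library-profiling \
  --ghc-options="-prof -auto-all -caf-all"

# Heap profile by cost centre (who allocated the live data):
cabal-dev/bin/posttest +RTS -hc -p -RTS
hp2ps -c posttest.hp    # renders posttest.ps

# Then repeat with -hy (by closure type) and -hr (by retainer set)
# to see *what* is live and *who* is keeping it alive.
```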
The GHC version is 7.4.2.
All the libraries and the package itself were compiled with -O2 (built by cabal-dev in a sandbox).