
I'm wondering if anyone can suggest how to track down a memory leak/problem in a background process on Heroku.

I have one dyno running a delayed_job queue, which handles a variety of different jobs. From time to time, memory consumption suddenly jumps. Subsequent jobs then exceed the memory limit and fail, and all hell breaks loose.

The strange thing is that I can't see the memory jump correlating with any particular job. This is the kind of log output I see:

Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=load_avg_1m val=0.00 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=load_avg_5m val=0.01 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=load_avg_15m val=0.01 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_total val=133.12 units=MB 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_rss val=132.23 units=MB 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_cache val=0.88 units=MB 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_swap val=0.01 units=MB 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_pgpgin val=0 units=pages 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_pgpgout val=45325 units=pages 
Aug 15 07:13:25 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=diskmbytes val=0 units=MB 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=load_avg_1m val=0.15 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=load_avg_5m val=0.07 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=load_avg_15m val=0.17 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=memory_total val=110.88 units=MB 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=memory_rss val=108.92 units=MB 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=memory_cache val=1.94 units=MB 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=memory_swap val=0.01 units=MB 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=memory_pgpgin val=2908160 units=pages 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=memory_pgpgout val=42227 units=pages 
Aug 15 07:13:31 vemmleads heroku/web.1:  source=heroku.10054113.web.1.bf5d3fae-2b1b-4e1d-a974-01d9fa4644db measure=diskmbytes val=0 units=MB 
Aug 15 07:13:35 vemmleads app/heroku-postgres:  source=HEROKU_POSTGRESQL_CHARCOAL measure.current_transaction=1008211 measure.db_size=482260088bytes measure.tables=39 measure.active-connections=6 measure.waiting-connections=0 measure.index-cache-hit-rate=0.99996 measure.table-cache-hit-rate=1 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=load_avg_1m val=0.00 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=load_avg_5m val=0.00 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=load_avg_15m val=0.14 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=memory_total val=108.00 units=MB 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=memory_rss val=107.85 units=MB 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=memory_cache val=0.15 units=MB 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=memory_swap val=0.01 units=MB 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=memory_pgpgin val=0 units=pages 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=memory_pgpgout val=33609 units=pages 
Aug 15 07:13:45 vemmleads heroku/run.2472:  source=heroku.10054113.run.2472.e811164e-4413-4dcf-8560-1f998f2c2b4e measure=diskmbytes val=0 units=MB 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=load_avg_1m val=0.30 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=load_avg_5m val=0.07 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=load_avg_15m val=0.04 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_total val=511.80 units=MB 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_rss val=511.78 units=MB 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_cache val=0.00 units=MB 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_swap val=0.02 units=MB 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_pgpgin val=27303936 units=pages 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=memory_pgpgout val=154826 units=pages 
Aug 15 07:13:46 vemmleads heroku/worker.1:  source=heroku.10054113.worker.1.4589e3f4-8208-483a-a927-67c4c1cbee46 measure=diskmbytes val=0 units=MB 

worker.1's memory usage (memory_total) jumps from 133.12 MB to 511.80 MB for no apparent reason. No jobs appear to have been processed during that window, so it's hard to understand where that huge allocation came from. Later in the log, worker.1 hits the memory limit and fails.

I'm running New Relic Pro. It hasn't helped at all; in fact, it doesn't even raise alerts for the repeated memory errors. The Heroku logs above don't give me any more information either.

Any ideas or pointers on what to investigate next would be much appreciated.

Thanks

Simon


1 Answer


There isn't enough information here to determine exactly what's going on.

The most common cause of memory problems in a Rails application (especially in asynchronous background jobs) is failing to iterate over large database collections incrementally, e.g. loading every user record with a statement like User.all.

For example, if you have a background job that walks every record in the User table, you should use User.find_each or User.find_in_batches to process the records in chunks (ActiveRecord's default batch size is 1000).

This bounds the working set of objects loaded into memory while still processing every record.
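A minimal sketch of the idea (the ActiveRecord calls in the comments assume a `User` model, which is hypothetical here; the runnable part below uses plain Ruby's each_slice to show the same batching pattern):

```ruby
# With ActiveRecord, instead of loading every row at once:
#   User.all.each { |u| process(u) }        # builds all objects in memory
# iterate in fixed-size batches:
#   User.find_each(batch_size: 1000) { |u| process(u) }
#
# The pure-Ruby equivalent: each_slice yields bounded chunks,
# so only one batch's worth of objects is live at a time.
records = (1..5000).to_a   # stand-in for User ids
processed = 0
records.each_slice(1000) do |batch|
  batch.each { processed += 1 }  # "process" each record in the batch
end
puts processed  # => 5000: every record handled, 1000 at a time
```

find_each builds on find_in_batches, which pages through the table by primary key, so peak memory stays roughly constant regardless of table size.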

You should look for unbounded database queries that could be loading large numbers of objects.

Answered 2013-08-15T15:29:22.657