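I have a CI setup on GitHub that runs all the tests I currently have (around 200). They are e2e tests using jest and superagent, following the standard NestJS testing pattern, which also means I have one or more test suites (files) per module.
Everything was rosy until recently, when I had to implement some queues with BullMQ (which is also integrated into the framework). Since then, my PRs keep failing the checks, even when I skip the test suites that actually exercise the endpoints using the queues: memory blows up, and another, theoretically unrelated suite times out. This is the final error trace: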
<--- Last few GCs --->
[2400:0x57d01a0] 287364 ms: Mark-sweep 1846.7 (2086.3) -> 1830.1 (2083.3) MB, 1710.5 / 9.9 ms (average mu = 0.173, current mu = 0.177) allocation failure scavenge might not succeed
[2400:0x57d01a0] 289587 ms: Mark-sweep 1846.6 (2083.3) -> 1833.2 (2080.8) MB, 1649.9 / 0.1 ms (average mu = 0.216, current mu = 0.258) task scavenge might not succeed
<--- JS stacktrace --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0xb00d90 node::Abort() [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
2: 0xa1823b node::FatalError(char const*, char const*) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
3: 0xcedbce v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
4: 0xcedf47 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
5: 0xea6105 [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
6: 0xea6be6 [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
7: 0xeb4b1e [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
8: 0xeb5560 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
9: 0xeb84de v8::internal::Heap::AllocateRawWithRetryOrFailSlowPath(int, v8::internal::AllocationType, v8::internal::AllocationOrigin, v8::internal::AllocationAlignment) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
10: 0xe7990a v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationType, v8::internal::AllocationOrigin) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
11: 0x11f2f06 v8::internal::Runtime_AllocateInYoungGeneration(int, unsigned long*, v8::internal::Isolate*) [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
12: 0x15e7819 [/opt/hostedtoolcache/node/16.13.1/x64/bin/node]
Aborted (core dumped)
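Apart from using the BullMQ module/decorators that, as I said, are already integrated into NestJS, very little custom code was added, and the tests pass fine locally.
I would expect that if the problem were related to the queue implementation, then now that I have added a Redis setup to the CI, everything should be fine as long as I skip the tests that build modules using this implementation. But the failure keeps happening.
I tried increasing the global timeout setting, but so far nothing has changed.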
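For context, this is a minimal sketch of the kind of global timeout setting meant here. The file name and the value are assumptions for illustration, not the project's real configuration; `testTimeout` itself is a standard Jest option.

```typescript
// jest.config.ts (hypothetical file name; the value below is illustrative only)
const config = {
  // Default timeout for each test, in milliseconds. Jest's own default is 5000.
  testTimeout: 30_000,
};

export default config;
```

Raising this only gives slow tests more time. It does not change how much memory the run uses, which would be consistent with it making no difference to the out-of-memory abort above.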
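I have spent several days trying to find the root cause, but have come up empty.
At this point I feel completely lost. Does anyone know what might be happening here, or have any clue about how to look for the root cause?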
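One possible way to look for the root cause is to record heap usage at labelled points and see where it grows. The following is only a sketch under assumptions: `sampleHeap` and `heapDeltas` are hypothetical helper names, not part of Jest, NestJS or BullMQ, and only Node's built-in `process.memoryUsage()` is used.

```typescript
// Hypothetical helpers (not from the original project): record V8 heap usage
// at labelled points so that growth between test suites becomes visible.

interface HeapSample {
  label: string;
  heapUsedMb: number;
}

interface HeapDelta {
  from: string;
  to: string;
  deltaMb: number;
}

const samples: HeapSample[] = [];

/** Store the current heap usage (in MB) under the given label. */
function sampleHeap(label: string): HeapSample {
  const heapUsedMb = process.memoryUsage().heapUsed / (1024 * 1024);
  const sample: HeapSample = { label, heapUsedMb };
  samples.push(sample);
  return sample;
}

/** Compute the heap difference between each pair of consecutive samples. */
function heapDeltas(recorded: HeapSample[]): HeapDelta[] {
  const deltas: HeapDelta[] = [];
  for (let i = 1; i < recorded.length; i++) {
    deltas.push({
      from: recorded[i - 1].label,
      to: recorded[i].label,
      deltaMb: recorded[i].heapUsedMb - recorded[i - 1].heapUsedMb,
    });
  }
  return deltas;
}

// Example: sample before and after some work, then print the growth.
sampleHeap("start");
const filler = new Array(100_000).fill("x"); // stand-in for real test work
sampleHeap("after work");
console.log(filler.length, heapDeltas(samples));
```

In a test run, `sampleHeap` could be called from `beforeAll` and `afterAll` in each suite. Jest also has a built-in `--logHeapUsage` flag that prints heap size after every test file, and `--runInBand` runs the suites serially in one process, which may make it easier to see which suite the growth starts in.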
Thanks in advance.