
I'm trying to train a very memory-hungry TensorFlow.js model, but I'm running into some very strange Node.js behavior.

$ node --max-old-space-size=16384 --initial-old-space-size=16384 --max-heap-size=16384 --initial-heap-size=16384 index.js
2020-08-19 11:32:56.190067: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-08-19 11:32:56.213573: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3400000000 Hz
2020-08-19 11:32:56.216786: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x55fce50 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-08-19 11:32:56.217643: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version


<--- Last few GCs --->

[19219:0x44d06a0]   162448 ms: Scavenge 4047.5 (4054.3) -> 4047.5 (4054.3) MB, 26.0 / 0.0 ms  (average mu = 0.989, current mu = 0.989) allocation failure
[19219:0x44d06a0]   163150 ms: Scavenge 4430.0 (4436.8) -> 4430.0 (4436.8) MB, 36.0 / 0.0 ms  (average mu = 0.989, current mu = 0.989) allocation failure
[19219:0x44d06a0]   164237 ms: Scavenge 5003.8 (5010.6) -> 5003.8 (5010.6) MB, 51.5 / 0.0 ms  (average mu = 0.989, current mu = 0.989) allocation failure


<--- JS stacktrace --->

FATAL ERROR: invalid array length Allocation failed - JavaScript heap out of memory
 1: 0x9fd5f0 node::Abort() [node]
 2: 0x94a45d node::FatalError(char const*, char const*) [node]
 3: 0xb7099e v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xb70d17 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xd1a905  [node]
 6: 0xcf3445  [node]
 7: 0xe7b96e  [node]
 8: 0xe7baba  [node]
 9: 0x10185b3 v8::internal::Runtime_GrowArrayElements(int, unsigned long*, v8::internal::Isolate*) [node]
10: 0x13cc8f9  [node]
[1]    19219 abort (core dumped)  node --max-old-space-size=16384 --initial-old-space-size=16384   index.js

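It seems I can't force Node to use more memory! My machine currently has 32 GB installed, of which only 17 GB is in use, so Node should be able to take at least 15 GB if anything. Yet it usually dies after allocating 3-5 GB.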

Is this a bug in Node.js? Perhaps a regression from an earlier version? What are my options, or will I have to recompile my own Node binary? If I do, will that work?

Edit: I'm running 64-bit Node: node -p "os.arch()" prints x64.
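
One way to check whether the heap flags on the command line actually took effect is to print the limit V8 reports at startup, next to the process's current memory usage. This is only a minimal sketch, assuming it is placed at the top of index.js; the helper names heapLimitMB and memorySnapshotMB are hypothetical and not part of the original script.

```javascript
// Sketch: report the heap limit V8 is actually using.
// heapLimitMB and memorySnapshotMB are hypothetical helper names.
const v8 = require('v8');

function heapLimitMB() {
  // heap_size_limit is reported in bytes
  return v8.getHeapStatistics().heap_size_limit / (1024 * 1024);
}

function memorySnapshotMB() {
  // process.memoryUsage() also reports values in bytes
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
  const toMB = (bytes) => Math.round(bytes / (1024 * 1024));
  return {
    rss: toMB(rss),
    heapTotal: toMB(heapTotal),
    heapUsed: toMB(heapUsed),
    external: toMB(external),
  };
}

console.log(`V8 heap limit: ${heapLimitMB().toFixed(0)} MB`);
console.log(memorySnapshotMB());
```

If the printed limit is far below the requested 16384 MB, the flags are not reaching V8; if it is close to that value, the crash is presumably not caused by the configured heap ceiling alone.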
