I am using @feathers/mongodb to load 500k entries from a CSV file into Mongo.
In a hook I collect all the rows, manipulate them slightly, and insert the whole array into Mongo:
const csv = require('csvtojson'); // CSV parser used below

let data_to_insert = [];
let element = function () {
  this.member1 = '';
  this.member2 = '';
  this.member3 = '';
  this.member4 = '';
  this.member5 = '';
  this.member6 = '';
};

// Read the file
let content = await csv({
  delimiter: ';',
})
  .fromFile(pathCsv) // 120 MB file, 500k entries
  .subscribe((line) => {
    let t_elem = new element();
    t_elem.member1 = roundMinutes(line.member1); // returns a Date
    t_elem.member2 = line.member2;
    t_elem.member3 = line.member3;
    t_elem.member4 = line.member4;
    t_elem.member5 = new Date(+line.member5 * 1000); // Unix seconds -> JS ms
    t_elem.member6 = new Date(+line.member6 * 1000);
    data_to_insert.push(t_elem);
  });

// Store the whole list in one bulk create
await context.app.service('api/myservice').create(data_to_insert);
// All entries are written to the DB. The heap grows afterwards.
return;
It works well: the data is written to the database in about 10 seconds.
However, I noticed that the heap of the pm2-managed process grows to 8 GB afterwards, and the process then runs out of memory. I am wondering why this happens after the insert has finished. Could it be related to the 500k create events that are fired?
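As far as I understand, when create is called with an array, Feathers emits a separate created event for every element, so this single bulk insert would dispatch 500k events. A minimal sketch of what I mean (the service name is the one from my code above):

const service = context.app.service('api/myservice');
// For a bulk create([...]), Feathers emits 'created' once per element,
// so this listener would fire 500k times for the insert above.
service.on('created', (entry) => {
  console.log('created', entry.member2);
});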
The strings in particular are huge. If I inspect them, I can see a few "event strings" that take up most of the memory.
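(For reference, one way to capture such a heap dump for inspection, assuming a reasonably recent Node version, is the built-in v8 module; this is just a way to reproduce the inspection, not necessarily my exact setup:)

const v8 = require('v8');
// Writes Heap.<timestamp>.heapsnapshot to the working directory;
// open it in Chrome DevTools -> Memory to see what retains the strings.
const snapshotPath = v8.writeHeapSnapshot();
console.log(`heap snapshot written to ${snapshotPath}`);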
If I try to run the hook again, it just hangs. I (or PM2) have to restart the process to get it up and running again:
PM2 | [PM2][WORKER] Process 0 restarted because it exceeds --max-memory-restart value (current_memory=8904929280 max_memory_limit=8589934592 [octets])
PM2 | Process 0 in a stopped status, starting it
PM2 | Stopping app:Collector id:0
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | pid=53621 msg=failed to kill - retrying in 100ms
PM2 | App [Collector:0] exited with code [0] via signal [SIGINT]
PM2 | pid=53621 msg=process killed
PM2 | App [Collector:0] starting in -fork mode-
If this is related to the events, how can I deactivate them for this kind of insert? If not, what should be optimized in this code to avoid this heap bloat?
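For clarity, this is the direction I have in mind (an untested sketch based on my reading of the Feathers hooks docs; suppressEvents and chunkSize are my own placeholder names):

// Registered as a before hook on create, e.g.:
// app.service('api/myservice').hooks({ before: { create: [suppressEvents] } });
// Setting context.event to null should tell Feathers not to emit an event for this call.
const suppressEvents = async (context) => {
  context.event = null;
  return context;
};

// Insert in smaller batches instead of one 500k-element create,
// so each chunk (and its event payloads) can be garbage-collected as we go.
const chunkSize = 5000; // assumed value, tune as needed
for (let i = 0; i < data_to_insert.length; i += chunkSize) {
  const chunk = data_to_insert.slice(i, i + chunkSize);
  await context.app.service('api/myservice').create(chunk);
}
data_to_insert = []; // drop the reference so the array can be collected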
Thanks for your help.