TL;DR: The problem is in the library, not in Node.
Long answer
Here is a slightly modified version of the code:
    var heapdump = require('heapdump');
    const fs = require('fs');
    var libxmljs = require("libxmljs");

    const content = fs.readFileSync('./html2.htm');
    let id = 0;

    class MyObject {
      constructor() {
        this.doc = libxmljs.parseHtml(content);
        this.node = this.doc.root();
      }
    }

    let obj;

    function createObject() {
      obj = new MyObject(content);
    }

    try {
      for (var i = 0; i < 3000; i++) {
        createObject();
        // if I uncomment the next line it works fine
        // obj.node = null
        console.log(i);
        if (i === 50) {
          heapdump.writeSnapshot('/Users/me/3.heapsnapshot');
        }
        if (i === 100) {
          heapdump.writeSnapshot('/Users/me/4.heapsnapshot');
        }
        if (i === 150) {
          heapdump.writeSnapshot('/Users/me/5.heapsnapshot');
        }
      }
      console.log('done');
    }
    catch (e) {
      console.log(e);
    }
Below is the relevant section of the diff between heap snapshots 3 and 4 taken by the code above.
It is even clearer when we compare snapshots 4 and 5.
A few things we can conclude from these heap snapshots:
- There is no memory leak in the JS part.
- The size of the heap snapshot does not match the size of the process as shown in htop/top/Activity Monitor, depending on your OS (about 12 MB of heap snapshot versus a few GB of RAM).
heapdump only reports memory that lives on the JS heap, so it can only reveal leaks in JS code. Since this library includes C code, a heap snapshot will not capture leaks in that native part.
I am not sure how to capture a dump of the library's native allocations, or why setting obj.node to null allows the memory to be freed, but it seems safe to assume that Node's GC is doing everything it can.
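One rough way to see the gap between the JS heap and the whole process without leaving Node is to log process.memoryUsage() at intervals. This is only a sketch: memorySummary and logMemory are hypothetical helper names, not part of heapdump or libxmljs. If rss keeps climbing while heapUsed stays roughly flat, the growth is most likely happening outside the JS heap, which is consistent with the snapshots above.

```javascript
// Hypothetical helpers (not part of heapdump or libxmljs).
// process.memoryUsage() is a built-in Node API that reports sizes in bytes.
function memorySummary() {
  const { rss, heapUsed, external } = process.memoryUsage();
  const toMB = (bytes) => Math.round(bytes / 1024 / 1024);
  return {
    rssMB: toMB(rss),           // total resident memory of the process
    heapUsedMB: toMB(heapUsed), // memory used by JS objects on the V8 heap
    externalMB: toMB(external), // off-heap memory that V8 has been told about
  };
}

function logMemory(label) {
  const m = memorySummary();
  console.log(
    `${label}: rss=${m.rssMB} MB, heapUsed=${m.heapUsedMB} MB, external=${m.externalMB} MB`
  );
  return m;
}

// Print one reading now; inside the loop above you could call, for example,
//   if (i % 50 === 0) logMemory(`iteration ${i}`);
logMemory('baseline');
```

Memory that a native addon allocates without reporting it to V8 may not appear under external at all, so rss is the number to compare against the heap snapshot size.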
Hope this helps