
I'm currently trying to use Node to monitor a webpage for changes. The problem is that after a few page loads, the memory usage of node.exe climbs very quickly, by about 40-50 MB at a time. I've traced the issue to this part of my code:

var http = require('http');
var cheerio = require('cheerio');

function getPage () {
    http.get( 'some.url.com' , function (res) {
        var page = ''; 
        res.on('data', function (chunk) {
            page += chunk; //commenting THIS
        });
        res.on('end', function (err) {
            $ = cheerio.load(page); // and THIS makes the program run OK.
            event.emit('pageLoaded');
        });
    });
}

setInterval(getPage,40000);

I'm using the Cheerio module to do some DOM manipulation, and it seems to have the biggest impact on memory usage. Is there a way to clear the data used completely for every function call? Thanks.


3 Answers


It looks like you're making $ into an "implied" global variable. Try changing $ = ... to var $ = ... and see if that improves things.
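
One way to catch an implied global like this early is strict mode, where assigning to an undeclared name throws instead of silently creating a global. The sketch below is only an illustration: strictHandler, scopedHandler and parsedPage are hypothetical names, and a plain object stands in for the result of cheerio.load so the snippet runs without any modules.

```javascript
// Hypothetical stand-in for the 'end' handler; a plain object replaces cheerio.load(page).
function strictHandler(page) {
    'use strict';
    try {
        parsedPage = { html: page }; // undeclared name: throws in strict mode
        return 'assigned';
    } catch (e) {
        return e.name; // 'ReferenceError'
    }
}

// Same handler with the variable declared locally, as suggested above.
function scopedHandler(page) {
    'use strict';
    var parsedPage = { html: page }; // local: unreachable once the function returns
    return parsedPage.html.length;
}

console.log(strictHandler('<p>hi</p>')); // ReferenceError
console.log(scopedHandler('<p>hi</p>')); // 9
```

Putting 'use strict' at the top of the file would apply the same check to every function in it.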

answered 2013-09-27T00:39:02.673

Try an array join instead of string concatenation to avoid creating so many unnecessary intermediate strings (chunk1, chunk1And2, chunk1Through3, chunk1Through4, etc.). Store the chunks in an array and join them once they have all loaded.

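function getPage () {
    http.get( 'some.url.com' , function (res) {
        var chunks = [];
        res.setEncoding('utf8'); // deliver chunks as strings so multi-byte characters are not split
        res.on('data', function (chunk) {
            chunks.push(chunk); // collect chunks; no intermediate concatenated strings
        });
        res.on('end', function () {
            // join once, and keep the parsed document local instead of an implied global
            var $ = cheerio.load(chunks.join(''));
            event.emit('pageLoaded', $); // listeners receive the document as an argument
        });
    });
}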
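
If the chunks are left as Buffer objects (Node's default when no encoding is set on the response), join('') decodes each chunk separately, so a multi-byte character split across two chunks can come out garbled. A possible alternative, sketched here with the hypothetical helper name collectBody and hand-made chunks rather than a real response, is to concatenate the Buffers first and decode once:

```javascript
// Hypothetical helper: combine Buffer chunks, then decode a single time.
function collectBody(chunks) {
    return Buffer.concat(chunks).toString('utf8');
}

// The euro sign is three bytes in UTF-8 (0xE2 0x82 0xAC); split it across two chunks.
var chunks = [Buffer.from([0xe2, 0x82]), Buffer.from([0xac])];

console.log(collectBody(chunks) === '\u20ac'); // true
console.log(chunks.join('') === '\u20ac');     // false: each chunk was decoded on its own
```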

Regarding "Is there a way to clear the data used completely for every function call?":

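Not directly. JavaScript is garbage collected, and the language has no standard user-land API for controlling the garbage collector. (Node can expose a gc() function when started with the --expose-gc flag, but that is a debugging aid, not a fix for a leak.) Just don't leak variables or hold references to objects you no longer need; that's all you can do.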
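
To check whether memory is really being retained between calls, rather than just not collected yet, it can help to log heap usage after each page load. The following is a rough sketch: heapUsedMB and reportHeap are hypothetical helper names; process.memoryUsage() is part of Node, and global.gc is only defined when Node is started with the --expose-gc flag, which is meant for diagnostics.

```javascript
// Hypothetical helper: current V8 heap usage in megabytes.
function heapUsedMB() {
    return process.memoryUsage().heapUsed / (1024 * 1024);
}

// Hypothetical helper: optionally force a collection, then log and return heap usage.
function reportHeap(label) {
    if (typeof global.gc === 'function') {
        global.gc(); // only available under `node --expose-gc`
    }
    var mb = heapUsedMB();
    console.log(label + ': ' + mb.toFixed(1) + ' MB heap used');
    return mb;
}

reportHeap('startup');
```

Calling reportHeap from the 'pageLoaded' listener would show whether the figure keeps growing from one interval to the next.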

answered 2013-09-27T00:55:15.487

I suspect you'd be happier with the request module than with raw http. It returns a stream, and it can also buffer the response body for you and hand it to a callback as a string that cheerio can parse. Here's an example: "How to most efficiently parse a web page using Node.js"

I suspect that switching would reduce your memory usage.

answered 2013-09-27T02:01:28.283