I'm trying to write a scraper using 'request' and 'cheerio'. I have an array of 100 urls. I'm looping over the array and using 'request' on each url and then doing cheerio.load(body). If I increase i above 3 (i.e. change it to i < 3 for testing) the scraper breaks because var productNumber is undefined and I can't call split on undefined variable. I think that the for loop is moving on before the webpage responds and has time to load the body with cheerio, and this question: nodeJS - Using a callback function with Cheerio would seem to agree.
My problem is that I don't understand how I can make sure the webpage has 'loaded' or been parsed in each iteration of the loop so that I don't get any undefined variables. According to the other answer I don't need a callback, but then how do I do it?
for (var i = 0; i < productLinks.length; i++) {
productUrl = productLinks[i];
request(productUrl, function(err, resp, body) {
if (err)
throw err;
$ = cheerio.load(body);
var imageUrl = $("#bigImage").attr('src'),
productNumber = $("#product").attr('class').split(/\s+/)[3].split("_")[1]
console.log(productNumber);
});
};
Example of output:
1461536
1499543
TypeError: Cannot call method 'split' of undefined