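I have a for loop in a program that I run with Node.js. The function in question is x() from the x-ray package, which I use to scrape data from web pages and then write that data to files. The program works when I scrape ~100 pages, but I need to scrape ~10,000 pages. When I try to scrape that many pages, the files are created but they contain no data. I believe this happens because the for loop does not wait for x() to return data before moving on to the next iteration.
Is there a way to make Node wait for the x() function to complete before continuing to the next iteration?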
// Takes in a file of URLs, one per line, and splits them into an array.
// Then scrapes each web page and writes its content to a file named for the PMID number that identifies the study.
var fs = require('fs');
var Xray = require('x-ray');
var x = new Xray();

// Split the URLs into an array
var array = fs.readFileSync('Desktop/formatted_urls.txt').toString().split("\n");

for (var i = 0; i < array.length; i++) {
    // Get the unique number and the URL from the array; the number goes into the text file name
    var number = array[i].substring(35);
    var url = array[i];
    // Use the .write function of x from x-ray to write the scraped data to a file
    x(url, 'css selectors').write('filepath' + number + '.txt');
}
Note: some of the pages I am scraping do not return any value.
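
For reference, one possible way to make each iteration wait is to wrap every scrape in a Promise and `await` it inside the loop. The following is only a minimal sketch, not a tested fix for this exact program: `runSequentially` and `toPromise` are hypothetical helper names introduced here, and the commented usage assumes that `x(url, selector)` returns a function accepting a Node-style `(err, result)` callback, as the x-ray README describes.

```javascript
// Runs `task(item)` for each item strictly one after another.
// `task` is a hypothetical function that must return a Promise.
async function runSequentially(items, task) {
  const results = [];
  for (const item of items) {
    // `await` pauses the loop until the current task has settled,
    // so the next iteration does not start early.
    results.push(await task(item));
  }
  return results;
}

// Wraps a function that expects a Node-style callback in a Promise.
function toPromise(fn) {
  return new Promise(function (resolve, reject) {
    fn(function (err, result) {
      if (err) reject(err);
      else resolve(result);
    });
  });
}

// Hypothetical usage with x-ray (assumption: callback form of x()):
//
//   const fs = require('fs');
//   await runSequentially(array, async function (url) {
//     const data = await toPromise(x(url, 'css selectors'));
//     await fs.promises.writeFile(
//       'filepath' + url.substring(35) + '.txt',
//       JSON.stringify(data)
//     );
//   });
```

Processing one page at a time is slow for ~10,000 URLs, so a small batch size (a few pages in flight at once) may be a reasonable compromise. Pages that return no value would still resolve, just with empty data, so the written file may be empty for those.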