I am relatively new to Node.js and I am trying to get more familiar with it by writing a simple module. The module's purpose is to take an id, scrape a website, and return an array of dictionaries with the data.
The data on the website is scattered across pages, and each page is accessed through a different index number in the URI. I've defined a function that takes the id and a `page_number`, scrapes the corresponding page via `http.request()`, and on the `end` event passes the response to another function that applies some RegEx to extract the data in a structured way. For the module to be fully functional, all of the website's available `page_nums` need to be scraped.
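
To make the question concrete, the per-page function looks roughly like this; the host, path, and `parsePage` helper are placeholders rather than my actual code:

```javascript
var http = require('http');

function scrapePage(id, pageNumber, callback) {
    // Placeholder host/path; the real site and URI scheme differ.
    var options = {
        host: 'www.example.com',
        path: '/items/' + id + '?page=' + pageNumber
    };
    http.request(options, function (res) {
        var body = '';
        res.setEncoding('utf8');
        res.on('data', function (chunk) {
            body += chunk;                    // accumulate the raw HTML
        });
        res.on('end', function () {
            callback(null, parsePage(body)); // hand the full page to the RegEx step
        });
    }).on('error', function (err) {
        callback(err);
    }).end();
}

// Stand-in for the RegEx function mentioned above; the real pattern
// depends on the site's markup.
function parsePage(html) {
    var records = [];
    var re = /<li>([\s\S]*?)<\/li>/g;
    var match;
    while ((match = re.exec(html)) !== null) {
        records.push({ value: match[1] });
    }
    return records;
}
```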
Is it OK by Node.js style/philosophy to create a standard `for()` loop that calls the scraping function for every page, aggregates the results of each call, and then returns them all at once from the exported function?
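
To illustrate what I mean, here is a rough sketch of that aggregation pattern; `scrapeAll`, `pageCount`, and the completion counter are my placeholders, assuming the number of pages is known in advance:

```javascript
// Illustrative only: fire one request per page, collect the parsed records,
// and report the aggregate once every page has come back.
function scrapeAll(id, pageCount, callback) {
    var results = [];
    var pending = pageCount;
    var failed = false;

    for (var page = 1; page <= pageCount; page++) {
        scrapePage(id, page, function (err, records) {
            if (failed) { return; }          // an earlier page already errored
            if (err) {
                failed = true;
                return callback(err);
            }
            results = results.concat(records);
            if (--pending === 0) {
                // http.request() is asynchronous, so the aggregate has to be
                // delivered through a callback rather than a plain return.
                callback(null, results);
            }
        });
    }
}

module.exports.scrapeAll = scrapeAll;
```

One thing I notice with this pattern is that the pages can finish in any order, so `results` is not guaranteed to be sorted by page number.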
EDIT
I figured out a solution based on help from #node.js on freenode. You can find the working code at http://github.com/attheodo/katina_node
Thank you all for the comments.