import.io - import.io crawler for a non-standard pagination system

Question

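I am trying to build an import.io crawler for this site, http://theaccelblog.squarespace.com/, but when I click "Next" to go to the next page during training, it takes me back to the first page because of the pagination system the site uses. Any suggestions on how to get an import.io crawler to crawl these pages would be much appreciated. As suggested on the import.io website, I tried to find the pagination system in the packets exchanged with the server, but without success. Thanks if you can help. JRH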

score 0 · Accepted Answer

I used Bulk Extract to create an API.

https://import.io/data/mine/?id=bc7d67f2-24d3-4b5c-b134-01544430998a

If you use the offset-based pagination URLs below, you can feed them into Bulk Extract and get the data you need.

http://theaccelblog.squarespace.com/?offset=1418833411427    
http://theaccelblog.squarespace.com/?offset=1409932229141    
http://theaccelblog.squarespace.com/?offset=1402342675828    
http://theaccelblog.squarespace.com/?offset=1397601000000    
http://theaccelblog.squarespace.com/?offset=1397511000000    
http://theaccelblog.squarespace.com/?offset=1390543200000    
http://theaccelblog.squarespace.com/?offset=1375383600000    
http://theaccelblog.squarespace.com/?offset=1359748800000    
http://theaccelblog.squarespace.com/?offset=1285959600000

Thanks,
Meg
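
For anyone who would rather generate the URL list than paste it by hand, here is a minimal Python sketch. It is not part of the answer above; the names `BASE_URL`, `OFFSETS`, `build_page_urls`, and `offset_to_date` are made up for illustration. It also assumes the offset values are Unix timestamps in milliseconds, which is only a guess based on how the numbers look.

```python
from datetime import datetime, timezone

# Site from the question; the offsets are copied from the answer's URL list.
BASE_URL = "http://theaccelblog.squarespace.com/"
OFFSETS = [
    1418833411427,
    1409932229141,
    1402342675828,
    1397601000000,
    1397511000000,
    1390543200000,
    1375383600000,
    1359748800000,
    1285959600000,
]


def build_page_urls(base_url, offsets):
    """Return one '?offset=...' URL per offset, in the order given."""
    return [f"{base_url}?offset={offset}" for offset in offsets]


def offset_to_date(offset_ms):
    """Interpret an offset as epoch milliseconds (an assumption) and
    return the corresponding UTC date as an ISO string."""
    moment = datetime.fromtimestamp(offset_ms / 1000, tz=timezone.utc)
    return moment.date().isoformat()


if __name__ == "__main__":
    # Print each URL next to the date its offset would correspond to,
    # ready to paste into a bulk extraction input.
    for offset, url in zip(OFFSETS, build_page_urls(BASE_URL, OFFSETS)):
        print(offset_to_date(offset), url)
```

If the timestamp guess is right, the printed dates should run from newest to oldest, which is a quick sanity check that no page was skipped.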
