java - PubMed blocks URL requests from Java

Question

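I have some code that accesses articles in PubMed and parses some information from each XML document. The program runs fine on my computer, but it takes a long time to finish. So I run it on a unix machine that is meant for this kind of job, and there every request I make gets blocked. I know there is a limit on how many requests can be made per minute before the traffic is treated as a virus, but that is not the issue here, because all of the requests are blocked. I checked, and this only happens with requests to the PubMed website.

Thanks in advance.

Edit: I use jsoup for the connection. Running wget from my program with ProcessBuilder does not get blocked, but then efficiency becomes a problem: wget's output can only be read with while (br.readLine() != null), and that takes up a large part of the running time.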
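For reference, the wget approach from the edit could be sketched roughly as follows, reading the process output in one bulk call instead of line by line. This is only an illustration: the class and method names (WgetCapture, readFully, fetchWithWget) are made up here, it assumes Java 9 or later and a wget binary on the PATH, and it may not remove the bottleneck, since starting one process per article can cost more than the reading itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class WgetCapture {

    /** Reads a stream to the end in bulk rather than line by line. */
    static String readFully(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    /** Runs wget for one URL and returns the document it printed to stdout. */
    static String fetchWithWget(String url) throws IOException, InterruptedException {
        // -q suppresses progress output; "-O -" writes the document to stdout.
        Process process = new ProcessBuilder("wget", "-q", "-O", "-", url)
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        String body;
        try (InputStream stdout = process.getInputStream()) {
            body = readFully(stdout);
        }
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("wget exited with status " + exitCode);
        }
        return body;
    }

    public static void main(String[] args) throws Exception {
        // Pass the URL to download as the first argument.
        System.out.println(fetchWithWget(args[0]));
    }
}
```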

score 1 · Accepted Answer

Instead of accessing the PubMed web page, you can try connecting through another API dedicated to retrieving PubMed data, such as the Europe PMC RESTful Web Service: http://europepmc.org/RestfulWebService. It lets you get all the data you need in XML format, and I think there is no limit on the number of queries.

For instance, the URL http://www.ebi.ac.uk/europepmc/webservices/rest/search/resulttype=core&query=ext_id:9481671 returns all the information about the article with pubmed_id=9481671.
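A minimal sketch of such a request from Java, using only the standard library, might look like the following. The class and method names (EuropePmcFetch, buildUrl, fetchXml) and the timeout values are illustrative assumptions, and the URL pattern is copied from this answer as written, so it is worth checking against the current Europe PMC documentation before relying on it. It assumes Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class EuropePmcFetch {

    // URL pattern taken from the answer above; the service may have changed since.
    static final String BASE = "http://www.ebi.ac.uk/europepmc/webservices/rest/search/";

    /** Builds the request URL for one PubMed ID, rejecting non-numeric input. */
    static String buildUrl(String pubmedId) {
        if (pubmedId == null || !pubmedId.matches("\\d+")) {
            throw new IllegalArgumentException("PubMed ID must be numeric: " + pubmedId);
        }
        return BASE + "resulttype=core&query=ext_id:" + pubmedId;
    }

    /** Downloads the response body for one PubMed ID as a UTF-8 string. */
    static String fetchXml(String pubmedId) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(buildUrl(pubmedId)).toURL().openConnection();
        conn.setConnectTimeout(10_000); // illustrative timeouts, in milliseconds
        conn.setReadTimeout(30_000);
        try {
            int status = conn.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                // Also reached on redirects that HttpURLConnection does not follow.
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = conn.getInputStream()) {
                return new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(fetchXml("9481671"));
    }
}
```

The returned string can then be handed to whatever XML parser the program already uses. Even if the service imposes no query limit, pausing briefly between requests is a reasonable precaution.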
