python - How can I extract a complete list of PMC article titles and abstracts using Bio.Entrez?

Translated from: https://stackoverflow.com/questions/64322120 2020-10-12T16:58:43.363

Viewed 102 times

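I am trying to download the complete title/abstract data from PMC/PubMed. This is an old question, but none of the existing answers on Stack Overflow seem to address it.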

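A common approach is to use the Entrez package, but that requires specifying a search term, and there is also a limit on how many query requests you can send in a given period of time.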

from Bio import Entrez, Medline

# NCBI asks for a contact email address with every request
Entrez.email = "A.N.Other@example.com"

# Search PubMed for the term and collect the matching PMIDs
handle = Entrez.esearch(db="pubmed", term="orchid", retmax=463)
record = Entrez.read(handle)
handle.close()
idlist = record["IdList"]

# Fetch the matching records in MEDLINE text format
handle = Entrez.efetch(db="pubmed", id=idlist, rettype="medline", retmode="text")
records = Medline.parse(handle)

for record in records:
    print("title:", record.get("TI", "?"))
    print("authors:", record.get("AU", "?"))
    print("source:", record.get("SO", "?"))
    print("")
handle.close()
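
For what it's worth, the snippet above prints titles but not abstracts, and it fetches everything in one request. Below is a minimal sketch of how the same calls might be paged in batches through the NCBI history server (`usehistory="y"`), reading the MEDLINE `AB` field for the abstract. The helper names `batch_offsets` and `fetch_titles_and_abstracts`, the batch size of 200, and the 0.4 second pause are assumptions made for illustration, not values taken from NCBI or Biopython. It still needs a search term, so it does not solve the "everything" case.

```python
import time


def batch_offsets(total, batch_size):
    """Yield (retstart, retmax) pairs that together cover `total` records."""
    for start in range(0, total, batch_size):
        yield start, min(batch_size, total - start)


def fetch_titles_and_abstracts(term, email, batch_size=200, pause=0.4):
    """Yield (title, abstract) pairs for PubMed records matching `term`.

    Hypothetical helper: batch_size and pause are illustrative defaults.
    """
    # Imported here so the paging helper above can be used without Biopython.
    from Bio import Entrez, Medline

    Entrez.email = email

    # usehistory="y" keeps the result set on the NCBI history server,
    # so later requests can refer to it instead of resending the ID list.
    handle = Entrez.esearch(db="pubmed", term=term, usehistory="y")
    search = Entrez.read(handle)
    handle.close()

    total = int(search["Count"])
    for start, size in batch_offsets(total, batch_size):
        handle = Entrez.efetch(
            db="pubmed",
            rettype="medline",
            retmode="text",
            retstart=start,
            retmax=size,
            webenv=search["WebEnv"],
            query_key=search["QueryKey"],
        )
        # TI is the title field and AB the abstract field of a MEDLINE record.
        for rec in Medline.parse(handle):
            yield rec.get("TI", "?"), rec.get("AB", "?")
        handle.close()
        time.sleep(pause)  # assumed pause between requests
```

Something like `for title, abstract in fetch_titles_and_abstracts("orchid", "A.N.Other@example.com"): print(title)` would then walk through the results one batch at a time.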

Is there any way to download the full article and abstract data from PMC, either with Python or directly from some other source?
