Downloading PubMed data with Python

Latest recommended article published on 2025-03-01 00:42:53

huang714

Category: python

Article link: https://blog.youkuaiyun.com/huang714/article/details/107762793

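This article shows how to bulk-download article metadata from the PubMed database. It uses Python's requests library to send POST requests to the NCBI E-utilities API: ESearch selects all records in a given date range, and EFetch retrieves them in batches, each saved to a local file. The ESearch response is requested as JSON; EFetch returns PubMed records as XML by default, so the saved batch files contain XML rather than JSON.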
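As a rough sketch of the search step described above, the ESearch query string can be assembled from a dictionary rather than concatenated by hand. The helper name `build_esearch_url` is hypothetical and not part of the original post; the parameters are the ones that appear in the post's own search URL.

```python
from urllib.parse import urlencode

ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def build_esearch_url(mindate, maxdate):
    """Return an ESearch URL for PubMed limited to a date range (hypothetical helper)."""
    params = {
        "db": "pubmed",
        "mindate": mindate,
        "maxdate": maxdate,
        "usehistory": "y",  # keep the result set on the NCBI history server
        "retmode": "json",  # ask ESearch for a JSON response
    }
    # safe="/" leaves the slashes in the dates unescaped, for readability
    return ESEARCH_BASE + "?" + urlencode(params, safe="/")

url = build_esearch_url("1800/01/01", "2016/12/31")
print(url)
```

Keeping the parameters in a dictionary makes it easier to change the date range without editing a long string.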

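import requests

# Step 1: search PubMed by date range and keep the result set on the NCBI history server.
search_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&mindate=1800/01/01&maxdate=2016/12/31&usehistory=y&retmode=json"
search_r = requests.post(search_url)
search_r.raise_for_status()
search_data = search_r.json()
webenv = search_data["esearchresult"]["webenv"]
total_records = int(search_data["esearchresult"]["count"])

# Step 2: fetch the stored result set in batches.
# retmax must equal the loop step; retmax=9999 with a step of 10000 would skip one record per batch.
batch_size = 10000
fetch_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&retmax=" + str(batch_size) + "&query_key=1&webenv=" + webenv

for i in range(0, total_records, batch_size):
    this_fetch = fetch_url + "&retstart=" + str(i)
    print("Getting this URL: " + this_fetch)
    fetch_r = requests.post(this_fetch)
    fetch_r.raise_for_status()
    # EFetch returns PubMed records as XML by default, so each batch is saved as .xml.
    filename = "pubmed_batch_" + str(i) + "_to_" + str(i + batch_size - 1) + ".xml"
    with open(filename, "w", encoding="utf-8") as f:
        f.write(fetch_r.text)

print("Number of records found: " + str(total_records))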
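The loop above pages through the result set with `retstart`. For the batches to cover every record, the loop step and the `retmax` value have to be the same number; otherwise records at the batch boundaries are skipped or fetched twice. Below is a minimal sketch of that offset arithmetic, using a hypothetical helper `batch_ranges` that is not part of the original post and makes no network requests.

```python
def batch_ranges(total_records, batch_size=10000):
    """Yield (retstart, last_index) pairs that cover 0..total_records-1 without gaps."""
    for start in range(0, total_records, batch_size):
        # The last batch may be shorter than batch_size.
        end = min(start + batch_size, total_records) - 1
        yield start, end

ranges = list(batch_ranges(25000))
print(ranges)
```

Each `retstart` value would be passed to EFetch together with `retmax=batch_size`, and the pair can also be used to name the output file for that batch.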