I am having trouble downloading full records from the Nucleotide database. I use:

from Bio import Entrez
from Bio import SeqIO

with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="NC_007384") as handle:
    seq_record = SeqIO.read(handle, "gb") 

print(seq_record)

This gives me an abbreviated version of the gb file, so the command:

seq_record.features

returns no annotated features.

By contrast, when I do the same thing with a GenBank ID there is no problem:

with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="CP014768.1") as handle:
    seq_record = SeqIO.read(handle, "gb") 

print(seq_record)

After that, I can extract every annotated feature from the list seq_record.features.

Is there a way to download full RefSeq records using Efetch?

1 Answer

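You need to either use style="withparts" or change the rettype to gbwithparts to fetch all of the features. The NCBI EFetch documentation has some information on this.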

>>> from Bio import Entrez
>>> from Bio import SeqIO
>>> Entrez.email = 'someone@email.com'
>>> with Entrez.efetch(db="nuccore", rettype="gb", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb") 
... 
>>> len(seq_record.features)
1
>>> with Entrez.efetch(db="nuccore", rettype="gbwithparts", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb") 
... 
>>> len(seq_record.features)
10616
>>> with Entrez.efetch(db="nuccore", rettype="gb", style="withparts", retmode="full", id="NC_007384") as handle:
...     seq_record = SeqIO.read(handle, "gb")
... 
>>> len(seq_record.features)
10616
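
To make the difference between the two requests easier to see, here is a minimal sketch of the EFetch URL that such a call corresponds to, built with the standard library only and without touching the network. The helper name build_efetch_url and its with_parts flag are hypothetical, and retmode="text" is my assumption (the calls above pass retmode="full"); only the rettype values gb and gbwithparts come from the answer itself.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities EFetch endpoint.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, with_parts=True):
    """Return an EFetch URL for the GenBank flat file of `accession`.

    Hypothetical helper for illustration only: with_parts=True asks for
    rettype=gbwithparts (the full record), with_parts=False asks for
    rettype=gb (which came back abbreviated for NC_007384 above).
    """
    params = {
        "db": "nuccore",
        "id": accession,
        "rettype": "gbwithparts" if with_parts else "gb",
        "retmode": "text",  # assumption: plain-text flat file
    }
    return f"{EFETCH_URL}?{urlencode(params)}"


print(build_efetch_url("NC_007384"))
print(build_efetch_url("NC_007384", with_parts=False))
```

The two printed URLs differ only in the rettype parameter, which is the one change needed to get the annotated features for the RefSeq accession.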
answered 2019-03-28T16:14:06.793