python - How to create genbank flat file

Question

I am having a hard time creating a GenBank flat file using Biopython SeqIO (something like http://www.ncbi.nlm.nih.gov/nuccore/CP003206). I was able to create a GenBank file by doing:

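from Bio import SeqIO
from Bio.Alphabet import IUPAC  # Bio.Alphabet was removed in Biopython 1.78
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

simple_seq = Seq(row[15], IUPAC.unambiguous_dna)
simple_seq_r = SeqRecord(simple_seq)
simple_seq_r.id = row[0]
simple_seq_r.description = 'hello'
SeqIO.write([simple_seq_r], 'out.gbk', 'gb')  # write the record created above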

But I was unable to write the following fields, because SeqRecord does not seem to have attributes for them: KEYWORDS, SOURCE, DBLINK, ORGANISM, and FEATURES (Location/Qualifiers).

Would you know how to do this? Thanks.

score 0 · Accepted Answer

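The SeqRecord class does have these fields, in the following attributes:

dbxrefs is a list of strings holding the database cross-references (DBLINK), e.g. 'BioProject:PRJNA42399'.
annotations is a dictionary containing many values, including the keywords (annotations['keywords']) as well as, for example, comment, taxonomy, organism and accessions.
features contains the features as a list of SeqFeature class instances.

For more information you can read the SeqRecord class wiki: http://biopython.org/wiki/SeqRecord and the SeqFeature reference page: http://biopython.org/DIST/docs/api/Bio.SeqFeature.SeqFeature-class.html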
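As a rough illustration of how the attributes listed above map onto the GenBank fields from the question, here is a minimal sketch. It assumes Biopython 1.78 or later (where `annotations["molecule_type"]` is needed to write GenBank output), and every sequence, identifier and annotation value in it is a made-up placeholder, not data from CP003206.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Placeholder sequence and identifiers, made up for illustration.
record = SeqRecord(
    Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"),
    id="EXAMPLE01",
    name="EXAMPLE01",
    description="hello",
)

# DBLINK line: a list of "database:identifier" strings.
record.dbxrefs = ["BioProject:PRJNA42399"]

# KEYWORDS, SOURCE and ORGANISM come from the annotations dictionary.
# Assumption: Biopython 1.78+, which needs molecule_type for GenBank output.
record.annotations["molecule_type"] = "DNA"
record.annotations["keywords"] = ["example keyword"]
record.annotations["source"] = "Example source"
record.annotations["organism"] = "Example organism"
record.annotations["taxonomy"] = ["Bacteria"]

# FEATURES table: one SeqFeature per entry, with a location and qualifiers.
gene = SeqFeature(
    FeatureLocation(0, 39, strand=1),
    type="gene",
    qualifiers={"gene": ["exampleGene"]},
)
record.features.append(gene)

# Written to an in-memory handle here; a filename such as 'out.gbk' works too.
handle = StringIO()
SeqIO.write([record], handle, "gb")
genbank_text = handle.getvalue()
print(genbank_text)
```

The printed text should contain DBLINK, KEYWORDS, SOURCE, ORGANISM and FEATURES sections; the exact layout depends on the Biopython version.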

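Another thing you can do is save the GenBank file you linked to, read it with SeqIO, and then use dir() to see which attributes are actually available. For attributes stored as dictionaries, it is useful to look at the keys. Something like this (where my_file.gbk contains a subsequence of the file you linked to):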

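from Bio import SeqIO

# Parse the single record in the file and show where each field ends up
my_record = SeqIO.read('my_file.gbk', 'gb')
print("DBXREFS: ", my_record.dbxrefs)
print("ANNOTATIONS: ", list(my_record.annotations.keys()))
print("FEATURES: ", my_record.features)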

This gives you more information about the file you provided:

DBXREFS:  ['BioProject:PRJNA42399 BioSample:SAMN02603066']
ANNOTATIONS:  ['comment', 'sequence_version', 'source', 'taxonomy', 'keywords', 'references', 'accessions', 'data_file_division', 'date', 'organism', 'gi']
FEATURES:  [SeqFeature(FeatureLocation(ExactPosition(0), ExactPosition(1001), strand=1), type='source'), SeqFeature(FeatureLocation(BeforePosition(0), ExactPosition(471), strand=1), type='gene'), SeqFeature(FeatureLocation(BeforePosition(0), ExactPosition(471), strand=1), type='CDS')]
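The FEATURES output above shows only the type and location of each feature, not its qualifiers. A small sketch of looping over the features to see all three follows. It assumes Biopython 1.78 or later and, so that it runs without my_file.gbk, first builds a tiny stand-in record whose sequence, names and qualifier values are invented for illustration.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

# Stand-in record (hypothetical values) written to an in-memory GenBank file.
stand_in = SeqRecord(Seq("ATGAAATAG"), id="EXAMPLE02", name="EXAMPLE02")
stand_in.annotations["molecule_type"] = "DNA"
stand_in.features.append(
    SeqFeature(
        FeatureLocation(0, 9, strand=1),
        type="CDS",
        qualifiers={"product": ["hypothetical protein"]},
    )
)
handle = StringIO()
SeqIO.write(stand_in, handle, "gb")
handle.seek(0)

# Parse it back, as you would with SeqIO.read('my_file.gbk', 'gb').
my_record = SeqIO.read(handle, "gb")

# Each feature carries its type, its location and a dict of qualifiers.
summary = []
for feature in my_record.features:
    summary.append((feature.type, str(feature.location), dict(feature.qualifiers)))
    print(feature.type, feature.location, dict(feature.qualifiers))
```

The qualifiers dictionary is where entries such as /product or /gene from the Location/Qualifiers column end up, each stored as a list of strings.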
