I have a file that looks like this:
ATOM 7748 CG2 ILE A 999 53.647 54.338 82.768 1.00 82.10 C
ATOM 7749 CD1 ILE A 999 51.224 54.016 84.367 1.00 83.16 C
ATOM 7750 N ASN A1000 55.338 57.542 83.643 1.00 80.67 N
ATOM 7751 CA ASN A1000 56.604 58.163 83.297 1.00 80.45 C
ATOM 7752 C ASN A1000 57.517 58.266 84.501 1.00 80.30 C
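As you can see, the space (" ") between columns 4 and 5 (counting from 0) disappears in the last three lines, so the code below fails. I am new to Python (three whole days in total so far!) and would like to know the best way to handle this. line.split() works as long as there is a space between the fields. Do I have to count characters and then parse the string by absolute position?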
import string
visited = {}
outputfile = open(file_output_location, "w")
for line in open(file_input_location, "r"):
    list = line.split()
    id = list[0]
    if id == "ATOM":
        type = list[2]
        if type == "CA":
            residue = list[3]
            if len(residue) == 4:
                residue = residue[1:]
            type_of_chain = list[4]
            atom_count = int(list[5])
            position = list[6:9]
            if(atom_count >= 1):
                if atom_count not in visited and type_of_chain == chain_required:
                    visited[atom_count] = 1
                    result_line = " ".join([residue,str(atom_count),type_of_chain," ".join(position)])
                    print result_line
                    print >>outputfile, result_line
outputfile.close()
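
For reference, here is a minimal sketch of the "absolute position" idea asked about above. It assumes the file follows the standard fixed-column PDB layout for ATOM records (atom name in columns 13-16, residue name in 18-20, chain in 22, residue number in 23-26, x/y/z in 31-54, all counted from 1), which the sample shown here cannot confirm because its original spacing is not preserved. The two sample lines are rows from the excerpt re-spaced to that layout, and parse_atom_line is a hypothetical helper name, not part of any library.

```python
def parse_atom_line(line):
    """Slice one ATOM record by character position (0-based, end-exclusive).

    Assumption: standard PDB fixed-column layout. Slicing does not care
    whether neighbouring fields touch, so "A1000" is not a problem.
    """
    return {
        "record": line[0:6].strip(),
        "atom_name": line[12:16].strip(),
        "residue": line[17:20].strip(),
        "chain": line[21],
        "residue_number": int(line[22:26]),  # int() ignores the padding spaces
        "position": [
            line[30:38].strip(),
            line[38:46].strip(),
            line[46:54].strip(),
        ],
    }


# Two rows from the excerpt, re-spaced to the assumed fixed-width layout.
sample = [
    "ATOM   7748  CG2 ILE A 999      53.647  54.338  82.768  1.00 82.10           C",
    "ATOM   7751  CA  ASN A1000      56.604  58.163  83.297  1.00 80.45           C",
]

for line in sample:
    if line.startswith("ATOM"):
        atom = parse_atom_line(line)
        if atom["atom_name"] == "CA":
            fields = [atom["residue"], str(atom["residue_number"]), atom["chain"]]
            print(" ".join(fields + atom["position"]))
```

Because the fields are cut by position rather than by whitespace, the residue number is read correctly whether or not a space separates it from the chain identifier. The trade-off is that this only works if every line really is padded to the fixed columns.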