我编写了以下代码来定义文本文件中的 4 行块,如果块的第 2 行仅由一种类型的字符组成,则输出该块。假设(并且之前已验证)第二行始终由 36 个字符的字符串组成。
# filter out homogeneous reads
import sys
import collections
from collections import Counter
filename1 = sys.argv[1] # file to process
with open(filename1,'r') as input_file:
for line1 in input_file:
line2, line3, line4 = [next(input_file) for line in xrange(3)]
c = Counter(line2).values() # count characters in line2
c.sort(reverse=True) # sort values in descending order
if c[0] < 36:
print line1 + line2 + line3 + line4.rstrip()
但是,我收到如下 StopIteration 错误。如果有人能告诉我原因,我将不胜感激。
$ python code.py test.file > testout.file
Traceback (most recent call last):
File "code.py", line 11, in <module>
line2, line3, line4 = [next(input_file) for line in xrange(3)]
StopIteration
任何帮助将不胜感激,尤其是那些解释我的特定代码有什么问题以及如何修复它的帮助。这是一个输入示例:
@1:1:1323:1032:Y
AGCAGCATTGTACAGGGCTATCATGGAATTCTCGGG
+1:1:1323:1032:Y
HHHBHHBHBHGBGGGH8HHHGGGGFHBHHHHBHHHH
@1:1:1610:1033:Y
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
+1:1:1610:1033:Y
HHEHHHHHHHHHHHBGGD>GGD@G8GGGGDHBHH4C
@1:1:1679:1032:Y
CGGTGGATCACTCGGCTCGTGCGTCGATGAAGAACG