
First, I should point out that I am not a programmer, so this may be a silly question, but I would like to understand what is going on here.

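The program should go through a string (the genome) and slide a window of arbitrary length (here l) along it. It searches for repeated character sequences of a given length (k) and records how many times each sequence occurs. I did manage to find repeated sequences across the whole string, but the window is giving me trouble. I tried using nested loops: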

for i in range(len(genome) - k + 1):
    for c in range(len(genome))[c:c+l]:
        kmer = genome[i:i+k]
        if kmer in d:
            d[kmer] += 1
        else:
            d[kmer] = 1

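I get an error: "NameError: name 'c' is not defined". What causes this, and is there an easy-to-understand fix? Efficiency is not very important, so I would like to keep a similar structure (I found many threads describing ways to avoid nested for loops, but I find them confusing for now).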

Thank you in advance.


1 Answer


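You are defining c in the second for loop and trying to use it in that same statement. Python evaluates the expression range(len(genome))[c:c+l] once, before the loop assigns its first value to c, so at that point c does not exist yet. That is what raises the NameError.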
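To see the evaluation order in isolation, here is a minimal sketch that reproduces the error outside the genome code. The numbers 10 and 3 are arbitrary stand-ins for len(genome) and l, and the variable names are only for illustration.

```python
# The iterable of a for loop is evaluated first, before the loop variable
# receives any value. Using c inside its own iterable therefore fails.
caught = False
try:
    for c in range(10)[c:c + 3]:  # c is looked up here, before it exists
        pass
except NameError as exc:          # also covers UnboundLocalError inside a function
    caught = True
    message = str(exc)

# Referring to a variable that already has a value (the outer index i) works.
i = 2
l = 3
window_positions = list(range(i, i + l))  # [2, 3, 4]
```

The second part works because i is already bound when range(i, i + l) is evaluated.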

Edit

Judging by your comments, I believe what you are trying to do is slide a window of length l along a genome. Then you want to find the window that is enriched for some k-mer(s). To do that, I would modify your second loop to look at the next l locations from the current window start:

for c in range(i, i+l):
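Putting this together, the following is one possible sketch that keeps the nested-loop structure. It rests on a few assumptions that are not in the question: the k-mer is sliced starting at c (the inner index) rather than at i, the inner range stops early enough that each k-mer fits entirely inside the window, and a fresh dictionary is kept per window. The function name and the sample string are made up for illustration.

```python
def kmer_counts_per_window(genome, k, l):
    """Count k-mers inside every window of length l (illustrative sketch)."""
    windows = {}
    # i is the start of the window; the last full window starts at len(genome) - l
    for i in range(len(genome) - l + 1):
        d = {}
        # c is the start of a k-mer; stop at i + l - k so the k-mer stays in the window
        for c in range(i, i + l - k + 1):
            kmer = genome[c:c + k]
            if kmer in d:
                d[kmer] += 1
            else:
                d[kmer] = 1
        windows[i] = d
    return windows


# Hypothetical example input
counts = kmer_counts_per_window("ATATAGG", k=2, l=5)
# counts[0] describes the window "ATATA": {'AT': 2, 'TA': 2}
```

If you only need the k-mers that occur at least some number of times within a window, you can filter each inner dictionary afterwards.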
answered 2013-11-12T20:25:57.103