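This problem is driving me crazy:

I have 6 different sequences, named 1-6, and they overlap one another. I have written a function that stores the sequences in a dictionary, and a function that returns the overlapping part of two sequences.

Now I am supposed to use these two functions to build a dictionary that gives the number of overlapping positions for every pair of sequences, in both directions (left-to-right and right-to-left).

The dictionary of sequences I have made looks like this: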

{'1': 'GGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTCGTCCAGACCCCTAGC',
 '2': 'CTTTACCCGGAAGAGCGGGACGCTGCCCTGCGCGATTCCAGGCTCCCCACGGG',
 '3': 'GTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGTCGTGAACACATCAGT',
 '4': 'TGCGAGGGAAGTGAAGTATTTGACCCTTTACCCGGAAGAGCG',
 '5': 'CGATTCCAGGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTC',
 '6': 'TGACAGTAGATCTCGTCCAGACCCCTAGCTGGTACGTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGT'}

I should end up with a result like this:

{'1': {'3': 0, '2': 1, '5': 1, '4': 0, '6': 29},
'3': {'1': 0, '2': 0, '5': 0, '4': 1, '6': 1},
'2': {'1': 13, '3': 1, '5': 21, '4': 0, '6': 0},
'5': {'1': 39, '3': 0, '2': 1, '4': 0, '6': 14},
'4': {'1': 1, '3': 1, '2': 17, '5': 2, '6': 0},
'6': {'1': 0, '3': 43, '2': 0, '5': 0, '4': 1}}

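It seems impossible to me. I assume it isn't, so if someone could push me in the right direction (without solving it for me), that would be great.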

2 Answers

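This is a bit convoluted, but it should work. It uses find_overlaps() as the function that finds the overlap between two sequences, and seq_dict as the original dictionary of sequences: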

overlaps = {seq:{other_seq:find_overlaps(seq_dict[seq],seq_dict[other_seq])
    for other_seq in seq_dict if other_seq != seq} for seq in seq_dict}

Here it is with better spacing:

overlaps = \
{seq:
    {other_seq:
        find_overlaps(seq_dict[seq],seq_dict[other_seq])
    for other_seq in seq_dict if other_seq != seq}
for seq in seq_dict}
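
To see what the comprehension produces, here is a minimal sketch on toy data. The find_overlaps() below is only a hypothetical stand-in for the asker's own function (which was not posted); it assumes the function returns the length of the longest suffix of the first sequence that is also a prefix of the second. The three short sequences are made up for illustration.

```python
def find_overlaps(a, b):
    # Hypothetical stand-in: length of the longest suffix of a
    # that is also a prefix of b (0 if there is none).
    for n in range(min(len(a), len(b)), 0, -1):
        if a[-n:] == b[:n]:
            return n
    return 0

# Toy data, not the sequences from the question.
seq_dict = {'x': 'AAGT', 'y': 'GTCC', 'z': 'CCAA'}

# Outer comprehension: one entry per sequence.
# Inner comprehension: overlap with every *other* sequence.
overlaps = {
    seq: {other_seq: find_overlaps(seq_dict[seq], seq_dict[other_seq])
          for other_seq in seq_dict if other_seq != seq}
    for seq in seq_dict
}

print(overlaps)
# {'x': {'y': 2, 'z': 0}, 'y': {'x': 0, 'z': 2}, 'z': {'x': 2, 'y': 0}}
```

Note that the result is not symmetric: overlaps['x']['y'] is 2 but overlaps['y']['x'] is 0, which is why the table needs an entry for each direction.
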
answered 2013-01-02T22:26:15.143

A clean way:

dna = {
    '1': 'GGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTCGTCCAGACCCCTAGC',
    '2': 'CTTTACCCGGAAGAGCGGGACGCTGCCCTGCGCGATTCCAGGCTCCCCACGGG',
    '3': 'GTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGTCGTGAACACATCAGT',
    '4': 'TGCGAGGGAAGTGAAGTATTTGACCCTTTACCCGGAAGAGCG',
    '5': 'CGATTCCAGGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTC',
    '6': 'TGACAGTAGATCTCGTCCAGACCCCTAGCTGGTACGTCTTCAGTAGAAAATTG' \
         'TTTTTTTCTTCCAAGAGGTCGGAGT'
}

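def overlap(a, b):
    # Length of the longest suffix of a that is also a prefix of b.
    l = min(len(a), len(b))
    # Shrink the candidate length until the ends match (or nothing is left).
    while l > 0 and a[-l:] != b[:l]:
        l -= 1
    return l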

def all_overlaps(d):
    result = {}
    for k1, v1 in d.items():
        overlaps = {}
        for k2, v2 in d.items():
            if k1 == k2:
                continue
            overlaps[k2] = overlap(v1, v2)
        result[k1] = overlaps
    return result

print(all_overlaps(dna))
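
As a quick sanity check, a few entries from the expected table in the question can be reproduced directly. This is only a sketch: it restates overlap() using str.endswith (an equivalent formulation, not the asker's function) and copies sequences 1, 3 and 6 from the question so the snippet runs on its own.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is also a prefix of b."""
    for n in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:n]):
            return n
    return 0

# Sequences 1, 3 and 6, copied from the question.
seq1 = 'GGCTCCCCACGGGGTACCCATAACTTGACAGTAGATCTCGTCCAGACCCCTAGC'
seq3 = 'GTCTTCAGTAGAAAATTGTTTTTTTCTTCCAAGAGGTCGGAGTCGTGAACACATCAGT'
seq6 = ('TGACAGTAGATCTCGTCCAGACCCCTAGCTGGTACGTCTTCAGTAGAAAATTG'
        'TTTTTTTCTTCCAAGAGGTCGGAGT')

print(overlap(seq1, seq6))  # 29, matches result['1']['6'] in the question
print(overlap(seq6, seq3))  # 43, matches result['6']['3']
print(overlap(seq6, seq1))  # 0, the reverse direction of the first check
```

The first and last lines show why both directions are needed: the end of sequence 1 runs into the start of sequence 6, but not the other way round.
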

(By the way, you could have included your own overlap function in the question, which would have made it easier for everyone to answer.)

answered 2013-01-02T22:38:07.683