-5

我有两个清单:

list1=[[['B', 10], 1], [['C', 15], 1], [['F', 30], 1]]
list2=[[['G', 20], 2], [['D', 25], 1]]

现在[['B', 10], 1]]inlist1与 in 匹配,[['D', 25], 1]]因为list2它们具有相等的第二个元素([1][1],第一个匹配)。

我想说['B', 10]['D', 25]被修改和[['C', 15], 1][['F', 30], 1]删除list1[['G', 20], 2]被添加到list2

有人可以在这里帮助我吗?我已经尝试过列表的集合和迭代,并且比较它们都不起作用。

注意:这不是作业,只是我想知道列表是如何工作的。

4

1 回答 1

0

你可以尝试这样的事情:

list1=[[['B', 10], 1], [['C', 15], 1], [['F', 30], 1]]
list2=[[['G', 20], 2], [['D', 25], 1]]

list2_iter = iter(list2)
for item1 in list1:
    in_list2 = False
    for item2 in list2_iter:
        if item1[1] == item2[1]:
            print "Match" if item1 == item2 else "Modified", item1, item2
            in_list2 = True
            break
        else:
            print "Inserted", item2
    if not in_list2:
        print "Deleted", item1

请注意,内部循环 forlist2使用的是迭代器,因此它不是从list2每次的开头开始,而是从它在外部循环的上一次迭代中停止的最后一个元素开始。其他一切都相当简单。

输出:

Inserted [['G', 20], 2]
Modified [['B', 10], 1] [['D', 25], 1]
Deleted [['C', 15], 1]
Deleted [['F', 30], 1]

请注意,这并不总是能找到最佳匹配,因为它只会通过两者,list1list2。如需更完整的算法,请查看Diff and Longest Common Subsequence


更新:我刚刚意识到这个问题实际上与找到两个字符串的最小编辑距离几乎相同,为此我碰巧有一些代码。经过一些概括:

import operator

def diff(s1, s2, match=operator.eq, neutral="*", subst=2):
    """s1, s2: the two strings or lists to match
    match: function to determine whether two elements match up
    neutral: 'neutral' element for padding
    subst: substitution costs
    """
    s1, s2 = neutral + s1, neutral + s2

    # calculate edit distance / sequence match with DP
    A = [[0] * len(s2) for i in range(len(s1))]
    for i in range(len(s1)):
        for k in range(len(s2)):
            if min(i, k) == 0:
                A[i][k] = max(i, k)
            else:
                diag = 0 if match(s1[i], s2[k]) else subst
                A[i][k] = min(A[i-1][k-1] + diag,
                              A[i  ][k-1] + 1,
                              A[i-1][k  ] + 1)

    # reconstruct path
    path, i, k = [], len(s1)-1, len(s2)-1
    while i or k:
        if A[i][k] == A[i-1][k] + 1:
            path, i = [-1] + path, i-1
        elif A[i][k] == A[i][k-1] + 1:
            path, k = [1] + path, k-1
        else:
            path, i, k = [0] + path, i-1, k-1

    return A[len(s1)-1][len(s2)-1], path


def print_match(list1, list2, path):
    i1, i2 = iter(list1), iter(list2)
    for p in path:
        if p == -1: print "DEL %20r"      %  next(i1)
        if p ==  0: print "EQ  %20r %20r" % (next(i1), next(i2))
        if p == +1: print "INS %41r"      %            next(i2)

# with strings
word1, word2 = "INTENTION", "EXECUTION"
x, path = diff(word1, word2)
print_match(word1, word2, path)

# with your lists of lists
list1 = [[['B', 10], 1], [['C', 15], 1], [['F', 30], 1]]
list2 = [[['G', 20], 2], [['D', 25], 1]]
x, path = diff(list1, list2, match=lambda x, y: x[1] == y[1], neutral=[[None, -1]])
print_match(list1, list2, path)

请注意,生成的匹配仍然可能与您的预期不同,因为通常有许多相同的最佳方式来匹配元素,但它仍然应该是最佳的。

于 2013-07-18T08:13:44.970 回答