python - Remove items from a list that share a similar feature

Question

Suppose I have a nested list like this:

[[['a'],[24],214,1] ,[['b'],[24],312,1] ,[['a'],[24],3124,1] , [['c'],[24],34,1]]

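Suppose I want to remove every item from the list except, among the items that share the same letter in item[0], the one with the largest value of item[2].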

So, for example, in the list above there are two items that share the same letter in item[0]:

[ [['a'],[24],214,1], [['a'],[24],3124,1] ]

I want to remove the former because it has the lower value of item[2].

The output list should be:

[ [['b'],[24],312,1] ,[['a'],[24],3124,1] , [['c'],[24],34,1] ]

Can you suggest a compact way to do this?

score 0 · Accepted Answer

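If the order of the returned items doesn't matter, you can use groupby from itertools to group the items by their first element (after sorting by that element) and then pull out the largest item of each group with the max function. Note that this builds a new list rather than modifying the original in place: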

In [1]: from itertools import groupby

In [2]: l = [[['a'],[24],214,1] ,[['b'],[24],312,1] ,[['a'],[24],3124,1] , [['c'],[24],34,1]]

In [3]: result = []

In [4]: for k, g in groupby(sorted(l, key=lambda x: x[0]), key=lambda x: x[0]):
   ...:     result.append(max(g, key=lambda m: m[2]))
   ...:
   ...:

In [5]: result
Out[5]: [[['a'], [24], 3124, 1], [['b'], [24], 312, 1], [['c'], [24], 34, 1]]

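To expand on this slightly: if you want to keep the original order, you can rebuild l so that it contains only those items that are in result, which preserves the order: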

In [6]: l = [i for i in l if i in result]

In [7]: l
Out[7]: [[['b'], [24], 312, 1], [['a'], [24], 3124, 1], [['c'], [24], 34, 1]]

And to combine it all into one truly abominable one-liner, you could (but probably shouldn't :)) do this:

In [10]: l = [[['a'],[24],214,1] ,[['b'],[24],312,1] ,[['a'],[24],3124,1] , [['c'],[24],34,1]]

In [11]: [i for i in l if i in [max(g, key=lambda m: m[2]) for k, g in groupby(sorted(l, key=lambda x: x[0]), key=lambda x: x[0])]]
Out[11]: [[['b'], [24], 312, 1], [['a'], [24], 3124, 1], [['c'], [24], 34, 1]]
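The order-preserving filter above tests `i in result`, which compares items by equality and rescans result for every item. As a rough sketch of one alternative (the function name `keep_max_per_key` is made up for illustration, not part of the answer), the winners can be remembered by object identity instead, so the final pass is a set lookup and an equal-valued duplicate is not kept by accident:

```python
from itertools import groupby
from operator import itemgetter


def keep_max_per_key(items):
    """Return the items with the largest item[2] per item[0], in original order."""
    # groupby only merges adjacent items, so sort by the grouping key first.
    by_key = sorted(items, key=itemgetter(0))
    # Remember each group's winner by identity (id), not by equality.
    winners = {
        id(max(group, key=itemgetter(2)))
        for _, group in groupby(by_key, key=itemgetter(0))
    }
    # Filtering the original list preserves the original order.
    return [item for item in items if id(item) in winners]


data = [[['a'], [24], 214, 1], [['b'], [24], 312, 1],
        [['a'], [24], 3124, 1], [['c'], [24], 34, 1]]
print(keep_max_per_key(data))
# [[['b'], [24], 312, 1], [['a'], [24], 3124, 1], [['c'], [24], 34, 1]]
```

Like the original, this returns a new list and keeps exactly one item per letter; when two items tie for the maximum, max returns the first one it encounters in the sorted group.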

score 0 · Accepted Answer

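Here are some options that preserve the original order and remove only those items whose comparison value is below the maximum for their letter (items tied for the maximum are all kept).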

def filter1(items):
    # Collect the distinct letters, then find the largest item[2] for each one.
    first = set(item[0][0] for item in items)
    compare = dict((f, max(item[2] for item in items if item[0][0] == f))
        for f in first)
    return [item for item in items if item[2] >= compare[item[0][0]]]

def filter2(items):
    compare = {}
    for item in items:
        key = item[0][0]
        # Record the largest item[2] seen so far for each letter.
        if key not in compare or item[2] > compare[key]:
            compare[key] = item[2]
    return [item for item in items if item[2] >= compare[item[0][0]]]

def filter3(items):
    return [i for i in items if i[2] >= 
        max(j[2] for j in items if j[0][0]==i[0][0])]

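filter3 is the shortest, but it is the slowest if you have a large list, since it rescans the whole list for every item. My guess is that filter2 would be the fastest.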
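The speed ranking above is stated as a guess, so it is worth measuring on your own data. The following is only a sketch of how that could be done with timeit: the helper `make_items`, the list size, and the repeat count are arbitrary choices made for illustration, and the three functions are repeated so the block runs on its own.

```python
import random
import timeit


def filter1(items):
    first = set(item[0][0] for item in items)
    compare = dict((f, max(item[2] for item in items if item[0][0] == f))
                   for f in first)
    return [item for item in items if item[2] >= compare[item[0][0]]]


def filter2(items):
    compare = {}
    for item in items:
        key = item[0][0]
        if key not in compare or item[2] > compare[key]:
            compare[key] = item[2]
    return [item for item in items if item[2] >= compare[item[0][0]]]


def filter3(items):
    return [i for i in items if i[2] >=
            max(j[2] for j in items if j[0][0] == i[0][0])]


def make_items(n, letters="abcdefghij", seed=0):
    # Hypothetical helper: random items shaped like the ones in the question.
    rng = random.Random(seed)
    return [[[rng.choice(letters)], [24], rng.randrange(10000), 1]
            for _ in range(n)]


items = make_items(500)
# All three should agree before their timings are compared.
assert filter1(items) == filter2(items) == filter3(items)

for fn in (filter1, filter2, filter3):
    seconds = timeit.timeit(lambda: fn(items), number=5)
    print(f"{fn.__name__}: {seconds:.4f} s for 5 runs")
```

The absolute numbers depend on the machine, the list size, and the number of distinct letters, so no timings are quoted here.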

score 0 · Accepted Answer

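Since your question is confusing, I have allowed for both possibilities: the function can keep either the maximum or the minimum element for each letter and drop the rest.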

>>> def foo(some_list, fn = max):
    # Use a plain dict: a defaultdict won't help much, because
    # we have to refer back to the value stored for an existing key.
    # The dictionary uses item[0][0] (the letter) as its key.
    foo_dict = dict()
    # Iterate through the list
    for e in some_list:
        # Check whether the key already exists
        if e[0][0] in foo_dict:
            # If it does, keep whichever of the stored element and the
            # new element fn selects, comparing on the third item, e[2]
            foo_dict[e[0][0]] = fn(foo_dict[e[0][0]], e, key = lambda e:e[2])
        else:
            # Otherwise the new element is the current best for this key
            foo_dict[e[0][0]] = e
    # On Python 3 this is a view object; wrap it in list() if a list is needed
    return foo_dict.values()

>>> foo(somelist)
[[['a'], [24], 3124, 1], [['c'], [24], 34, 1], [['b'], [24], 312, 1]]
>>> foo(somelist,min)
[[['a'], [24], 214, 1], [['c'], [24], 34, 1], [['b'], [24], 312, 1]]
