python - Remove items (sublists) from a nested list based on a comparison of their elements

Question

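This is my first post here, so I hope I'm not duplicating an existing question (I did check).

Here's the deal:

I have a list of 4-element sublists, e.g. [[10,1,3,6],[22,3,5,7],[2,1,4,7],[44,3,1,0]]

What I want to do is:

1) Remove all items whose fourth sub-element is zero, e.g. [44,3,1,0] (the easy part)

2) Remove items that share the same second element, keeping only the one with the largest first element, e.g. [[10,1,3,6],[2,1,4,7]] -> [10,1,3,6]

I have been trying nested loops with a second list that collects the elements I want to keep, but I can't seem to nail it.

Is there an elegant solution I could use?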

score 2 · Accepted Answer

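If listA is your original list and listB is your new list, part (2) can be solved by iterating over listA and checking whether the second element of the current item (a nested list) already appears in listB. If it does, compare the first elements to decide which nested list stays in listB. So, as a sketch: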

listA = [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7]]  # example input
listB = []

for item in listA:
    for j, kept in enumerate(listB):
        if item[1] == kept[1]:     # second element is a duplicate
            if item[0] > kept[0]:  # keep whichever has the bigger first element
                listB[j] = item
            break
    else:   # no match found: second element is unique so far, so append
        listB.append(item)

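This only covers part (2). As Burhan's comment says, I'm sure there is a more elegant way to do this, but I think this gets the job done. Also, the question doesn't say what should happen when the first elements are equal, so that needs to be considered as well.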
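
Part (1) is not covered by the loop above. A minimal sketch of that filtering step, assuming every sublist has at least four elements; the name `drop_zero_fourth` is made up for this illustration:

```python
def drop_zero_fourth(rows):
    """Return the sublists whose fourth element is not zero."""
    return [row for row in rows if row[3] != 0]


listA = [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7], [44, 3, 1, 0]]
filtered = drop_zero_fourth(listA)
print(filtered)
# [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7]]
```

The filtered list can then be fed into whichever part (2) approach you prefer.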

score 2 · Accepted Answer

You can use itertools.groupby:

from itertools import groupby
from operator import itemgetter as ig

data = [[10,1,3,6],[22,3,5,7],[2,1,4,7],[44,3,1,0]]

# filter out sublists whose fourth element is zero, and put the
# highest first element first
valid_sorted = sorted((el for el in data if el[3] != 0), key=ig(0), reverse=True)
# stable sort by the grouping key (second element); within each key the
# highest first element stays first
valid_sorted.sort(key=ig(1))
# group by second element
grouped = groupby(valid_sorted, ig(1))
# take the first sublist of each group
selected = [next(group) for key, group in grouped]
print(selected)
# [[10, 1, 3, 6], [22, 3, 5, 7]]

Or use a dict:

d = {}
for el in valid_sorted:  # doesn't need to be sorted - just excluding 4th == 0
    d[el[1]] = max(d.get(el[1], []), el)
print(list(d.values()))
# [[10, 1, 3, 6], [22, 3, 5, 7]] (the order of the values may differ)
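
Note that `max` on two lists compares them element by element, so when two sublists share both the second and the first element, the later elements decide which one wins. If ties should instead keep whichever sublist appeared first, something like the following could work. It is only a sketch: `keep_largest_first` is a hypothetical name, and the tie rule is an assumption, since the question does not specify one.

```python
def keep_largest_first(rows):
    """For each distinct second element, keep the row with the largest first element.

    Assumed tie rule: when first elements are equal, the earlier row is kept.
    """
    best = {}
    for row in rows:
        key = row[1]
        # Strict '>' means a later row replaces the stored one only if it is larger.
        if key not in best or row[0] > best[key][0]:
            best[key] = row
    return list(best.values())


data = [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7], [44, 3, 1, 0]]
valid = [row for row in data if row[3] != 0]
result = keep_largest_first(valid)
print(result)
# [[10, 1, 3, 6], [22, 3, 5, 7]]  (on Python 3.7+, where dicts keep insertion order)
```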

score 1 · Accepted Answer

Here is part 2:

from itertools import product

lis = [[10, 1, 3, 6], [22, 3, 5, 7], [2, 1, 4, 7]]
lis = set(map(tuple, lis))   # convert to a set of tuples (lists are not hashable)
removed = set()              # collects the items to be removed

for x, y in product(lis, repeat=2):
    if x != y:
        if x[1] == y[1]:
            # drop whichever has the smaller first element
            # (if the first elements tie, both end up in removed)
            removed.add(y if x[0] > y[0] else x)

print("removed-->", removed)

print(lis - removed)       # final answer

Output (the order of items within a set may vary):

removed--> {(2, 1, 4, 7)}
{(22, 3, 5, 7), (10, 1, 3, 6)}

score 1 · Accepted Answer

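If you don't care about the order of the final list, you can sort by the second item and use a generator to find the maximum of the first item: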

l = [[10,1,3,6],[22,3,5,7],[2,1,4,7],[44,3,1,0]]

remove_zeros_in_last = filter(lambda x: x[3] != 0, l)

ordered_by_2nd = sorted(remove_zeros_in_last, key=lambda x: x[1])

def group_equal_2nd_by_largest_first(ll):
    maxel = None
    for el in ll:
        if maxel is None:
            maxel = el  # Start accumulating maximum
        elif el[1] != maxel[1]:
            yield maxel
            maxel = el
        elif el[0] > maxel[0]:
            maxel = el  # New maximum
    if maxel is not None:
        yield maxel     # Don't forget the last item!

print(list(group_equal_2nd_by_largest_first(ordered_by_2nd)))

# gives [[10, 1, 3, 6], [22, 3, 5, 7]]
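
If the original order does matter, one option is to record the position of the winning sublist for each second element and then emit the survivors in input order. This is a sketch under that assumption; `dedupe_keep_order` is a hypothetical helper, not part of the answer above:

```python
def dedupe_keep_order(rows):
    """Drop rows with a zero fourth element, then keep one row per second element.

    The kept row is the one with the largest first element; survivors are
    returned in the order they appear in the input.
    """
    winner = {}  # second element -> index of the best row seen so far
    for i, row in enumerate(rows):
        if row[3] == 0:
            continue
        j = winner.get(row[1])
        if j is None or row[0] > rows[j][0]:
            winner[row[1]] = i
    return [rows[i] for i in sorted(winner.values())]


sample = [[2, 1, 4, 7], [22, 3, 5, 7], [10, 1, 3, 6], [44, 3, 1, 0]]
ordered = dedupe_keep_order(sample)
print(ordered)
# [[22, 3, 5, 7], [10, 1, 3, 6]]
```

This makes a single pass over the input plus a sort of the surviving indices, so no sorting of the sublists themselves is needed.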
