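I have a list of lists. Each sublist contains 2 items, and the second item is the same for n consecutive sublists.

For each run, I only want to keep the first sublist, because its spread (the difference between the two items) is the largest in the run. This is what I have: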

[[0, 3],
 [1, 3],
 [2, 3],
 [314, 335],
 [315, 335],
 [316, 335],
 [317, 335],
 [318, 335],
 [319, 335],
 [320, 335],
 [321, 335],
 [322, 335],
 [323, 335],
 [324, 335],
 [325, 335],
 [326, 335],
 [327, 335],
 [328, 335],
 [329, 335],
 [330, 335],
 [331, 335],
 [332, 335],
 [333, 335],
 [334, 335],
 [645, 647],
 [646, 647]]

I want to keep:

[[0, 3],
[314, 335],
[645, 647]]

Any ideas on how to do this?

4 Answers

Here is one approach.

Example:

seen = set()
result = []
for i in data:              # data is the list of lists from the question
    if i[1] not in seen:    # Second item has not been seen yet
        result.append(i)    # Keep this sublist (the first one with this second item)
        seen.add(i[1])      # Remember the second item

print(result) #--> [[0, 3], [314, 335], [645, 647]]
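
The loop above relies on the widest pair appearing first for each second item. If that ordering is not guaranteed, a variant can pick the pair with the largest spread explicitly. This is only a sketch under that assumption: the function name `keep_widest` and the shuffled sample are made up for illustration and are not part of the question.

```python
def keep_widest(pairs):
    """For each distinct second item, keep the pair with the largest spread."""
    best = {}  # second item -> widest pair seen so far (dicts keep insertion order)
    for pair in pairs:
        key = pair[1]
        if key not in best or pair[1] - pair[0] > best[key][1] - best[key][0]:
            best[key] = pair
    return list(best.values())


# Hypothetical input where the widest pair is not first in its run.
shuffled = [[2, 3], [0, 3], [315, 335], [314, 335], [646, 647], [645, 647]]
print(keep_widest(shuffled))  # [[0, 3], [314, 335], [645, 647]]
```

For input already ordered like the question's data, this gives the same result as the set-based loop.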
answered 2019-08-19T07:43:14.290

itertools.groupby can be used for this:

from itertools import groupby

ret = [[next(group)[0], key] for key, group in groupby(lst, key=lambda x: x[1])]
# [[0, 3], [314, 335], [645, 647]]

I used the second element of each sublist as the key.
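
One thing to be aware of: groupby only groups consecutive items with equal keys. That matches the data in the question, where equal second items are adjacent, but it would behave differently from a seen-set if a value reappeared later. A small sketch of the difference, using a made-up input that is not from the question:

```python
from itertools import groupby

# Hypothetical input where the value 3 reappears after a different run.
lst = [[0, 3], [1, 3], [314, 335], [5, 3]]

# groupby starts a new group whenever the key changes, so 3 yields two groups.
by_run = [next(group) for _, group in groupby(lst, key=lambda x: x[1])]
print(by_run)  # [[0, 3], [314, 335], [5, 3]]

# A seen-set keeps only the first sublist per distinct second item.
seen = set()
by_value = []
for item in lst:
    if item[1] not in seen:
        seen.add(item[1])
        by_value.append(item)
print(by_value)  # [[0, 3], [314, 335]]
```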

answered 2019-08-19T07:44:10.993

Another approach is to use a pandas DataFrame:

import pandas as pd

df = pd.DataFrame(your_data)

# Keep the first row for each distinct value in column 1 (the second item)
df2 = df.drop_duplicates(subset=1)

The resulting DataFrame can then be converted back to a list of lists, for example with df2.values.tolist().

answered 2019-08-19T07:56:10.687

The itertools documentation has a ready-made recipe for this task:

from itertools import filterfalse  # used by the key=None branch below
def unique_everseen(iterable, key=None):
    "List unique elements, preserving order. Remember all elements ever seen."
    # unique_everseen('AAAABBBCCDAABBB') --> A B C D
    # unique_everseen('ABBCcAD', str.lower) --> A B C D
    seen = set()
    seen_add = seen.add
    if key is None:
        for element in filterfalse(seen.__contains__, iterable):
            seen_add(element)
            yield element
    else:
        for element in iterable:
            k = key(element)
            if k not in seen:
                seen_add(k)
                yield element

data = [[0, 3],
 [1, 3],
 [2, 3],
 [314, 335],
 [315, 335],
 [316, 335],
 [317, 335],
 [318, 335],
 [319, 335],
 [320, 335],
 [321, 335],
 [322, 335],
 [323, 335],
 [324, 335],
 [325, 335],
 [326, 335],
 [327, 335],
 [328, 335],
 [329, 335],
 [330, 335],
 [331, 335],
 [332, 335],
 [333, 335],
 [334, 335],
 [645, 647],
 [646, 647]]
out = list(unique_everseen(data,key=lambda x:x[1]))
print(out)

Output:

[[0, 3], [314, 335], [645, 647]]
answered 2019-08-19T08:08:53.557