python - Python: Uniquefying a list with a twist

Question

Let's say I have a list:

L = [15,16,57,59,14]

The list contains measurements that are not very accurate: the real value of an element is within ±2 of the recorded value. So 14, 15 and 16 can have the same value. What I want to do is to uniquefy that list, taking the measurement errors into account. The output should therefore be:

l_out = [15,57]

or

l_out = [(14,15,16),(57,59)]

I have no problem producing either result with a for loop. However, I am curious whether there is a more elegant solution. Ideas much appreciated.

score 5 · Accepted Answer

As lazer pointed out in a comment, a similar question has already been posted here. Using the cluster module, the solution to my problem is:

>>> from cluster import *
>>> L = [15,16,57,59,14]
>>> cl = HierarchicalClustering(L, lambda x,y: abs(x-y))
>>> cl.getlevel(2)
[[14, 15, 16], [57, 59]]

Or (to get a unique list holding the mean value of each group):

>>> [mean(cluster) for cluster in cl.getlevel(2)]
[15, 58]

score 2 · Accepted Answer

Here is how I would do it in pure Python:

s = sorted(L)
b = [i + 1 for i, (x, y) in enumerate(zip(s, s[1:])) if y > x + 2]
result = [s[i:j] for i, j in zip([None] + b, b + [None])]

Here b is the list of "breaks", i.e. the indices at which one cluster ends and the next begins.
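The same "breaks" idea can be wrapped in a function with the intermediate steps commented. This is only a sketch: the name split_at_breaks and the tolerance parameter are made up for illustration and do not appear in the answer above.

```python
def split_at_breaks(values, tolerance=2):
    """Sort values and split them wherever neighbours differ by more than tolerance."""
    s = sorted(values)
    # Index i + 1 starts a new cluster when the gap between s[i] and s[i + 1] is too large.
    breaks = [i + 1 for i, (x, y) in enumerate(zip(s, s[1:])) if y - x > tolerance]
    # Pair each start index with the following break; None stands for
    # "from the beginning" on the left and "to the end" on the right.
    return [s[i:j] for i, j in zip([None] + breaks, breaks + [None])]


clusters = split_at_breaks([15, 16, 57, 59, 14])
print(clusters)  # [[14, 15, 16], [57, 59]]
```

Note that neighbours are chained: a value joins a cluster if it is within the tolerance of the previous sorted value, not necessarily of every value in the cluster.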

score 2 · Accepted Answer

If you want to stay within the Python standard library, itertools' groupby is your friend:

from itertools import groupby

L = [15,16,57,59,14]

# Stash state outside key function. (a little hacky).
# Better way would be to create stateful class with a __call__ key fn.
state = {'group': 0, 'prev': None}
thresh = 2

def _group(cur):
    """Group if within threshold."""
    if state["prev"] is not None and abs(state["prev"] - cur) > thresh:
        state["group"] += 1 # Advance group
    state["prev"] = cur
    return state["group"]

# Group, then drop the group key and inflate the final tuples.
l_out = [tuple(g) for _, g in groupby(sorted(L), key=_group)]

print(l_out)
# -> [(14, 15, 16), (57, 59)]
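The comment in the code above suggests a stateful class with a __call__ method as a cleaner key function. A minimal sketch of that idea might look like the following; the class name GapGrouper is hypothetical, and the code relies on groupby calling the key function once per element, in order.

```python
from itertools import groupby


class GapGrouper:
    """Key function for groupby: advances the group id when the gap exceeds thresh."""

    def __init__(self, thresh=2):
        self.thresh = thresh
        self.group = 0
        self.prev = None

    def __call__(self, cur):
        # Start a new group when the current value is too far from the previous one.
        if self.prev is not None and abs(self.prev - cur) > self.thresh:
            self.group += 1
        self.prev = cur
        return self.group


L = [15, 16, 57, 59, 14]
# A fresh GapGrouper is needed for every groupby call, since it keeps state.
l_out = [tuple(g) for _, g in groupby(sorted(L), key=GapGrouper(thresh=2))]
print(l_out)  # [(14, 15, 16), (57, 59)]
```

Compared with the module-level dict, the state lives inside the instance, so two groupings cannot interfere with each other.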

score -1 · Accepted Answer

A for loop is the simplest way, but if you really want a one-liner:
l_out = list(set(tuple([tuple(filter(lambda i: abs(item - i) < 3, L)) for item in L])))
It is quite hard to read, though; I prefer the for version :)
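For comparison, a for-loop version might look like the sketch below. The function name cluster_with_loop and the tolerance parameter are assumptions for illustration. It chains sorted neighbours that are at most 2 apart, which is one reading of the question and not exactly what the one-liner computes.

```python
def cluster_with_loop(values, tolerance=2):
    """Group sorted values, starting a new cluster when the gap exceeds tolerance."""
    clusters = []
    for value in sorted(values):
        # Extend the current cluster if the value is close to its last member.
        if clusters and value - clusters[-1][-1] <= tolerance:
            clusters[-1].append(value)
        else:
            clusters.append([value])
    return clusters


groups = cluster_with_loop([15, 16, 57, 59, 14])
print(groups)  # [[14, 15, 16], [57, 59]]

# One representative per cluster: the (lower) middle element.
representatives = [g[(len(g) - 1) // 2] for g in groups]
print(representatives)  # [15, 57]
```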
