python - Python random list comprehension

Question

I have a list like this:

[1 2 1 4 5 2 3 2 4 5 3 1 4 2]

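I want to create a list of x random elements from this list, where none of the chosen elements are the same. The hard part is that I would like to do this with a list comprehension... So if x = 3, possible results would be: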

[1 2 3]
[2 4 5]
[3 1 4]
[4 5 1]

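etc...

Thanks!

I should have specified that I can't convert the list to a set. Sorry! I need the randomly selected numbers to be weighted. So if 1 appears 4 times in the list and 3 appears 2 times, then 1 is twice as likely to be picked...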

score 11 · Accepted Answer

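Disclaimer: the "use a list comprehension" requirement is absurd.

Moreover, if you want to use weights, there are many excellent approaches listed on Eli Bendersky's page on weighted random sampling.

The following is inefficient, doesn't scale, and so on.

That said, it has not one but two (two!) list comprehensions, returns a list, never duplicates elements, and respects the weights in a sense: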

>>> import itertools, random
>>> s = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]
>>> [x for x in random.choice([p for c in itertools.combinations(s, 3) for p in itertools.permutations(c) if len(set(c)) == 3])]
[3, 1, 2]
>>> [x for x in random.choice([p for c in itertools.combinations(s, 3) for p in itertools.permutations(c) if len(set(c)) == 3])]
[5, 3, 4]
>>> [x for x in random.choice([p for c in itertools.combinations(s, 3) for p in itertools.permutations(c) if len(set(c)) == 3])]
[1, 5, 2]

.. or, as simplified by FMc:

>>> [x for x in random.choice([p for p in itertools.permutations(s, 3) if len(set(p)) == 3])]
[3, 5, 2]

(I'll leave the x for x in there, even though it hurts not to simply write list(random.choice(..)) or just leave it as a tuple...)
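
For comparison, here is a minimal sketch of the same idea without the redundant outer comprehension, written for Python 3. The function name pick_distinct is made up for illustration. Because itertools.permutations treats every position in s separately, values that occur more often appear in more of the candidate tuples, which is the sense in which the weights are respected. It is still a brute-force enumeration and will not scale.

```python
import itertools
import random

def pick_distinct(s, k):
    """Return k distinct values from s, chosen by brute-force enumeration."""
    # Every ordered k-tuple of positions in s; duplicated values yield more tuples.
    candidates = [p for p in itertools.permutations(s, k) if len(set(p)) == k]
    # random.choice raises IndexError if there are fewer than k distinct values.
    return list(random.choice(candidates))

s = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]
print(pick_distinct(s, 3))  # three distinct values taken from s
```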

score 5 · Accepted Answer

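In general, you don't want to do this sort of thing in a list comprehension, because it leads to code that is much harder to read. However, if you really must, we can write a completely horrible one-liner: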

>>> import random
>>> values = [random.randint(0,10) for _ in xrange(12)]
>>> values
[1, 10, 6, 6, 3, 9, 0, 1, 8, 9, 1, 2]
>>> # This is the 1 liner -- The other line was just getting us a list to work with.
>>> [(lambda x=random.sample(values,3):any(values.remove(z) for z in x) or x)() for _ in xrange(4)]
[[6, 1, 8], [1, 6, 10], [1, 0, 2], [9, 3, 9]]

Please never use this code. I'm posting it only for fun/academic reasons.

Here's how it works:

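I create a function inside the list comprehension with a default argument of 3 randomly selected elements from the input list. Inside the function, I remove those elements from values so that they can't be picked again. Since list.remove returns None, I can use any(lst.remove(x) for x in ...) to remove the values and return False. Since any returns False, we hit the or clause, which just returns x (the default value holding the 3 randomly selected items). All that's left is to call the function and let the magic happen.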
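
The any(...) or x trick described above can be seen in isolation in the short sketch below. The fixed values and the names picked, removed_flag, and result are chosen only for illustration, so the output is deterministic.

```python
values = [6, 1, 8, 3]
picked = [6, 8]

# list.remove mutates the list and returns None, so any() sees only falsy
# values and evaluates to False once every picked element has been removed.
removed_flag = any(values.remove(z) for z in picked)
print(removed_flag)  # False
print(values)        # [1, 3]

# "False or picked" therefore evaluates to picked itself.
result = removed_flag or picked
print(result)        # [6, 8]
```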

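One catch here is that the number of groups you request (here I chose 4) multiplied by the number of items per group (here I chose 3) must be less than or equal to the number of values in your input list. It may seem obvious, but it's probably worth mentioning anyway...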

Here's another version where I pull shuffle into the list comprehension:

>>> from random import shuffle
>>> lst = [random.randint(0,10) for _ in xrange(12)]
>>> lst
[3, 5, 10, 9, 10, 1, 6, 10, 4, 3, 6, 5]
>>> [lst[i*3:i*3+3] for i in xrange(shuffle(lst) or 4)]
[[6, 10, 6], [3, 4, 10], [1, 3, 5], [9, 10, 5]]

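This is significantly better than my first attempt. However, most people would still need to stop and scratch their heads before figuring out what this code is doing. I still assert that it would be much better to do this in multiple lines.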
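
A multi-line version of the shuffle-and-slice idea might look like the following sketch, written for Python 3. The name random_groups and the explicit size check are assumptions added for illustration. Unlike the one-liner, it shuffles a copy rather than the caller's list. Like the one-liner, it does not guarantee that the values within a group are distinct.

```python
import random

def random_groups(lst, group_size, num_groups):
    """Shuffle a copy of lst and cut it into num_groups chunks of group_size."""
    if group_size * num_groups > len(lst):
        raise ValueError("not enough values for the requested groups")
    shuffled = list(lst)      # copy so the caller's list is left untouched
    random.shuffle(shuffled)  # shuffles in place and returns None
    return [shuffled[i * group_size:(i + 1) * group_size]
            for i in range(num_groups)]

lst = [3, 5, 10, 9, 10, 1, 6, 10, 4, 3, 6, 5]
print(random_groups(lst, 3, 4))
```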

score 2 · Accepted Answer

If I understand your question correctly, this should work:

import random

def weighted_sample(L, x):
    # might consider raising some kind of exception if len(set(L)) < x

    while True:
        ans = random.sample(L, x)
        if len(set(ans)) == x:
            return ans

Then, if you want many such samples, you can do something like:

[weighted_sample(L, x) for _ in range(num_samples)]

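I have a hard time imagining a comprehension for the sampling logic that isn't just obfuscated. The logic is a bit too complicated. It sounds like something arbitrarily tacked on to a homework assignment.

If you don't like the infinite loop, I haven't tried this, but I think it will work: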

import collections
import random

def weighted_sample(L, x):

    ans = []
    c = collections.Counter(L)

    while len(ans) < x:
        # randrange excludes the upper bound, so 0 <= r < total count
        r = random.randrange(sum(c.values()))
        for k in c:
            if r < c[k]:
                ans.append(k)
                del c[k]
                break
            else:
                r -= c[k]
        else:
            # this should never happen on valid input
            raise ValueError("no value could be selected")

    return ans

score 0 · Accepted Answer

def sample(self, population, k):
    n = len(population)
    if not 0 <= k <= n:
        raise ValueError("sample larger than population")
    result = [None] * k
    try:
        selected = set()
        selected_add = selected.add
        for i in xrange(k):
            j = int(random.random() * n)
            while j in selected:
                j = int(random.random() * n)
            selected_add(j)
            result[i] = population[j]
    except (TypeError, KeyError):   # handle (at least) sets
        if isinstance(population, list):
            raise
        return self.sample(tuple(population), k)
    return result

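The above is a simplified version of the sample function from Lib/random.py. I only removed some optimization code for small data sets. The code shows directly how to implement a customized sample function:

Get a random number.
If that number has come up before, discard it and get a new one.
Repeat the steps above until you have all the samples you need.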

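The real problem then becomes how to get a random value from a list by weight. That could be done with the original random.sample(population, 1) from the Python standard library (a bit of overkill here, but simple).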
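
As a rough illustration of why a uniform pick over positions is already weighted, the sketch below (Python 3) uses random.choice, which returns one element picked uniformly by position, and tallies many draws. The draw count of 100000 is an arbitrary assumption, and the printed frequencies vary from run to run.

```python
import random
from collections import Counter

arr = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]

# 2 occupies 4 of the 14 positions and 3 occupies 2 of them, so a uniform
# pick over positions should return 2 roughly twice as often as 3.
draws = Counter(random.choice(arr) for _ in range(100000))
for value in sorted(draws):
    print(value, draws[value] / 100000.0)
```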

Below is an implementation. Because the duplicates represent the weights in the given list, we can use int(random.random() * array_length) to get a random index into the array.

import random
arr = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]

def sample_by_weight(population, k):
    n = len(population)
    if not 0 <= k <= len(set(population)):
        raise ValueError("sample larger than population")
    result = [None] * k
    try:
        selected = set()
        selected_add = selected.add
        for i in xrange(k):
            j = population[int(random.random() * n)]
            while j in selected:
                j = population[int(random.random() * n)]
            selected_add(j)
            result[i] = j
    except (TypeError, KeyError):   # handle (at least) sets
        if isinstance(population, list):
            raise
        return sample_by_weight(tuple(population), k)
    return result

[sample_by_weight(arr,3) for i in range(10)]

score 0 · Accepted Answer

First of all, I assume your list looks like

[1,2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]

So, if you want to print the permutations of size 3 from the given list, you can do the following.

import itertools

l = [1,2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]

for permutation in itertools.permutations(list(set(l)),3):
    print permutation,

Output:

(1, 2, 3) (1, 2, 4) (1, 2, 5) (1, 3, 2) (1, 3, 4) (1, 3, 5) (1, 4, 2) (1, 4, 3) (1, 4, 5) (1, 5, 2) (1, 5, 3) (1, 5, 4) (2, 1, 3) (2, 1, 4) (2, 1, 5) (2, 3, 1) (2, 3, 4) (2, 3, 5) (2, 4, 1) (2, 4, 3) (2, 4, 5) (2, 5, 1) (2, 5, 3) (2, 5, 4) (3, 1, 2) (3, 1, 4) (3, 1, 5) (3, 2, 1) (3, 2, 4) (3, 2, 5) (3, 4, 1) (3, 4, 2) (3, 4, 5) (3, 5, 1) (3, 5, 2) (3, 5, 4) (4, 1, 2) (4, 1, 3) (4, 1, 5) (4, 2, 1) (4, 2, 3) (4, 2, 5) (4, 3, 1) (4, 3, 2) (4, 3, 5) (4, 5, 1) (4, 5, 2) (4, 5, 3) (5, 1, 2) (5, 1, 3) (5, 1, 4) (5, 2, 1) (5, 2, 3) (5, 2, 4) (5, 3, 1) (5, 3, 2) (5, 3, 4) (5, 4, 1) (5, 4, 2) (5, 4, 3)

Hope this helps. :)

score 0 · Accepted Answer

>>> from random import shuffle
>>> L = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]
>>> x=3
>>> shuffle(L)
>>> zip(*[L[i::x] for i in range(x)])
[(1, 3, 2), (2, 2, 1), (4, 5, 3), (1, 4, 4)]

You can also use a generator expression instead of the list comprehension:

>>> zip(*(L[i::x] for i in range(x)))
[(1, 3, 2), (2, 2, 1), (4, 5, 3), (1, 4, 4)]
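
To see what the slicing and zip(*...) are doing, here is a small Python 3 sketch on an unshuffled 12-element list, so the output is reproducible. The intermediate name columns is introduced only for illustration. Each L[i::x] takes every x-th element starting at offset i, and zip then regroups those strided slices into consecutive chunks of x. Note that nothing here enforces distinct values within a group, as the (2, 2, 1) tuple in the output above shows.

```python
L = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1]  # left unshuffled on purpose
x = 3

columns = [L[i::x] for i in range(x)]
print(columns)
# [[1, 4, 3, 5], [2, 5, 2, 3], [1, 2, 4, 1]]

groups = list(zip(*columns))  # zip is lazy in Python 3, hence list()
print(groups)
# [(1, 2, 1), (4, 5, 2), (3, 2, 4), (5, 3, 1)]
```

When len(L) is not a multiple of x, zip stops at the shortest slice, so the leftover elements are dropped.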

score 0 · Accepted Answer

Starting with an approach that uses no list comprehensions:

import random
import itertools


alphabet = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]


def alphas():
    while True:
        yield random.choice(alphabet)


def filter_unique(iter):
    found = set()
    for a in iter:
        if a not in found:
            found.add(a)
            yield a


def dice(x):
    while True:
        yield itertools.islice(
            filter_unique(alphas()),
            x
        )

for i, output in enumerate(dice(3)):
    print list(output)
    if i > 10:
        break

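The part that is problematic for a list comprehension is filter_unique(), because a list comprehension has no "memory" of what it has already output. A possible solution is to keep generating outputs until one of good quality is found, as @DSM suggested.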
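
One way to sketch that "generate until a good one turns up" idea in Python 3 is shown below. The name sample_until_unique and the up-front check are assumptions for illustration, not something taken from the other answer. A generator expression filters an endless stream of candidate samples and next() takes the first one whose values are all distinct.

```python
import random

def sample_until_unique(population, k):
    """Draw k-element samples until one has k distinct values."""
    if len(set(population)) < k:
        raise ValueError("not enough distinct values")
    # iter(callable, sentinel) keeps calling random.sample; the sentinel None
    # is never returned, so the stream is endless until next() finds a match.
    attempts = iter(lambda: random.sample(population, k), None)
    return next(s for s in attempts if len(set(s)) == k)

alphabet = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]
print(sample_until_unique(alphabet, 3))
```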

score 0 · Accepted Answer

The slow, naive approach is:

import random
def pick_n_unique(l, n):
    res = set()
    while len(res) < n:
        res.add(random.choice(l))
    return list(res)

This picks elements and exits only once it has n unique elements:

>>> pick_n_unique([1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2], 3)
[2, 3, 4]
>>> pick_n_unique([1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2], 3)
[3, 4, 5]

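However, it can get slow if, for example, you have a list with thirty 1s and a single 2, because once it has a 1 it keeps spinning until it finally hits a 2. A better approach is to count the occurrences of each unique element, choose a random element weighted by its occurrence count, remove that element from the counts, and repeat until you have the desired number of elements: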

import collections

def weighted_choice(item__counts):
    total_counts = sum(count for item, count in item__counts.items())
    which_count = random.random() * total_counts
    for item, count in item__counts.items():
        which_count -= count
        if which_count < 0:
            return item
    raise ValueError("Should never get here")

def pick_n_unique(items, n):
    item__counts = collections.Counter(items)
    if len(item__counts) < n:
        raise ValueError(
            "Can't pick %d values with only %d unique values" % (
                n, len(item__counts)))

    res = []
    for i in xrange(n):
        choice = weighted_choice(item__counts)
        res.append(choice)
        del item__counts[choice]
    return tuple(res)

Either way, this is a problem that is not well suited to list comprehensions.
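
On Python 3.6 or later, the weighted pick could also lean on random.choices, which accepts a weights argument. The sketch below is a variant under that assumption rather than the code above. The name pick_n_unique_py3 is made up for illustration.

```python
import collections
import random

def pick_n_unique_py3(items, n):
    """Weighted pick of n distinct values; weights are occurrence counts."""
    counts = collections.Counter(items)
    if len(counts) < n:
        raise ValueError(
            "only %d unique values, cannot pick %d" % (len(counts), n))
    picked = []
    for _ in range(n):
        values = list(counts)
        weights = [counts[v] for v in values]
        # random.choices returns a list of k items; k defaults to 1.
        choice = random.choices(values, weights=weights)[0]
        picked.append(choice)
        del counts[choice]  # removed so it cannot be picked again
    return picked

print(pick_n_unique_py3([1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2], 3))
```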

score 0 · Accepted Answer

With this setup:

from random import shuffle
from collections import deque

l = [1, 2, 1, 4, 5, 2, 3, 2, 4, 5, 3, 1, 4, 2]

and this code:

def getSubLists(l,n):
    shuffle(l) #shuffle l so the elements are in 'random' order
    l = deque(l,len(l)) #create a structure with O(1) insert/pop at both ends
    while l: #while there are still elements to choose
        sample = set() #use a set O(1) to check for duplicates
        while len(sample) < n and l: #until the sample is n long or l is exhausted
            top = l.pop() #get the top value in l
            if top in sample: 
                l.appendleft(top) #add it to the back of l for a later sample
            else:
                sample.add(top) #it isn't in sample already so use it
        yield sample #yield the sample

you end up with:

for s in getSubLists(l,3):
    print s
>>> 
set([1, 2, 5])
set([1, 2, 3])
set([2, 4, 5])
set([2, 3, 4])
set([1, 4])
