python - Resolving a ValueError when using random choice on a list with restrictions

Question

Given a random list and a list of restricted values:

>>> import numpy as np
>>> x = np.random.randint(0, 50, 10)
>>> x
array([27, 14, 42,  1,  9, 43, 16, 39, 27,  3])
>>> y = [1,2,5, 19, 27]
>>> n = 5

I want to sample N values from X, without replacement, excluding any value that appears in Y. I can do it like this:

>>> np.random.choice(list(set(x).difference(y)), n, replace=False)
array([39,  9, 43, 14, 16])

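N is user input and is guaranteed to be smaller than len(x). However, I don't know in advance whether N is larger than the subset of X that excludes Y, so I can run into this case, which raises a ValueError: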

>>> np.random.choice(list(set(x).difference(y)), 8, replace=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 1150, in mtrand.RandomState.choice (numpy/random/mtrand/mtrand.c:18113)
ValueError: Cannot take a larger sample than population when 'replace=False'

So I have to cap N at the size of the allowed subset, for example:

>>> n = 8
>>> _n = min(n, len(set(x).difference(y)))
>>> _n
7

But then the sample size is no longer the 8 that the user asked for:

>>> np.random.choice(list(set(x).difference(y)), _n, replace=False)
array([14, 39, 43,  3, 42,  9, 16])

So I have to pad the output afterwards:

>>> list(np.random.choice(list(set(x).difference(y)), _n, replace=False)) + [-1]*(n-_n)
[43, 42, 9, 16, 3, 39, 14, -1]

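In summary, I need to sample N elements without replacement from the subset of X values that are not in Y, and if that subset has fewer than N elements, I need to fill the gaps with -1.

I can do this with the code above, but is there a more concise (and hopefully more efficient) way to get the same output?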

score 1 · Accepted Answer

I would probably use np.in1d to compute the difference and then np.append for the post-processing:

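import numpy as np

x = np.random.randint(0, 50, 10)
y = [1, 2, 5, 19, 27]
n = 12

# Keep only the values of x that are not in y (np.isin is the newer name for np.in1d)
x_y = x[~np.in1d(x, y)]
# Sample at most n values, then pad with -1 up to length n
k = min(n, len(x_y))
arr = np.append(np.random.choice(x_y, k, replace=False), [-1] * (n - k))
print(arr)
# example output (values vary per run):
# [35 46 39 21  9 37 17 23  8 -1 -1 -1]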

If n is smaller than the length of the difference, nothing is appended.
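
For reuse, the same idea can be wrapped in a small helper. The following is only a sketch, not part of the original answer: the function name `sample_excluding` and its `fill` and `rng` parameters are made up for illustration, and it swaps in np.setdiff1d (which also removes duplicates, like the set-based code in the question) together with the newer np.random.default_rng generator.

```python
import numpy as np

def sample_excluding(x, y, n, fill=-1, rng=None):
    """Return n values: a sample without replacement from the unique
    values of x that are not in y, padded with `fill` if too few exist.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng() if rng is None else rng
    # setdiff1d returns the sorted unique values of x that are not in y
    allowed = np.setdiff1d(x, y)
    # Never ask for more values than the allowed subset contains
    k = min(n, allowed.size)
    # Pre-fill the result with the padding value, then overwrite the first k slots
    out = np.full(n, fill, dtype=allowed.dtype)
    out[:k] = rng.choice(allowed, size=k, replace=False)
    return out

x = np.array([27, 14, 42, 1, 9, 43, 16, 39, 27, 3])
y = [1, 2, 5, 19, 27]
result = sample_excluding(x, y, 8, rng=np.random.default_rng(0))
print(result)  # the seven allowed values in random order, followed by one -1
```

Pre-allocating with np.full keeps the integer dtype even when no padding is needed, whereas appending an empty list with np.append would upcast the result to float.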
