python - scipy.stats seed?

Question

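I am trying to generate scipy.stats.pareto.rvs(b, loc=0, scale=1, size=1) with different seeds.

In numpy we can seed using numpy.random.seed(seed=233423).

Is there any way to seed the random numbers generated by scipy stats?

Note: I am not using numpy pareto because I want to give different scale values.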

score 53 · Accepted Answer

scipy.stats just uses numpy.random to generate its random numbers, so numpy.random.seed() will work here as well. For example,

import numpy as np
from scipy.stats import pareto
b = 0.9
np.random.seed(seed=233423)
print(pareto.rvs(b, loc=0, scale=1, size=5))
np.random.seed(seed=233423)
print(pareto.rvs(b, loc=0, scale=1, size=5))

will print [ 9.7758784 10.78405752 4.19704602 1.19256849 1.02750628] twice.
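
As a minimal sketch of the same idea applied to the question's use case, the snippet below reseeds the global NumPy state before each call and passes a non-default scale. The scale value 2.5 is an arbitrary illustration, not something taken from the original post.

```python
import numpy as np
from scipy.stats import pareto

b = 0.9
scale = 2.5  # hypothetical scale value, chosen only for illustration

# Reseed the global NumPy state before each draw so both calls
# start from the same point in the random stream.
np.random.seed(233423)
first = pareto.rvs(b, loc=0, scale=scale, size=5)
np.random.seed(233423)
second = pareto.rvs(b, loc=0, scale=scale, size=5)

print(np.array_equal(first, second))  # the two draws should be identical
print(first.min() >= scale)           # with loc=0, samples lie at or above scale
```

Note that this relies on global state, so any other code that draws from numpy.random between the seed call and the rvs call would change the result.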

score 21 · Accepted Answer

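For those who stumble upon this question 7 years later: the NumPy random number generation API has changed substantially. According to the NumPy documentation, the RandomState class has been superseded by the Generator class. RandomState is guaranteed to stay compatible with older versions and code, but it will not receive any substantive changes, including algorithmic improvements, which are reserved for Generator.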

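To illustrate how to pass an existing Numpy-based random stream to Scipy functions within the same experiment, some examples are given below, with reasoning about which cases are desirable and why.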

from numpy.random import Generator, PCG64
from scipy.stats import binom

n, p, size, seed = 10, 0.5, 10, 12345

# Case 1 : Scipy uses some default Random Generator
numpy_randomGen = Generator(PCG64(seed))
scipy_randomGen = binom
print(scipy_randomGen.rvs(n, p, size))
print(numpy_randomGen.binomial(n, p, size))
# prints
# [6 6 5 4 6 6 8 6 6 4]
# [4 4 6 6 5 4 5 4 6 7]
# NOT DESIRABLE as we don't have control over the seed of Scipy random number generation


# Case 2 : Scipy uses same seed and Random generator (new object though)
scipy_randomGen.random_state=Generator(PCG64(seed))
numpy_randomGen = Generator(PCG64(seed))
print(scipy_randomGen.rvs(n, p, size))
print(numpy_randomGen.binomial(n, p, size))
# prints
# [4 4 6 6 5 4 5 4 6 7]
# [4 4 6 6 5 4 5 4 6 7]
# This experiment uses the same sequence of random numbers twice, once in Scipy
# and once in Numpy. NOT DESIRABLE as we don't want repetition of a random
# stream within the same experiment.


# Case 3 (IMP) : Scipy uses an existing Random Generator, which is passed to the Scipy based
# random generator object
numpy_randomGen = Generator(PCG64(seed))
scipy_randomGen.random_state=numpy_randomGen
print(scipy_randomGen.rvs(n, p, size))
print(numpy_randomGen.binomial(n, p, size))
# prints
# [4 4 6 6 5 4 5 4 6 7]
# [4 8 6 3 5 7 6 4 6 4]
# This should be the case which we mostly want (DESIRABLE). If we are using both Numpy based and
# Scipy based random number generators/functions, then not only do we have no repetition of
# random number sequences but also have reproducibility of results in this case.
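
A variation on Case 3, offered as a sketch rather than as this answer's own method: instead of assigning to the random_state attribute of the shared binom object, the generator can be passed per call through the random_state keyword of rvs. This assumes a SciPy version recent enough to accept a numpy.random.Generator there; the helper name run_experiment is hypothetical.

```python
from numpy.random import Generator, PCG64
from scipy.stats import binom

def run_experiment(seed, n=10, p=0.5, size=10):
    """Draw from SciPy and NumPy using one shared Generator."""
    rng = Generator(PCG64(seed))
    # SciPy consumes numbers from rng, advancing its state ...
    scipy_draws = binom.rvs(n, p, size=size, random_state=rng)
    # ... so NumPy continues from where SciPy stopped.
    numpy_draws = rng.binomial(n, p, size)
    return scipy_draws, numpy_draws

a1, b1 = run_experiment(12345)
a2, b2 = run_experiment(12345)
print((a1 == a2).all() and (b1 == b2).all())  # same seed, same results
```

Passing the generator per call keeps the stream explicit and avoids mutating the module-level binom object, which other code may also be using.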

score 20 · Accepted Answer

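For those who come across this post four years later, Scipy does provide a way to pass an np.random.RandomState object to its random variable classes; see rv_continuous and rv_discrete for more details. The scipy documentation says this:

seed : None or int or numpy.random.RandomState instance, optional

This parameter defines the RandomState object to use for drawing random variates. If None (or np.random), the global np.random state is used. If integer, it is used to seed the local RandomState instance. Default is None.

Unfortunately, it appears that this argument is not available in the continuous/discrete distributions that subclass rv_continuous or rv_discrete. However, the random_state attribute does belong to the subclass, which means we can set the seed using an instance of np.random.RandomState after instantiation, like so: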

import numpy as np
import scipy.stats as stats

alpha_rv = stats.alpha(3.57)
alpha_rv.random_state = np.random.RandomState(seed=342423)
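
To check that the assignment above has the intended effect, here is a short sketch: resetting random_state to a freshly seeded RandomState before each draw should reproduce the same variates. The sample size of 5 is arbitrary.

```python
import numpy as np
import scipy.stats as stats

alpha_rv = stats.alpha(3.57)

# Seed the frozen distribution, then draw.
alpha_rv.random_state = np.random.RandomState(seed=342423)
first = alpha_rv.rvs(size=5)

# Re-assigning an identically seeded state restarts the stream.
alpha_rv.random_state = np.random.RandomState(seed=342423)
second = alpha_rv.rvs(size=5)

print(np.array_equal(first, second))
```

Without the second assignment, the next call to rvs would continue from wherever the stream had advanced to, so the two draws would generally differ.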

score 7 · Accepted Answer

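In addition to the answer by user5915738, which I think is the best answer in general, I would like to point out the most convenient way to seed the random generator of a scipy.stats distribution.

You can set the seed when generating variates with the rvs method, either by giving the seed as an integer, which is used to seed np.random.RandomState internally: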

uni_int_seed = scipy.stats.uniform(-.1, 1.).rvs(10, random_state=12)

or by defining the np.random.RandomState directly:

uni_state_seed = scipy.stats.uniform(-.1, 1.).rvs(
    10, random_state=np.random.RandomState(seed=12))

Both methods are equivalent:

np.all(uni_int_seed == uni_state_seed)
# Out: True

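The advantage of this method over assigning it to the random_state of rv_continuous or rv_discrete is that you always have explicit control over the random state of your rvs, whereas with my_dist.random_state = np.random.RandomState(seed=342423) the seed is lost after each call to rvs, which may lead to non-reproducible results when you lose track of your distributions. Also, according to The Zen of Python:

Explicit is better than implicit.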

:)
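
Applied to the original question, the per-call approach might look like the sketch below. The scale values and the seed are arbitrary, the helper name draw_pareto is hypothetical, and sharing one RandomState across calls is an illustrative choice, not something this answer prescribes.

```python
import numpy as np
from scipy.stats import pareto

def draw_pareto(b, scales, seed):
    """Draw one Pareto variate per scale from a single seeded stream."""
    state = np.random.RandomState(seed)
    # Each call advances the same explicit state, so the whole
    # sequence is reproducible from the seed alone.
    return [pareto.rvs(b, loc=0, scale=s, size=1, random_state=state)[0]
            for s in scales]

scales = [1.0, 2.0, 5.0]  # hypothetical scale values
run_a = draw_pareto(0.9, scales, seed=12)
run_b = draw_pareto(0.9, scales, seed=12)
print(run_a == run_b)  # same seed gives the same draws
```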
