python - KS test for non-standard distribution variables?

Question

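Can you use kstest from scipy.stats with non-standard distribution functions (i.e. varying the DOF of Student's t, or the gamma of the Cauchy)? My end goal is to find the maximum p-value and the corresponding parameters for my distribution fit, but that is not the issue here.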

Edit:

"

The cauchy pdf in scipy.stats is:

cauchy.pdf(x) = 1 / (pi * (1 + x**2))

This implies a location parameter x_0 = 0 and a gamma of Y = 1. What I actually need is for it to look like this:

cauchy.pdf(x, x_0, Y) = Y**2 / [(Y * pi) * ((x - x_0)**2 + Y**2)]

"

Q1) Can Student's t at least be used in some way, like this?

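import numpy as np
import scipy.stats

stuff = []
for dof in range(1, 100):  # dof must be > 0; range replaces Python 2's xrange
    d, p = scipy.stats.kstest(data, "t", args=(dof,))  # kstest returns (statistic, p-value)
    stuff.append(np.hstack((d, p, dof)))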

Because it seems to offer the option of varying the parameter?

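Q2) If you needed the full normal distribution equation (with sigma varying) and the Cauchy as written above (with gamma varying), how would you do it? Edit: rather than searching scipy.stats for non-standard distributions, is it actually possible to pass a function I wrote myself into kstest to get the p-value?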

Thanks

score 1 · Accepted Answer

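It looks like what you really want to do is parameter estimation. Using the KS test this way is not really what it is meant for. You should use the .fit method of the corresponding distribution.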

>>> import numpy as np, scipy.stats as stats
>>> arr = stats.norm.rvs(loc=10, scale=3, size=10) # generate 10 random samples from a normal distribution
>>> arr
array([ 11.54239861,  15.76348509,  12.65427353,  13.32551871,
        10.5756376 ,   7.98128118,  14.39058752,  15.08548683,
         9.21976924,  13.1020294 ])
>>> stats.norm.fit(arr)
(12.364046769964004, 2.3998164726918607)
>>> stats.cauchy.fit(arr)
(12.921113834451496, 1.5012714431045815)
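The same call should cover the Student's t case from Q1, since the degrees of freedom are a shape parameter that .fit estimates alongside loc and scale. The following is only a sketch on synthetic data; the sample, the seed, and the "true" parameter values are made up for illustration.

```python
import numpy as np
import scipy.stats as stats

# Synthetic sample (assumed values): Student's t with df=5, shifted and scaled
rng = np.random.default_rng(0)
sample = stats.t.rvs(df=5, loc=2.0, scale=1.5, size=2000, random_state=rng)

# For distributions with a shape parameter, fit returns (shape, loc, scale)
df_hat, loc_hat, scale_hat = stats.t.fit(sample)
print(df_hat, loc_hat, scale_hat)
```

The estimates depend on the sample and on the optimizer's starting point, so no specific output is claimed here.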

Now a quick look at the documentation:

>>> help(stats.cauchy.fit)

Help on method fit in module scipy.stats._distn_infrastructure:

fit(data, *args, **kwds) method of scipy.stats._continuous_distns.cauchy_gen instance
    Return MLEs for shape, location, and scale parameters from data.

    MLE stands for Maximum Likelihood Estimate.  Starting estimates for
    the fit are given by input arguments; for any arguments not provided
    with starting estimates, ``self._fitstart(data)`` is called to generate
    such.

    One can hold some parameters fixed to specific values by passing in
    keyword arguments ``f0``, ``f1``, ..., ``fn`` (for shape parameters)
    and ``floc`` and ``fscale`` (for location and scale parameters,
    respectively).

...

    Returns
    -------
    shape, loc, scale : tuple of floats
        MLEs for any shape statistics, followed by those for location and
        scale.

    Notes
    -----
    This fit is computed by maximizing a log-likelihood function, with
    penalty applied for samples outside of range of the distribution. The
    returned answer is not guaranteed to be the globally optimal MLE, it
    may only be locally optimal, or the optimization may fail altogether.

So, if you want to hold one of the parameters fixed, you can easily do that:

>>> stats.cauchy.fit(arr, floc=10)
(10, 2.4905786982353786)
>>> stats.norm.fit(arr, floc=10)
(10, 3.3686549590571668)
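As for Q2: the Cauchy formula in the question appears to be what scipy already computes when loc and scale are supplied, and kstest accepts either a distribution name with args or a callable CDF. The sketch below assumes synthetic data and a hand-written helper, cauchy_cdf, which is not part of scipy; the parameter values 3.0 and 2.0 are arbitrary.

```python
import numpy as np
import scipy.stats as stats

x0, gamma = 3.0, 2.0  # assumed location and scale, chosen for illustration

# The pdf from the question, compared against scipy's loc/scale version
x = np.linspace(-10.0, 10.0, 5)
pdf_formula = gamma**2 / ((gamma * np.pi) * ((x - x0)**2 + gamma**2))
pdf_scipy = stats.cauchy.pdf(x, loc=x0, scale=gamma)
print(np.allclose(pdf_formula, pdf_scipy))

def cauchy_cdf(x, x0, gamma):
    """Hand-written Cauchy CDF (hypothetical helper, not a scipy function)."""
    return 0.5 + np.arctan((np.asarray(x) - x0) / gamma) / np.pi

rng = np.random.default_rng(42)
data = stats.cauchy.rvs(loc=x0, scale=gamma, size=500, random_state=rng)

# Named distribution: args is forwarded to cauchy.cdf as (loc, scale)
d_named, p_named = stats.kstest(data, "cauchy", args=(x0, gamma))

# User-written CDF: kstest evaluates cauchy_cdf(data, x0, gamma)
d_custom, p_custom = stats.kstest(data, cauchy_cdf, args=(x0, gamma))
print(d_named, d_custom)
```

One caveat: if the parameters passed to kstest were estimated from the same data (for example with .fit), the reported p-value tends to be too optimistic, so it is better treated as a rough diagnostic than as a formal test result.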
