python - How do I optimize the parameters of a binomial log-likelihood in Python/SciPy?

Question

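I am converting some R code (not mine) that estimates the parameters of a choice model into Python. For some test data, my Python version does not converge to the same parameters as the R version, and I don't know why.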

The R code defines a (negative) log-likelihood function (L) and then uses the nlm() function to estimate the parameters:

L <- function(p, y1, m, i1, i0) 
     -sum(dbinom(y1, m, 1/(1 + i0 %*% p/i1 %*% p), log=TRUE)) 

out <- nlm(L, s, y1=y1, m=n, i1=idx1, i0=idx0)

For one set of test data, this produces the parameter estimates:

[1] 0.014302792 0.001703516 0.002347832 0.035365775 0.517465153 0.063503823 0.005776879

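In Python, I wrote what I believe is an equivalent log-likelihood function (it returns the same value as the R version for test parameters) and tried to use scipy.optimize.minimize() in place of nlm():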

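import numpy as np
import scipy.optimize
from scipy import stats

def LL(p, *args):
    # args is (y1, m, i1, i0), as passed through minimize(..., args=...)
    y1, m, i1, i0 = args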

    i0p=np.dot(i0,p)
    i1p=np.dot(i1,p)
    P=1/(1 + np.divide(i0p,i1p))

    # y1 are observed successes in pairwise comparison experiment
    # m the number of trials, P the probability of success in one trial. 
    # I'm fairly sure these inputs are the same in python and R versions

    return -np.sum(stats.binom.logpmf(y1, m, P)) 


out = scipy.optimize.minimize(LL, s, args=(y1,n,idx1,idx0))

However, when I run it, minimize() does not appear to succeed:

out:
  status: 2
  success: False
  njev: 21
  nfev: 201
  hess_inv: array([[1, 0, 0, 0, 0, 0, 0],
   [0, 1, 0, 0, 0, 0, 0],
   [0, 0, 1, 0, 0, 0, 0],
   [0, 0, 0, 1, 0, 0, 0],
   [0, 0, 0, 0, 1, 0, 0],
   [0, 0, 0, 0, 0, 1, 0],
   [0, 0, 0, 0, 0, 0, 1]])
  fun: -273.75549396685
    x: array([ 0.14285714,  0.14285714,  0.14285714,  0.14285714,  0.14285714,
    0.14285714,  0.14285714])
  message: 'Desired error not necessarily achieved due to precision loss.'
  jac: array([  27.99998093, -552.99998856, -500.49999237,  111.99997711,
    671.99995422,  255.49996948,  -14.00000381])

Other methods (for example 'Powell') report success, but the parameters are nowhere near those from R.
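One thing worth checking before comparing estimates across optimizers: the success probability above depends on p only through the ratio of i0·p to i1·p, so multiplying p by any positive constant leaves the likelihood unchanged. Two optimizers may therefore return vectors that look very different but are rescalings of each other. The sketch below illustrates this with made-up data; the indicator matrices, counts, and the name neg_log_lik are assumptions for illustration, not the data from the question.

```python
import numpy as np
from scipy import stats

def neg_log_lik(p, y1, m, i1, i0):
    """Negative binomial log-likelihood, same formula as LL in the question."""
    prob = 1.0 / (1.0 + (i0 @ p) / (i1 @ p))
    return -np.sum(stats.binom.logpmf(y1, m, prob))

# Hypothetical data: 3 items, 3 pairwise comparisons, 20 trials each.
# Each row of i1 / i0 is assumed to pick out one item of the pair.
i1 = np.array([[1.0, 0, 0], [1, 0, 0], [0, 1, 0]])
i0 = np.array([[0.0, 1, 0], [0, 0, 1], [0, 0, 1]])
m = np.array([20, 20, 20])
y1 = np.array([12, 15, 9])

p = np.array([0.5, 0.3, 0.2])
value_at_p = neg_log_lik(p, y1, m, i1, i0)
value_at_scaled_p = neg_log_lik(10.0 * p, y1, m, i1, i0)
print(value_at_p, value_at_scaled_p)  # the two values agree up to rounding
```

If that is what is happening, normalizing each result (for example p / p.sum()) before comparing would put the R and Python estimates on the same scale.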

My questions are:

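Elsewhere I have seen that "Desired error not necessarily achieved due to precision loss." is the result of a badly behaved likelihood function. Can anyone tell whether that is the case here, and how would I fix it?
Should I try some other optimization method? They need derivatives passed to minimize(): how do I define the gradient (and the Hessian, if necessary) for my LL function? I have seen an example that uses the statsmodels GenericLikelihoodModel, but I got confused by exog/endog...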
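On the gradient question, here is a minimal sketch of one possible approach, not a verified fix for the data above. Writing a = i1·p and b = i0·p, the success probability is a/(a + b), so up to a constant the log-likelihood is the sum of y1·log(a) + (m − y1)·log(b) − m·log(a + b), which differentiates directly. The data, the names neg_log_lik and neg_log_lik_grad, and the choice of L-BFGS-B with positivity bounds are all assumptions made for illustration.

```python
import numpy as np
from scipy import optimize, stats

def neg_log_lik(p, y1, m, i1, i0):
    a, b = i1 @ p, i0 @ p
    # a / (a + b) is algebraically the same as 1 / (1 + b / a)
    return -np.sum(stats.binom.logpmf(y1, m, a / (a + b)))

def neg_log_lik_grad(p, y1, m, i1, i0):
    a, b = i1 @ p, i0 @ p
    # Negated d/dp of sum[y1*log(a) + (m - y1)*log(b) - m*log(a + b)]
    return -(i1.T @ (y1 / a)
             + i0.T @ ((m - y1) / b)
             - (i1 + i0).T @ (m / (a + b)))

# Hypothetical data: 3 items, 3 pairwise comparisons, 20 trials each.
i1 = np.array([[1.0, 0, 0], [1, 0, 0], [0, 1, 0]])
i0 = np.array([[0.0, 1, 0], [0, 0, 1], [0, 0, 1]])
m = np.array([20, 20, 20])
y1 = np.array([12, 15, 9])
args = (y1, m, i1, i0)
start = np.full(3, 1.0 / 3.0)

# Compare the analytic gradient with a finite-difference estimate.
grad_error = optimize.check_grad(neg_log_lik, neg_log_lik_grad, start, *args)

# Keep every parameter strictly positive so the probability stays in (0, 1).
res = optimize.minimize(neg_log_lik, start, args=args, jac=neg_log_lik_grad,
                        method="L-BFGS-B", bounds=[(1e-8, None)] * 3)
p_hat = res.x / res.x.sum()  # normalize, since the overall scale is arbitrary
print(grad_error, res.success, p_hat)
```

The bounds matter because an unconstrained line search can step to negative parameters, where the probability leaves [0, 1] and logpmf returns nan; that may be one source of the precision-loss message. Optimizing over log(p) instead would be another way to enforce positivity.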
