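I am trying to write my own logistic regression and compare different methods of maximizing the log-likelihood. With the Newton-CG method I get the error message "ValueError: setting an array element with a sequence". From what I have read, this error seems to be raised when the function being minimized returns a non-scalar, but that is not the case here. I need the three methods given below to produce (approximately) the same result, but when run on my real data, one does not converge, another gives a LL worse than the initial guess, and the third does not run at all.

Why do I get the ValueError, and how can I fix it?

My code (using dummy data; the real data has about 100 measurements) is as follows: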

import numpy as np
from numpy import linalg
import scipy
from scipy.optimize import minimize
def CalcLL(beta,xinlist,yinlist):
    LL=0.0
    ncol=len(beta)
    pi=FindPi(xinlist,beta.reshape(ncol,1))
    for i in range(len(yinlist)):
        LL=LL+np.where(yinlist[i]==1,np.log(pi[i]),np.log(1-pi[i]))
    return -LL
def Jacobian(beta,xinlist,yinlist):
    ncol=len(beta)
    nrow=np.shape(xinlist)[0]
    pi=FindPi(xinlist,beta.reshape(ncol,1))
    Jac=np.transpose(np.matrix(yinlist-pi))*np.matrix(xinlist)
    return Jac
def Hessian(beta,xinlist,yinlist):
    ncol=len(beta)
    nrow=np.shape(xinlist)[0]
    pi=FindPi(xinlist,beta.reshape(ncol,1))
    W=FindW(pi)
    Hes=np.matrix(np.transpose(xinlist))*(np.matrix(W)*np.matrix(xinlist))
    return Hes
def FindPi(xinlist,beta):
    rows=np.shape(xinlist)[0]# Number of rows in x_new
    cols=np.shape(xinlist)[1]# Number of columns in x_new
    expon=np.dot(xinlist,beta)
    expon=np.array(expon).reshape(rows,1)
    pi=np.exp(expon)/(1+np.exp(expon))
    return pi
def FindW(pi):
    W=np.zeros(len(pi)*len(pi)).reshape(len(pi),len(pi))
    for i in range(len(pi)):
        W[i,i]=float(pi[i]*(1-pi[i]))
    return W

xinlist=np.matrix([[1,1],[0,1],[1,1],[1,1],[1,1],[0,1],[0,1],[1,1],[1,1],[0,1]])
yinlist=np.transpose(np.matrix([0,0,0,0,0,1,1,1,1,1]))

ncol=np.shape(xinlist)[1]

beta1=np.zeros(ncol).reshape(ncol,1) # Initial guess for parameter values
limit=0.000001 # selfwritten Newton-Raphson method
iter_i=limit+1
while iter_i>limit:
    Hes=Hessian(beta1,xinlist,yinlist)
    Jac=np.transpose(Jacobian(beta1,xinlist,yinlist))
    root_diff=np.array(linalg.inv(Hes)*Jac)
    beta1=beta1+root_diff
    iter_i=np.sum(root_diff*root_diff)
print "When running self-written algorithm, the log-likelihood is",-CalcLL(beta1,xinlist,yinlist)

beta2=np.zeros(ncol).reshape(ncol,1)
res=minimize(CalcLL,beta2,args=(xinlist,yinlist),method='Nelder-Mead',options={'xtol':1e-8,'disp':True,'maxiter':10000})
beta2=res.x
print "The log-likelihood using Nelder-Mead is", -CalcLL(beta2,xinlist,yinlist)

beta3=np.zeros(ncol).reshape(ncol,1)
res=minimize(CalcLL,beta3,args=(xinlist,yinlist),method='Newton-CG',jac=Jacobian,hess=Hes,options={'xtol':1e-8,'disp':True})
beta3=res.x
print "The log-likelihood using Newton-CG is", -CalcLL(beta3,xinlist,yinlist)

Edit: the error stack is as follows:

Traceback (most recent call last):

File "MyLogisticRegression2.py", line 62, in res=minimize(CalcLL,beta3,args=(xinlist,yinlist),method='Newton-CG',jac=Jacobian,hess=Hes,options={'xtol':1e-8,'disp':True})

File "C:\Python27\lib\site-packages\scipy\optimize_minimize.py", line 447, in minimize **options)

File "C:\Python27\lib\site-packages\scipy\optimize\optimize.py", line 2393, in _minimize_newtoncg eta=numpy.min([0.5, numpy.sqrt(maggrad)])

File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 2393, in amin out=out, **kwargs)

File "C:\Python27\lib\site-packages\numpy\core_methods.py", line 29, in _amin return umr_minimum(a, axis, None, out, keepdims)

ValueError: setting an array element with a sequence


1 Answer


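I found that the problem came from the beta array having shape (2,1) instead of (2,), and the same was true of the Jacobian. Reshaping both solved the problem.

The Newton-CG solver apparently expects a 1-D array for the Jacobian.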
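For illustration, here is a minimal sketch of what the reshaped version could look like on the dummy data from the question. It is not the original code: the names `find_pi`, `neg_log_lik`, `neg_jac` and `neg_hess` are hypothetical, plain `ndarray`s are used instead of `np.matrix`, the gradient is taken of the negative log-likelihood (the function actually being minimized), and `hess` is passed as a callable rather than as a precomputed matrix. These are assumptions made for the sketch, not part of the fix described above.

```python
import numpy as np
from scipy.optimize import minimize

# Dummy data from the question, as plain ndarrays (assumption: not np.matrix).
X = np.array([[1, 1], [0, 1], [1, 1], [1, 1], [1, 1],
              [0, 1], [0, 1], [1, 1], [1, 1], [0, 1]], dtype=float)
y = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], dtype=float)

def find_pi(beta, X):
    # Logistic probabilities, shape (n_rows,)
    return 1.0 / (1.0 + np.exp(-X.dot(beta)))

def neg_log_lik(beta, X, y):
    # Negative log-likelihood as a plain scalar
    pi = find_pi(beta, X)
    return -np.sum(y * np.log(pi) + (1.0 - y) * np.log(1.0 - pi))

def neg_jac(beta, X, y):
    # Gradient of the negative log-likelihood, shape (n_cols,), i.e. 1-D
    return X.T.dot(find_pi(beta, X) - y)

def neg_hess(beta, X, y):
    # Hessian of the negative log-likelihood, shape (n_cols, n_cols)
    pi = find_pi(beta, X)
    return (X.T * (pi * (1.0 - pi))).dot(X)

beta0 = np.zeros(X.shape[1])  # shape (2,), not (2, 1)
res = minimize(neg_log_lik, beta0, args=(X, y), method='Newton-CG',
               jac=neg_jac, hess=neg_hess, options={'xtol': 1e-8})
print(res.x.shape, res.fun)
```

With 1-D inputs throughout, the gradient norm that Newton-CG computes internally is a scalar, which is what the `numpy.min([0.5, numpy.sqrt(maggrad)])` line in the traceback appears to rely on.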

Answered 2017-07-31T14:23:49.967