
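I need to optimize a non-convex problem (maximum likelihood). When I try standard local optimizers (e.g. BFGS, Nelder-Mead), they fail to find an extremum and often end up at a saddle point instead.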
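To make the saddle-point symptom concrete, here is a minimal sketch of one way to check what kind of stationary point an optimizer returned: estimate the Hessian by finite differences and look at the signs of its eigenvalues. The helpers `numerical_hessian` and `classify_stationary_point` and the function `toy` are hypothetical names introduced only for this illustration; they are not part of the model in this question, and the step size and tolerance are assumptions that would need tuning for a real likelihood.

```python
import numpy as np

def numerical_hessian(f, x, h=1e-4):
    """Central finite-difference estimate of the Hessian of f at x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4.0 * h * h)
    return H

def classify_stationary_point(f, x, tol=1e-6):
    """Classify a stationary point by the signs of the Hessian eigenvalues."""
    eig = np.linalg.eigvalsh(numerical_hessian(f, x))
    if np.all(eig > tol):
        return "minimum"
    if np.all(eig < -tol):
        return "maximum"
    if np.any(eig > tol) and np.any(eig < -tol):
        return "saddle"
    return "degenerate"

# Toy function with a saddle at the origin (illustration only).
def toy(p):
    return p[0] ** 2 - p[1] ** 2

kind = classify_stationary_point(toy, [0.0, 0.0])
print(kind)
```

Applied to the point returned by BFGS or Nelder-Mead, mixed-sign eigenvalues would suggest that the optimizer stopped at a saddle rather than at a minimum.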

You can download the data from here.

import numpy as np
import csv
from scipy.stats import norm
# Read the CSV into a dict of columns, converting every value to float.
with open('data.csv', 'r', newline='') as f:
    reader = csv.reader(f)
    headers = next(reader)
    column = {h: [] for h in headers}
    for row in reader:
        for h, v in zip(headers, row):
            column[h].append(float(v))

ini = [-0.0002, -0.01, .002, -0.09, -0.04, 0.01, -0.02, -.0004]
# NOTE: `x` and `Coef_headers` are not defined anywhere in this snippet.
for i in range(0, len(x[0])):
    ini.append(float(x[0][i]))
x_header = list(Coef_headers)
N = 19  # number of observations
I = 4   # number of alternatives (columns of U)
P = 7   # number of regressors, including the constant
Yobs=np.zeros(N)
Yobs[:] = column['size']
X=np.zeros((N,P))
X[:,0] = column['costTon']
X[:,1] = column['com1']
X[:,2] = column['com3']
X[:,3] = column['com4']
X[:,4] = column['com5']
X[:,5] = column['night']
X[:,6] = 1 #constant
def myfunction(B):   
    # Fixed regression coefficients (length P); indexing B with floats is invalid.
    beta = np.array([0.299, 18.495, 2.181, 2.754, 3.59, 2.866, -12.846])
    theta = 30    
    U=np.zeros((N,I))
    mm=np.zeros(I)
    u = np.zeros((N,I))
    F = np.zeros((N,I))
    G = np.zeros(N)
    l = 0
    s1 = np.expm1(-theta)
    for n in range (0,N):
        m = 0
        U[n,0] = B[0]*column['cost_van'][n]+ B[4]*column['cap_van'][n]        
        U[n,1] = B[1]+ B[5]*column['ex'][n]+ B[8]*column['dist'][n]+ B[0]*column['cost_t'][n]+ B[4]*column['cap_t'][n]        
        U[n,2] = B[2]+ B[6]*column['ex'][n]+ B[9]*column['dist'][n] + B[0]*column['cost_Ht'][n]+ B[4]*column['cap_Ht'][n]      
        U[n,3] = B[3]+ B[7]*column['ex'][n]+ B[10]*column['dist'][n]+ B[0]*column['cost_tr'][n]+ B[4]*column['cap_tr'][n]
        for i in range(0,I):
            mm[i]=np.exp(U[n,i])
        m= sum(mm)
        for i in range(0,I):                        
            u[n,i]=1/(1+ np.exp(U[n,i]- np.log(m-np.exp(U[n,i]))))
            F[n,i] = np.expm1(-u[n,i]*theta)    
    CDF = np.zeros(N)
    Y =  X.dot(beta)
    resid = 0
    for n in range (0,N):
        resid = resid + (np.square(Yobs[n]-Y[n]))
    SSR = resid / N
    dof = N - P - 1
    s2 = resid/dof  # MSE: residual sum of squares divided by the degrees of freedom
    for n in range(0,N):   
        CDF[n] = norm.cdf((Yobs[n]+1),SSR,s2) - norm.cdf((Yobs[n]-1),SSR,s2)
        G[n] = np.expm1(-CDF[n]*theta)
        k = int(column['Choice_Veh'][n]) - 1  # array indices must be integers
        l = l + (np.log10(1+(F[n,k]*G[n]/s1))/(-theta))
    loglikelihood = np.log10(l) 
    return -loglikelihood

from scipy import optimize

# One slice per parameter: 11 parameters, each searched from -10 to 9 in steps of 1.
rranges = (slice(-10, 10, 1),) * 11
resbrute = optimize.brute(myfunction, rranges, full_output=True, finish=optimize.fmin)
print("# global minimum:", resbrute[0])
print("function value at global minimum:", resbrute[1])

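So I decided to run a grid search and tried scipy.optimize.brute, but I get the error below. My real problem actually has 47 variables; I reduced it to 31 to get something working, but it still fails. Please help.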

File "C:\...\site-packages\numpy\core\numeric.py", line 1906, in indices
res = empty((N,)+dimensions, dtype=dtype)

ValueError: array is too big.
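For a sense of scale, here is a rough back-of-the-envelope estimate of the grid that the call above asks for. It is only a sketch: it uses the 11 slices of `slice(-10, 10, 1)` from the code, and it assumes 8 bytes per stored value, which may not match the dtype NumPy actually picks on a given platform.

```python
import numpy as np

n_params = 11                                  # number of slices passed to brute
points_per_axis = len(np.arange(-10, 10, 1))   # values -10, -9, ..., 9
n_grid_points = points_per_axis ** n_params    # full Cartesian grid

# The traceback shows an allocation of shape (N,) + dimensions, i.e. one
# coordinate per parameter for every grid point. 8 bytes/value is an assumption.
bytes_needed = n_grid_points * n_params * 8

print("points per axis:", points_per_axis)
print("grid points:", n_grid_points)
print("approx. memory (TiB):", bytes_needed / 2 ** 40)
```

The count grows as 20 to the power of the number of parameters, so each additional parameter multiplies the grid size by 20.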