
Is there an alternative in Python to the fminunc function (from Octave/MATLAB)? I have a cost function for a binary classifier. Now I want to run gradient descent to get the minimum value of theta. The Octave/MATLAB implementation would look like this.

%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

%  Run fminunc to obtain the optimal theta
%  This function will return theta and the cost 
[theta, cost] = ...
    fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);

I have already converted my costFunction to Python using the numpy library, and I am looking for fminunc or any other gradient descent algorithm implementation in numpy.
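For context, the converted cost function looks roughly like this (a minimal sketch with illustrative names, since the actual conversion is not shown here):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_function(theta, X, y):
    # Logistic regression cost: J = -1/m * sum(y*log(h) + (1-y)*log(1-h))
    m = y.size
    h = sigmoid(X.dot(theta))
    return -(y.dot(np.log(h)) + (1 - y).dot(np.log(1 - h))) / m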


5 Answers


There is more information about the functions of interest here: http://docs.scipy.org/doc/scipy-0.10.0/reference/tutorial/optimize.html

Also, it looks like you are taking the Coursera Machine Learning course, but in Python. You might check out http://aimotion.blogspot.com/2011/11/machine-learning-with-python-logistic.html; this guy is doing the same thing.
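For a quick taste of the equivalence (my own toy sketch, not code from the linked tutorial): scipy.optimize.minimize with an explicit jac plays the role of fminunc with 'GradObj' set to 'on':

import numpy as np
from scipy.optimize import minimize

# Toy objective standing in for costFunction(t, X, y): f(t) = ||t - 1||^2
def cost(t):
    return np.sum((t - 1.0) ** 2)

def grad(t):
    return 2.0 * (t - 1.0)

initial_theta = np.zeros(3)
# jac=grad mirrors optimset('GradObj', 'on'); maxiter mirrors 'MaxIter'
res = minimize(cost, initial_theta, method='BFGS', jac=grad,
               options={'maxiter': 400})
theta, cost_at_min = res.x, res.fun  # like [theta, cost] = fminunc(...)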

Answered 2014-01-30T01:11:15.587

I was also trying to implement logistic regression as discussed in the Coursera ML course, but in Python. I found scipy helpful. After trying different algorithm implementations in the minimize function, I found Newton Conjugate Gradient the most helpful. Also, after examining its return values, it seems to be equivalent to fminunc in Octave. I have included my implementation below in Python to find the optimal theta.

import numpy as np
import scipy.optimize as op

def Sigmoid(z):
    return 1 / (1 + np.exp(-z))

def Gradient(theta, x, y):
    m, n = x.shape
    theta = theta.reshape((n, 1))  # minimize passes theta as a flat array
    y = y.reshape((m, 1))
    sigmoid_x_theta = Sigmoid(x.dot(theta))
    grad = (x.T).dot(sigmoid_x_theta - y) / m
    return grad.flatten()  # minimize expects a 1-D gradient

def CostFunc(theta, x, y):
    m, n = x.shape
    theta = theta.reshape((n, 1))
    y = y.reshape((m, 1))
    term1 = np.log(Sigmoid(x.dot(theta)))
    term2 = np.log(1 - Sigmoid(x.dot(theta)))
    term = y * term1 + (1 - y) * term2
    J = -(np.sum(term) / m)
    return J

# Initialize X and y
X = np.array([[1, 2, 3], [1, 3, 4]])
y = np.array([[1], [0]])

m, n = X.shape
initial_theta = np.zeros(n)
Result = op.minimize(fun=CostFunc,
                     x0=initial_theta,
                     args=(X, y),
                     method='TNC',
                     jac=Gradient)
optimal_theta = Result.x
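If it helps, the OptimizeResult returned by op.minimize bundles what fminunc returns as separate outputs (a usage sketch continuing the snippet above):

# Result is a scipy OptimizeResult object
print(Result.success)  # True if the optimizer converged
print(Result.x)        # optimal theta, like fminunc's first output
print(Result.fun)      # cost at the optimum, like fminunc's second output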
Answered 2014-02-22T10:20:35.490

It looks like you have to switch over to scipy.

There you will find all the basic optimization algorithms, readily implemented.

http://docs.scipy.org/doc/scipy/reference/optimize.html
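As a quick illustration (my own toy example, not taken from the linked page), the interface is uniform across methods:

from scipy.optimize import minimize, rosen, rosen_der

# The same call shape works for Nelder-Mead, BFGS, CG, TNC, ...
res = minimize(rosen, x0=[1.3, 0.7, 0.8], method='BFGS', jac=rosen_der)
print(res.x, res.fun)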

Answered 2013-09-14T11:34:29.547

Implemented as below, and got results similar to Octave:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

filepath = 'C:/Pythontry/MachineLearning/dataset/couresra/ex2data1.txt'
data = pd.read_csv(filepath, sep=',', header=None)
X = data.values[:, :2]   # (100, 2)
y = data.values[:, 2:3]  # (100, 1)

# ==================== Part 1: Plotting ====================
positive_value = data.loc[data[2] == 1]
negative_value = data.loc[data[2] == 0]
# s is the marker size; see
# https://matplotlib.org/api/_as_gen/matplotlib.axes.Axes.scatter.html#matplotlib.axes.Axes.scatter
ax1 = positive_value.plot(kind='scatter', x=0, y=1, s=50, color='b',
                          marker="+", label="Admitted")
ax2 = negative_value.plot(kind='scatter', x=0, y=1, s=50, color='y',
                          ax=ax1, label="Not Admitted")
ax1.set_xlabel("Exam 1 score")
ax2.set_ylabel("Exam 2 score")
plt.show()

# ============ Part 2: Compute Cost and Gradient ===========
[m, n] = np.shape(X)  # (100, 2)
print(m, n)
additional_column = np.ones((m, 1))
X = np.append(additional_column, X, axis=1)  # prepend the intercept column
initial_theta = np.zeros(n + 1)
print(initial_theta)

# Sigmoid and cost function
def sigmoid(z):
    g = 1 / (1 + np.exp(-z))
    return g

def costFunction(theta, X, y):
    receive_theta = np.array(theta)[np.newaxis]  # 1 x (n+1) row array
    theta = np.transpose(receive_theta)          # (n+1) x 1 column vector
    z = np.dot(X, theta)  # z = X * theta
    h = sigmoid(z)        # h(x) = g(z) where g = 1/(1 + e^(-z)), shape (100, 1)
    J = np.sum(np.dot(-y.T, np.log(h)) - np.dot((1 - y).T, np.log(1 - h))) / m
    grad = np.dot(X.T, (h - y)) / m
    return J, grad

[cost, grad] = costFunction(initial_theta, X, y)
print('Cost at initial theta (zeros):', cost)
print('Expected cost (approx): 0.693\n')
print('Gradient at initial theta (zeros): \n', grad)
print('Expected gradients (approx):\n -0.1000\n -12.0092\n -11.2628\n')

# Compute and display cost and gradient with non-zero theta
test_theta = [-24, 0.2, 0.2]
[cost, grad] = costFunction(test_theta, X, y)

print('\nCost at test theta: \n', cost)
print('Expected cost (approx): 0.218\n')
print('Gradient at test theta: \n', grad)
print('Expected gradients (approx):\n 0.043\n 2.566\n 2.647\n')

# ============= Part 3: Optimizing using fmin_tnc =============
import scipy.optimize as opt
print('Executing minimize function...\n')
# Working models:
# result = opt.minimize(costFunction, initial_theta, args=(X, y),
#                       method='TNC', jac=True, options={'maxiter': 400})
result = opt.fmin_tnc(func=costFunction, x0=initial_theta, args=(X, y))
# Not working model:
# model = opt.minimize(fun=costFunction, x0=initial_theta, args=(X, y),
#                      method='TNC', jac=costFunction)
print('Thetas found by fmin_tnc function: ', result)
# Note: 'cost' below still holds the test-theta value from above, hence 0.218
print('Cost at theta found : \n', cost)
print('Expected cost (approx): 0.203\n')
print('theta: \n', result[0])
print('Expected theta (approx):\n')
print(' -25.161\n 0.206\n 0.201\n')

Output: Executing minimize function...

Thetas found by fmin_tnc function: (array([-25.16131854, 0.20623159, 0.20147149]), 36, 0)
Cost at theta found: 0.218330193827 Expected cost (approx): 0.203

theta: [-25.16131854 0.20623159 0.20147149] Expected theta (approx):

-25.161 0.206 0.201
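For reference, the tuple printed above is fmin_tnc's raw return value; unpacking it looks like this (a small sketch against the result shown):

# fmin_tnc returns (x, nfeval, rc): the solution, the number of
# function evaluations, and a return code describing why it stopped
theta_opt, nfeval, rc = result
print(theta_opt)  # array([-25.16131854, 0.20623159, 0.20147149])
print(nfeval)     # 36
print(rc)         # 0; meanings are documented under scipy.optimize.fmin_tnc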

Answered 2018-10-15T14:06:13.277

Thanks! This code helped me understand how scipy optimization works. I believe that in the "Not working model" you should separate the cost and gradient functions, as in the example "SciPy minimize with gradient" and per the description of the jac field in the documentation: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.minimize.html#scipy.optimize.minimize
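Concretely, either pattern below should work (a sketch assuming the costFunction, initial_theta, X, and y from the answer above):

import scipy.optimize as opt

# Pattern 1: one function returning (cost, grad), signalled with jac=True;
# the gradient must be 1-D, hence the flatten of the (n, 1) array
def cost_and_grad(theta, X, y):
    J, grad = costFunction(theta, X, y)
    return J, grad.flatten()

res = opt.minimize(fun=cost_and_grad, x0=initial_theta, args=(X, y),
                   method='TNC', jac=True)

# Pattern 2: separate callables for cost and gradient, as the jac docs describe
res = opt.minimize(fun=lambda t, X, y: costFunction(t, X, y)[0],
                   x0=initial_theta, args=(X, y), method='TNC',
                   jac=lambda t, X, y: costFunction(t, X, y)[1].flatten())
print(res.x, res.fun)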

Answered 2022-02-15T21:02:49.023