I am running into some difficulty implementing a linear SVM (support vector machine) with gradient descent.
The formulas I am using are shown below.
The first equation is the cost function, and the second is the update applied to each feature's theta.
c is the regularization parameter, which controls how tightly the model fits the data.
alpha is the learning rate, which controls how fast gradient descent converges.
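Written out in plain form (this is what the code below is meant to implement), the two equations are:

$$J(\theta) = C\Big[-y^{T}\log(h) - (1-y)^{T}\log(1-h)\Big] + \frac{1}{2}\,\theta^{T}\theta, \qquad h = g(X\theta),\quad g(z) = \frac{1}{1+e^{-z}}$$

$$\theta_j := \theta_j - \alpha\,\frac{\partial J(\theta)}{\partial \theta_j}$$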
Somehow, when I run these formulas on my dataset, my J(theta) keeps increasing; it never decreases. I have tried every combination of values for c and alpha that I could think of.
If there is a mistake anywhere in the formulas, I would be glad if someone could point it out.
Here is the Octave code I am using:
clear all
clc
x = [3,1; 2,2; 1,2; 1.5,3; 4,1; 4,2; 4,3; 4,5]; % training examples, one row per example
y = [1; 1; 1; 1; 0; 0; 0; 0];                   % class labels (1 = pass, 0 = fail)
[m, n] = size(x);
x = [ones(m,1), x];                             % prepend the intercept column x0 = 1
X = x;
hold off
% First plot the input data, just to see how the two classes are distributed.
pos = find(y == 1); % row indices of all examples with class 1
neg = find(y == 0); % row indices of all examples with class 0
% Plot column x1 vs x2 for y = 1 and y = 0
hold on
plot(X(pos, 2), X(pos,3), '+');
plot(X(neg, 2), X(neg, 3), 'o');
axis([min(x(:,2))-2,max(x(:,2))+2, min(x(:,3))-2, max(x(:,3))+2])
xlabel('x1 marks in subject 1')
ylabel('x2 marks in subject 2')
legend('Pass', 'Failed')
hold off
% feature scaling
% Scale only x1 and x2; skip the first column x0, which must stay as 1.
% Without feature scaling the decision line can come out wrong (flipped).
%mn = mean(x);
%sd = std(x);
%x(:,2) = (x(:,2) - mn(2))./ sd(2);
%x(:,3) = (x(:,3) - mn(3))./ sd(3);
% Algorithm for linear SVM
g = @(z) 1.0 ./ (1.0 + exp(-z)); % sigmoid function
theta = zeros(size(x(1,:)))'; % one theta per column of x, initialized to zero
max_iter = 100;
j_theta = zeros(max_iter, 1); % stores the cost J(theta) at each iteration
c = 0.1;                      % regularization parameter
alpha = 0.1;                  % learning rate
for num_iter = 1:max_iter
    z = x * theta;
    h = g(z);
    h    % print h each iteration (debugging)
    j_theta(num_iter) = c .* (-y' * log(h) - (1 - y)' * log(1 - h)) + ((0.5) * (theta' * theta)); % second term is regularization
    %% the line above computes the cost function
    grad = (c^2) * x' * (h - y); %% compute the gradient
    reg_term = alpha .* (0.5) * (theta' * theta); %% compute the regularization term
    theta = theta - (alpha .* grad) - reg_term; %% update the theta vector for each feature
    theta    % print theta each iteration (debugging)
end
figure
plot(1:max_iter, j_theta, 'b', 'LineWidth', 2)
xlabel('iteration')
ylabel('J(theta)')
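For reference, when I differentiate the J(theta) above by hand I get grad = c * x' * (h - y) + theta, which does not look like the grad line inside my loop, although I may have made a mistake in the derivation. Here is a small central-difference check that compares the two at the starting point theta = 0 (J_of, hand_grad and loop_grad are just helper names for this check):

% Numerical gradient check: compare the hand-derived gradient and the
% loop's gradient against central differences.
J_of = @(t) c .* (-y' * log(g(x*t)) - (1 - y)' * log(1 - g(x*t))) + 0.5 * (t' * t);
theta0 = zeros(n + 1, 1);   % evaluate the check at theta = 0
eps_ = 1e-4;
num_grad = zeros(n + 1, 1);
for k = 1:n+1
    e = zeros(n + 1, 1);
    e(k) = eps_;
    num_grad(k) = (J_of(theta0 + e) - J_of(theta0 - e)) / (2 * eps_);
end
hand_grad = c * x' * (g(x*theta0) - y) + theta0; % my hand derivation
loop_grad = (c^2) * x' * (g(x*theta0) - y);      % the grad my loop computes
disp([num_grad, hand_grad, loop_grad])           % columns agree only for the correct gradient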
Thanks