I am trying to implement logistic regression classification in MATLAB, and I am stuck on computing the correct weights with gradient descent.

I am using the stochastic approach, so for each sample I update each weight in the weight vector individually, one feature at a time, then move on to the next sample and do it again.

I am using the update equation:

theta_j := theta_j - alpha * (y_i - h_theta(x_i)) * x_ij
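In NumPy terms (a sketch for illustration only, not my MATLAB code; the names `sigmoid` and `sgd_update` are just placeholders), one stochastic update of the whole weight vector with this equation as written looks like:

```python
import numpy as np

def sigmoid(z):
    # logistic function: 1 / (1 + exp(-z))
    return 1.0 / (1.0 + np.exp(-z))

def sgd_update(theta, x_i, y_i, alpha):
    # one stochastic update for sample (x_i, y_i), applying
    #   theta_j := theta_j - alpha * (y_i - h_theta(x_i)) * x_ij
    # to every component j at once (vectorized over j)
    h = sigmoid(theta @ x_i)
    return theta - alpha * (y_i - h) * x_i
```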

I break when the difference between the previous weight vector and the current weight vector is less than 0.00005. I compute the "difference" between the two vectors by subtracting one from the other and taking the square root of the dot product of the difference vector with itself.
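That stopping measure is just the Euclidean (L2) norm of the difference vector; a minimal NumPy sketch (the function name `weight_change` is only illustrative):

```python
import numpy as np

def weight_change(new_w, old_w):
    # Euclidean (L2) norm of the difference vector:
    # sqrt(dot(d, d)) where d = new_w - old_w
    d = new_w - old_w
    return np.sqrt(np.dot(d, d))
```

This is equivalent to `np.linalg.norm(new_w - old_w)` (or `norm(newW - oldW)` in MATLAB).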

The problem is that it seems to stop updating after only four updates, so only the first four entries of my eight-element weight vector ever get updated. This happens no matter what I set my learning rate alpha to.

Here is my implementation:

function weightVector = logisticWeightsByGradientDescentStochastic(trueClass,features)
    %% This function attempts to converge on the best set of weights for a first-order logistic regression
    %% Input:
    % trueClass - the training data's vector of true class values
    % features - the training data's feature matrix (one row per sample)
    %% Output:
    % weightVector - vector of size n+1 (n is number of features)
    % corresponding to convergent weights

    %% Create one vector and append to features
    oneVector = ones( size(features,1) , 1); %create one vector to append to features
    regressData = horzcat(oneVector, features); % create dataset that we will use to calculate regression weights

    %% Get Data Size
    dataSize = size(regressData);

    %% Initial pick for weightVector
    weightVector = rand( dataSize(2), 1); %random initial weights, one per column of regressData
    weightVector = 100.*weightVector

    %% Choose learning Rate
    learningRate = 1000;

    %% Stochastic Gradient Descent

    oldWeightVector = weightVector; %set oldWeightVector
    newWeightVector = oldWeightVector; % pre-allocate size for newWeightVector
    difference = Inf; %initial difference to get into loop
    iterCount = 0; %for testing to see how long it takes

    while(difference > 0.000005)

        for m=1:dataSize(1) %for all samples

            for n=1:dataSize(2) %for all features

                %% calculate sigmoid prediction for this sample
                predictedClass = evaluateSigmoid(oldWeightVector, regressData(m,:))


                %% Calculate the error
                error = learningRate .* (trueClass(m) - predictedClass) .* regressData(m,n);

                %% Update weightVector for feature n
                newWeightVector(n) = oldWeightVector(n) - error;

                %% Calculate difference
                vectorDifference = newWeightVector - oldWeightVector; %find difference vector between new and old weight vectors
                difference = sqrt( dot( vectorDifference, vectorDifference)) %calculate the magnitude of difference between new and old weight vectors

                iterCount = iterCount + 1;

                %%Break if difference is below threshold
                if(difference < 0.00005)
                    break
                else
                    oldWeightVector = newWeightVector; % update Old Weight Vector for next prediction
                end
            end %for n

            %%Break if difference is below threshold
            if(difference < 0.000005)
                break
            end   

        end %for m

    end %while difference > 0.000005

    weightVector = newWeightVector
    iterCount
end

I have also tried a batch (global) approach instead of the stochastic one, but it still produces extremely large weight values.
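For contrast, a standard full-batch step sums the per-sample terms over all rows before updating the weights once. A NumPy sketch of the textbook gradient-ascent step on the log-likelihood (the name `batch_step` is only illustrative; `X` is assumed to already have the leading column of ones):

```python
import numpy as np

def batch_step(theta, X, y, alpha):
    # one full-batch gradient-ascent step on the log-likelihood:
    #   theta_j := theta_j + alpha * sum_i (y_i - h_theta(x_i)) * x_ij
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))  # predictions for all samples
    return theta + alpha * (X.T @ (y - h))
```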

Here is my evaluateSigmoid function:

function logisticPrediction = evaluateSigmoid(weightVector, sample)
    %% This function evaluates the sigmoid with a given weight vector and sample
    %% Input:
    % weightVector - column vector of n weights
    % sample - row vector sample with n-1 features (a 1 will be appended to the
    % beginning for the constant weight)

    sample = transpose(sample); % sample is fed in as a row vector, so must be transposed

    exponent = exp( transpose(weightVector) * sample);

    logisticPrediction = 1 ./ ( 1 + exponent);

end
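For reference, the conventional logistic sigmoid puts a negative sign in the exponent, 1 / (1 + exp(-z)); a minimal NumPy sketch (the function name `logistic` is just for illustration):

```python
import numpy as np

def logistic(z):
    # conventional logistic sigmoid: 1 / (1 + exp(-z)),
    # mapping any real z into (0, 1), with logistic(0) == 0.5
    return 1.0 / (1.0 + np.exp(-z))
```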

Here is the dataset I am using. The last column is filtered out, and the first column is converted to 1 or 0 depending on a threshold (0 if below 22, 1 if above).
