I'm trying to implement logistic regression classification in MATLAB and am stuck computing the correct weights via gradient descent.
I'm using the stochastic approach, so I update each weight in the weight vector individually for each feature, then move on to the next sample and do it again.
I'm using the update equation
theta_j := theta_j - alpha * (y_i - h_theta(x_i)) * x_ij
I break out when the difference between the previous weight vector and the current one is less than 0.00005. I compute the "difference" between the two vectors by subtracting one from the other and taking the square root of the dot product of the difference vector with itself.
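For comparison (this is an independent NumPy illustration, not the MATLAB code below; the data is made up), a per-sample logistic-regression update with this Euclidean-norm stopping rule can be sketched like this. Note that the textbook update *adds* alpha * (y_i - h) * x_i and updates the whole weight vector per sample:

```python
import numpy as np

def sgd_logistic(X, y, alpha=0.1, tol=5e-5, max_epochs=1000):
    """Per-sample logistic-regression gradient descent.

    Uses the textbook update theta += alpha * (y_i - sigmoid(theta . x_i)) * x_i
    and stops when the Euclidean norm of the change in theta over an epoch
    falls below tol.
    """
    X = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column of ones
    theta = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        old = theta.copy()
        for xi, yi in zip(X, y):
            h = 1.0 / (1.0 + np.exp(-theta @ xi))  # sigmoid of theta . x_i
            theta = theta + alpha * (yi - h) * xi  # update every component at once
        # sqrt of the dot product of the difference with itself = 2-norm
        if np.sqrt(np.dot(theta - old, theta - old)) < tol:
            break
    return theta

# tiny synthetic example: class is 1 when the single feature exceeds 0.5
X = np.array([[0.0], [0.2], [0.4], [0.6], [0.8], [1.0]])
y = np.array([0, 0, 0, 1, 1, 1])
theta = sgd_logistic(X, y)
```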
The problem is that it seems to stop updating after only four updates, so only the first four entries of my 8-row weight vector ever get updated. This happens no matter what I set the learning rate alpha to.
Here is my implementation:
function weightVector = logisticWeightsByGradientDescentStochastic(trueClass, features)
%% This function attempts to converge on the best set of weights for a logistic regression of order 1
%% Input:
% trueClass - the training data's vector of true class values
% features - the training data's matrix of feature values
%% Output:
% weightVector - vector of size n+1 (n is number of features)
% corresponding to convergent weights

%% Create ones vector and append to features
oneVector = ones(size(features, 1), 1); % ones column to append to features
regressData = horzcat(oneVector, features); % dataset used to calculate regression weights

%% Get data size
dataSize = size(regressData);

%% Initial pick for weightVector
weightVector = rand(dataSize(2), 1); % random vector with one weight per column of regressData
weightVector = 100 .* weightVector

%% Choose learning rate
learningRate = 1000;

%% Stochastic gradient descent
oldWeightVector = weightVector; % set oldWeightVector
newWeightVector = oldWeightVector; % pre-allocate size for newWeightVector
difference = Inf; % initial difference to get into loop
iterCount = 0; % for testing to see how long it takes

while (difference > 0.000005)
    for m = 1:dataSize(1) % for all samples
        for n = 1:dataSize(2) % for all features
            %% Calculate sigmoid prediction
            predictedClass = evaluateSigmoid(oldWeightVector, regressData(m, :))
            %% Calculate the error
            error = learningRate .* (trueClass(m) - predictedClass) .* regressData(m, n);
            %% Update weightVector for feature n
            newWeightVector(n) = oldWeightVector(n) - error;
            %% Calculate difference
            vectorDifference = newWeightVector - oldWeightVector; % difference vector between new and old weight vectors
            difference = sqrt(dot(vectorDifference, vectorDifference)) % magnitude of the difference between new and old weight vectors
            iterCount = iterCount + 1;
            %% Break if difference is below threshold
            if (difference < 0.00005)
                break
            else
                oldWeightVector = newWeightVector; % update old weight vector for next prediction
            end
        end % for n
        %% Break if difference is below threshold
        if (difference < 0.000005)
            break
        end
    end % for m
end % while difference > 0.000005

weightVector = newWeightVector
iterCount
end
I also tried a batch ("global") approach instead of the stochastic one, but it still produces very large weight values.
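For reference, the batch variant, where each iteration uses the gradient accumulated over all samples at once, can be sketched the same way in NumPy (again a made-up illustration, not the original MATLAB):

```python
import numpy as np

def batch_logistic(X, y, alpha=0.5, n_iters=2000):
    """Batch logistic-regression gradient descent.

    Each iteration applies the gradient over the whole dataset:
    theta += (alpha / m) * X.T @ (y - sigmoid(X @ theta)).
    """
    X = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend a bias column of ones
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        h = 1.0 / (1.0 + np.exp(-X @ theta))       # predictions for all samples
        theta = theta + (alpha / m) * (X.T @ (y - h))  # one update per full pass
    return theta

# same tiny synthetic dataset as above: class 1 when the feature exceeds 0.5
X = np.array([[0.0], [0.2], [0.4], [0.6], [0.8], [1.0]])
y = np.array([0, 0, 0, 1, 1, 1])
theta = batch_logistic(X, y)
```

Dividing by m keeps the step size comparable across dataset sizes.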
Here is my evaluateSigmoid function:
function logisticPrediction = evaluateSigmoid(weightVector, sample)
%% This function evaluates the sigmoid with a given weight vector and sample
%% Input:
% weightVector - column vector of n weights
% sample - row vector sample with n-1 features (a 1 will be appended to the
% beginning for the constant weight)
sample = transpose(sample); % sample is fed in as a row vector, so must be transposed
exponent = exp(transpose(weightVector) * sample);
logisticPrediction = 1 ./ (1 + exponent);
end
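For reference, and independent of the code above, the standard logistic sigmoid is defined with a negative sign in the exponent, sigmoid(z) = 1 / (1 + e^(-z)), so it maps large positive inputs toward 1. A minimal NumPy version:

```python
import numpy as np

def sigmoid(z):
    """Standard logistic sigmoid: 1 / (1 + e^(-z)), maps any real z into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))  # 0.5: the sigmoid is exactly one half at z = 0
```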
Here is the dataset I'm using. The last column is filtered out, and the first column is converted to 1 or 0 depending on whether it meets a threshold (0 if below 22, 1 if above).