I created a simple logistic regression from the gradient-descent linear regression code here: Gradient Descent Linear Regression in Java.
Now I am turning it into logistic regression simply by changing the hypothesis with the logistic (sigmoid) transform: 1/(1+e^(-z)), where z is the original Theta^T * X. I am also not scaling the gradient down by the size of the population (the number of training examples).
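Written out, the update that the logisticDescent code below applies on every iteration (note there is no 1/m factor, which is what I mean by not scaling by the population size) is:

Theta := Theta - alpha * X^T * (g(X * Theta) - Y), where g(z) = 1/(1 + e^(-z))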
When I try to test my results I get confusing behavior: I set the independent variables (X) to random Gaussians, and the dependent variable (Y) to the logistic of their weighted sum using the expected weights, i.e. y = logit(w0*1 + w1*x1 + w2*x2).
In that case the descent converges to the correct answer and I can recover the expected weights. But obviously Y should really be 0 or 1, and as soon as I round it up or down the descent no longer converges.
Here is where I generate the training data:
@Test
public void testLogisticDescentMultiple() {
    //...
    //initialize Independent Xi
    //Going to create test data y = 10 + .5(x1) + .33(x2)
    for (int x = 0; x < NUM_EXAMPLES; x++) {
        independent.set(x, 0, 1);                     //x0: we always set this to 1 for the intercept
        independent.set(x, 1, random.nextGaussian()); //x1
        independent.set(x, 2, random.nextGaussian()); //x2
    }
    //initialize dependent Yi
    for (int x = 0; x < NUM_EXAMPLES; x++) {
        double val = w0 + (w1 * independent.get(x, 1)) + (w2 * independent.get(x, 2));
        double logitVal = logit(val);
        //Converges without this code block
        if (logitVal < 0.5) {
            logitVal = 0;
        } else {
            logitVal = 1;
        }
        //
        dependent.set(x, logitVal);
    }
    //...
}
public static double logit(double val) {
    return( 1.0 / (1.0 + Math.exp(-val)) );
}
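For a sense of scale with these test weights: logit(0) = 0.5, logit(2) ≈ 0.88, and logit(10) ≈ 0.99995, so after rounding almost every Y ends up as 1.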
And here is the updated descent step for logistic regression:
public DoubleMatrix1D logisticDescent(double alpha,
                                      DoubleMatrix1D thetas,
                                      DoubleMatrix2D independent,
                                      DoubleMatrix1D dependent) {
    Algebra algebra = new Algebra();

    //hypothesis is 1/(1 + e^-(theta(transposed) * X))
    //start with theta(transposed) * X
    DoubleMatrix1D hypothesies = algebra.mult(independent, thetas);

    //h = 1/(1 + e^-h)
    hypothesies.assign(new DoubleFunction() {
        @Override
        public double apply(double val) {
            return( logit(val) );
        }
    });

    //hypothesis - Y
    //Now we have, for each Xi, the difference between the hypothesis prediction and the actual Yi
    hypothesies.assign(dependent, Functions.minus);

    //Transpose examples (MxN) to NxM so we can matrix-multiply by the hypothesis (Mx1)
    DoubleMatrix2D transposed = algebra.transpose(independent);
    DoubleMatrix1D deltas = algebra.mult(transposed, hypothesies);

    // thetas = thetas - (deltas * alpha) in one step
    thetas.assign(deltas, Functions.minusMult(alpha));
    return( thetas );
}
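For completeness, here is a minimal sketch of the kind of driver loop that calls this, along with the Colt imports the snippets rely on. I'm assuming the original Colt packages (cern.colt); the iteration count and alpha below are placeholders, not the values from my real test.

import cern.colt.function.DoubleFunction;
import cern.colt.matrix.DoubleFactory1D;
import cern.colt.matrix.DoubleMatrix1D;
import cern.colt.matrix.DoubleMatrix2D;
import cern.colt.matrix.linalg.Algebra;
import cern.jet.math.Functions;

//...
//independent (NUM_EXAMPLES x 3) and dependent (NUM_EXAMPLES) are filled as in the test above
DoubleMatrix1D thetas = DoubleFactory1D.dense.make(3);  //theta0..theta2, all starting at zero
for (int i = 0; i < 10000; i++) {                       //placeholder iteration count
    thetas = logisticDescent(0.01, thetas, independent, dependent);  //placeholder alpha
}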