matlab - The constant term in principal component regression (PCR) analysis in Matlab

Question

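I am trying to learn principal component regression (PCR) with Matlab. I am following this guide: http://www.mathworks.fr/help/stats/examples/partial-least-squares-regression-and-principal-components-regression.html

It is really good, but there is one step I just cannot understand:

We do the PCA and the regression, which is nice and clear: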

[PCALoadings,PCAScores,PCAVar] = princomp(X);
betaPCR = regress(y-mean(y), PCAScores(:,1:2));

Then we adjust the first coefficient:

betaPCR = PCALoadings(:,1:2)*betaPCR;
betaPCR = [mean(y) - mean(X)*betaPCR; betaPCR];
yfitPCR = [ones(n,1) X]*betaPCR;

Why does the constant term have to be 'mean(y) - mean(X)*betaPCR'? Could you explain this to me?

Thanks in advance!

score 5 · Accepted Answer

This is really a math question, not a coding question. Your PCA extracts a set of features and puts them in a matrix, which gives you PCALoadings and PCAScores. Pull out the first two principal components and their loadings, and put them in their own matrix:

W = PCALoadings(:, 1:2)
Z = PCAScores(:, 1:2)

The relationship between X and Z is that Z is the mean-centered X projected onto W, so X can be approximated from Z:

Z = (X - mean(X)) * W      <=>      X ~ mean(X) + Z * W'                  (1)

The intuition is that Z captures most of the "important information" in X, and the matrix W tells you how to transform between the two representations.
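Equation (1) can be checked numerically. The following is a minimal sketch, assuming synthetic random data and computing the PCA with svd from base MATLAB instead of princomp (so the loadings may differ from princomp's by a sign flip per column). The variable names Xc, Xapprox, Xfull, relErr2 and relErrFull are made up for this illustration.

```matlab
% Synthetic data (an assumption for illustration; not from the question)
n = 50; p = 4;
X = randn(n, p) * randn(p, p);       % n observations of p correlated predictors

% PCA via SVD of the centered data, using base MATLAB only
Xc = bsxfun(@minus, X, mean(X));     % subtract each column's mean
[~, ~, V] = svd(Xc, 'econ');         % columns of V play the role of PCALoadings

W = V(:, 1:2);                       % loadings of the first two components
Z = Xc * W;                          % scores: left half of equation (1)

% Right half of equation (1): rank-2 approximation of X
Xapprox = bsxfun(@plus, mean(X), Z * W');

% Keeping all p components makes the reconstruction exact up to rounding
Xfull = bsxfun(@plus, mean(X), (Xc * V) * V');

relErr2    = norm(X - Xapprox, 'fro') / norm(X, 'fro');
relErrFull = norm(X - Xfull,   'fro') / norm(X, 'fro');
```

Here relErrFull should be at rounding level, while relErr2 measures how much of X the first two components leave out. The columns of Z also have mean zero, which is what the next step relies on.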

Now you can do a regression of y on Z. First you have to subtract the mean from y, so that both the left and right hand sides have mean zero:

y - mean(y) = Z * beta + errors                                           (2)
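To see why centering y lets you drop the intercept in equation (2), here is a small sketch under the same assumptions as before (synthetic data, svd in place of princomp, backslash in place of regress). Because the columns of Z have mean zero, a fit with an explicit intercept should return mean(y) as the intercept and the same slopes. The names full and beta are only for this illustration.

```matlab
% Synthetic data (an assumption for illustration; not from the question)
n = 50; p = 4;
X = randn(n, p) * randn(p, p);
y = X * [1; -2; 0.5; 0] + 0.1 * randn(n, 1);

% Scores of the first two principal components
Xc = bsxfun(@minus, X, mean(X));
[~, ~, V] = svd(Xc, 'econ');
Z = Xc * V(:, 1:2);

% Equation (2): centered y regressed on Z, no intercept column
beta = Z \ (y - mean(y));

% Same regression with an explicit intercept column and uncentered y
full = [ones(n, 1) Z] \ y;

interceptGap = abs(full(1) - mean(y));   % expected to be at rounding level
slopeGap     = norm(full(2:3) - beta);   % expected to be at rounding level
```

Both gaps come out at rounding level, so subtracting mean(y) is just a way of handling the intercept separately.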

Now you want to use that regression to make predictions for y from X. Substituting from equation (1) into equation (2), and dropping the error term because we only want fitted values, gives you

y - mean(y) = (X - mean(X)) * W * beta

            = (X - mean(X)) * beta1

where we have defined beta1 = W * beta (you do this in your third line of code). Rearranging:

y = mean(y) - mean(X) * beta1 + X * beta1

  = [ones(n,1) X] * [mean(y) - mean(X) * beta1; beta1]

  = [ones(n,1) X] * betaPCR

which works out if we define

betaPCR = [mean(y) - mean(X) * beta1; beta1]

as in your fourth line of code.
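The whole derivation can be verified end to end. This is a sketch on synthetic data, again assuming svd and backslash as stand-ins for princomp and regress. It checks that predicting from X with the adjusted betaPCR gives the same fitted values as predicting from the scores with mean(y) added back. The names yfitX, yfitZ and maxDiff are hypothetical.

```matlab
% Synthetic data (an assumption for illustration; not from the question)
n = 50; p = 4;
X = randn(n, p) * randn(p, p);
y = X * [1; -2; 0.5; 0] + 0.1 * randn(n, 1);

% PCA on centered X, keeping two components
Xc = bsxfun(@minus, X, mean(X));
[~, ~, V] = svd(Xc, 'econ');
W = V(:, 1:2);
Z = Xc * W;

% Regression in the score space, then map back to the original variables
beta    = Z \ (y - mean(y));                    % equation (2)
beta1   = W * beta;                             % coefficients on X
betaPCR = [mean(y) - mean(X) * beta1; beta1];   % prepend the constant term

% Two routes to the fitted values
yfitX = [ones(n, 1) X] * betaPCR;               % from the raw X
yfitZ = mean(y) + Z * beta;                     % from the scores

maxDiff = max(abs(yfitX - yfitZ));              % expected to be at rounding level
```

The constant term mean(y) - mean(X) * beta1 is exactly what undoes the two centering steps, so you can apply betaPCR to raw, uncentered X.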
