python - Least-Squares Regression of Matrices with Numpy

Question

I'm looking to calculate a least-squares linear regression from an N-by-M matrix and a set of known, ground-truth solutions in an N-by-1 matrix. From there, I'd like to get the slope, intercept, and residual value of each regression. The basic idea: I know the actual value that should be predicted for each sample (each row of N), and I'd like to use the residuals to determine which column of M holds the most accurate set of predicted values.

I don't describe matrices well, so here's a drawing:

(N,M) matrix with predicted values for each row N
 in each column of M...

##NOTE: Values of M and N are not actually 4 and 3, just examples
   4 columns in "M"
  [1, 1.1, 0.8, 1.3]
  [2, 1.9, 2.2, 1.7]  3 rows in "N"
  [3, 3.1, 2.8, 3.3]


(1,N) matrix with actual values of N


  [1]
  [2]   Actual value of each sample N, in a single column
  [3]

So again, for clarity's sake, I'm looking to calculate the lstsq regression between each column of the (N,M) matrix and the (1,N) matrix.

For instance, the regression between

[1]   and [1]
[2]       [2]
[3]       [3]

then the regression between

[1]   and  [1.1]
[2]        [1.9]
[3]        [3.1]

and so on, outputting the slope, intercept, and standard error (average residual) for each regression calculated.

So far in the numpy/scipy documentation and around the 'net, I've only found examples computing one column at a time. I had thought numpy had the capability to compute regressions on each column in a set with the standard

np.linalg.lstsq(arrayA,arrayB)

But that returns the error

ValueError: array dimensions must agree except for d_0

Do I need to split the columns into their own arrays, then compute one at a time? Is there a parameter or matrix operation I need to use to have numpy calculate the regressions on each column independently?

I feel like it should be simpler? I've looked it all over, and I can't seem to find anyone doing something similar.

score 2 · Accepted Answer

Maybe you swapped A and B?

The following works for me:

import numpy as np

# Broadcasting a (4,) array against a (3,1) array
A = np.random.rand(4) + np.arange(3)[:, None]
# A is now a (3,4) array
b = np.arange(3)
np.linalg.lstsq(A, b)

score 0 · Accepted Answer

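The 0th dimension of arrayB must be the same as the 0th dimension of arrayA (reference: the official documentation of np.linalg.lstsq). You need arrays with shapes (N, M) and (N, 1), or (N, M) and (N,), instead of the (N, M) and (1, N) arrays you are using now.

Note that an (N, 1) array and a 1-D array of length N will give identical results, but the shapes of the returned arrays will differ.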
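As a small illustration of that point, the sketch below solves the same system once with a 1-D `b` and once with a column-vector `b`. It reuses the example values from the question; the variable names are only illustrative, and `rcond=None` assumes NumPy 1.14 or newer.

```python
import numpy as np

# (N, M) = (3, 4) matrix, using the example values from the question
A = np.array([[1.0, 1.1, 0.8, 1.3],
              [2.0, 1.9, 2.2, 1.7],
              [3.0, 3.1, 2.8, 3.3]])

b_flat = np.array([1.0, 2.0, 3.0])  # shape (N,)
b_col = b_flat.reshape(-1, 1)       # shape (N, 1)

# lstsq returns (solution, residuals, rank, singular values); keep the solution
x_flat = np.linalg.lstsq(A, b_flat, rcond=None)[0]
x_col = np.linalg.lstsq(A, b_col, rcond=None)[0]

print(x_flat.shape)  # (4,)
print(x_col.shape)   # (4, 1)

# Same numbers, different shapes
same_values = np.allclose(x_flat, x_col[:, 0])
print(same_values)
```

A (1, N) array, by contrast, has a 0th dimension of 1 rather than N, which is why it is rejected.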

I get a slightly different exception than you do, but that is probably due to different versions (I am using Python 2.7 and Numpy 1.6 on Windows):

>>> import numpy as np
>>> A = np.arange(12).reshape(3, 4)
>>> b = np.arange(3).reshape(1, 3)

>>> np.linalg.lstsq(A,b)
# This gives "LinAlgError: Incompatible dimensions" exception

>>> np.linalg.lstsq(A,b.T)
# This works, note that I am using the transpose of b here
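Note that calling np.linalg.lstsq(A, b) with the (N, M) matrix as A fits a single model that uses all M columns jointly as predictors; it does not run M separate regressions. For one independent fit per column, as the question describes, one possible sketch is to pass the whole (N, M) matrix as the right-hand side, since lstsq accepts a 2-D second argument and solves for each of its columns separately. This assumes each predicted column is regressed on the actual values (the question leaves the direction open); the names `predicted`, `actual`, and `X` are illustrative, and `rcond=None` assumes NumPy 1.14 or newer.

```python
import numpy as np

# (N, M) predicted values, one candidate set of predictions per column
predicted = np.array([[1.0, 1.1, 0.8, 1.3],
                      [2.0, 1.9, 2.2, 1.7],
                      [3.0, 3.1, 2.8, 3.3]])
# (N,) actual values, one per sample
actual = np.array([1.0, 2.0, 3.0])

# Design matrix for y = slope * x + intercept: columns are [x, 1]
X = np.column_stack([actual, np.ones_like(actual)])  # shape (N, 2)

# With a 2-D right-hand side, each column gets its own least-squares fit
coeffs, sq_resid, rank, _ = np.linalg.lstsq(X, predicted, rcond=None)
slopes, intercepts = coeffs  # each has shape (M,)

# sq_resid holds the sum of squared residuals per column; it is only
# populated when N exceeds the number of fitted parameters and X has full rank
best = int(np.argmin(sq_resid))

print(slopes)
print(intercepts)
print(sq_resid)
print(best)
```

The residuals here are measured around each column's own fitted line, so a column with a constant offset from the actual values can still show a small residual. Dividing `sq_resid` by N gives a mean squared residual if an average is preferred.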
