Consider the following definition of the cost function used for gradient descent:

$$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

with the hypothesis function defined as

$$h_\theta(x) = \theta^T x$$

What I've come up with for multivariate linear regression is:

theta = theta - alpha * 1/m * ([theta', -1]*[X';y']*X)';
h_theta = 1/(2*m)* (X*theta - y)'*(X*theta-y);

(Octave notation: ' means matrix transpose, [A, n] appends the scalar n to matrix A as a new column, and [A; B] appends matrix B below matrix A row-wise.)

It does its job correctly as far as I can tell (the plots look OK), but I have a strong feeling that it's unnecessarily complicated.

How can I write it with as few matrix operations as possible (and no element-wise operations, of course)?

2 Answers

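I don't think it's unnecessarily complicated; rather, this is exactly what you want. Matrix operations are good because you don't have to loop over elements or do element-wise operations yourself. I remember taking an online course, and my solution looked similar.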
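For reference, the bracketed product in the question's update, [theta', -1]*[X';y'], expands to (X*theta - y)', so the same step can also be written with the residual computed once and reused for the cost. A minimal Octave sketch, assuming X is m-by-n with a leading column of ones, y is m-by-1 and theta is n-by-1; the toy data, alpha, and the names r, step_q, step_r and J are made up for illustration:

```octave
% Hypothetical toy data: 4 samples, an intercept column plus one feature.
X = [1 1; 1 2; 1 3; 1 4];
y = [2; 4; 6; 8];
m = size(X, 1);
theta = zeros(size(X, 2), 1);
alpha = 0.1;                      % assumed learning rate

% Step exactly as written in the question.
step_q = alpha * 1/m * ([theta', -1] * [X'; y'] * X)';

% Same quantity with the residual computed once.
r = X * theta - y;                % m-by-1 residual
step_r = (alpha / m) * (X' * r);

% The cost reuses the same residual.
J = (r' * r) / (2 * m);

% Largest difference between the two forms of the step.
disp(max(abs(step_q - step_r)));

theta = theta - step_r;           % one gradient descent update
```

Both forms remain fully vectorized; the second only avoids building the two concatenated matrices.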

Answered 2013-05-06T04:34:36.213

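The way you have it is the most efficient way, because it is fully vectorized. It could be done with for loops over the summations and so on, but that is very inefficient in terms of processing power.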
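To make the comparison concrete, here is a rough Octave sketch of the gradient computed both with explicit loops over the summation and in vectorized form. The toy data and the names grad_loop and grad_vec are assumptions for illustration, not taken from the question:

```octave
% Hypothetical toy data: 4 samples, an intercept column plus one feature.
X = [1 1; 1 2; 1 3; 1 4];
y = [2; 4; 6; 8];
theta = [0.5; 0.5];
m = size(X, 1);
n = size(X, 2);

% Loop version: accumulate the sum over samples for each parameter.
grad_loop = zeros(n, 1);
for i = 1:m
  err = X(i, :) * theta - y(i);   % prediction error for sample i
  for j = 1:n
    grad_loop(j) = grad_loop(j) + err * X(i, j);
  end
end
grad_loop = grad_loop / m;

% Vectorized version: the same sums as a single matrix expression.
grad_vec = X' * (X * theta - y) / m;

% Largest difference between the two results.
disp(max(abs(grad_loop - grad_vec)));
```

The two results agree up to rounding; the loop version simply does the same arithmetic one element at a time in the interpreter.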

Answered 2013-05-07T04:33:42.927