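Given the following definition of the gradient descent cost function,
$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$$
with the hypothesis function defined as
$$h_\theta(x) = \theta^T x$$
what I've come up with for multivariate linear regression is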
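% gradient descent update: X is m-by-n, y is m-by-1, theta is n-by-1
theta = theta - alpha * 1/m * ([theta', -1]*[X';y']*X)';
% value of the cost function at the updated theta (this is the cost J, not the hypothesis)
J = 1/(2*m)* (X*theta - y)'*(X*theta-y);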
(Octave notation: ' means matrix transpose, [A, n] means adding a new column to matrix A with scalar value n, and [A; B] means appending matrix B to matrix A row-wise.)
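For context, here is a minimal sketch of how the update and cost expressions above could be run in a loop. The data set, the learning rate alpha, the iteration count num_iters and the name J_history are assumptions made up for illustration; they are not part of my actual setup.

```octave
% Hypothetical synthetic data: y = 1 + 2*x exactly, no noise
m = 50;
x = linspace(0, 1, m)';
X = [ones(m, 1), x];          % first column of ones for the intercept term
y = 1 + 2 * x;

alpha = 0.5;                  % assumed learning rate
num_iters = 2000;             % assumed number of iterations
theta = zeros(2, 1);          % start from all-zero parameters
J_history = zeros(num_iters, 1);

for k = 1:num_iters
  % update step exactly as written above:
  % [theta', -1] is 1-by-3, [X'; y'] is 3-by-m, X is m-by-2
  theta = theta - alpha * 1/m * ([theta', -1] * [X'; y'] * X)';
  % cost at the updated theta, stored so it can be plotted afterwards
  J_history(k) = 1/(2*m) * (X*theta - y)' * (X*theta - y);
end

disp(theta');                 % fitted parameters as a row vector
```

Plotting J_history against the iteration number is the kind of plot I have been using to judge that the update behaves sensibly.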
The code does its job correctly as far as I can tell (the plots look OK), but I have a strong feeling that it is unnecessarily complicated.
How can I write it with as few matrix operations as possible (and no element-wise operations, of course)?