matlab - [Octave] Using fminunc does not always give a consistent solution

Question

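I am trying to find the coefficients of an equation that models the step response of a motor, which has the form 1-e^x. The equation I am using for the model is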

a(1)*t^2 + a(2)*t^3 + a(3)*t^4 + ...

(It comes from a research paper on solving for motor parameters.)

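Sometimes using fminunc to find the coefficients works well, and I get a good result that matches the training data closely. Other times the returned coefficients are terrible (the output is far higher than it should be, off by several orders of magnitude). This happens especially once I start using higher-order terms: any model that uses x^8 or higher (x^9, x^10, x^11, etc.) always gives bad results.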

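Since it works some of the time, I can't see why my implementation would be wrong. I have tried fminunc both with and without supplying the gradient, but it makes no difference. I have looked into other functions for solving for the coefficients, such as polyfit, but that requires a term for every power from 1 up to the highest order, whereas the lowest power in my model is 2.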
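Because the model is linear in its coefficients, one alternative to polyfit is to build the design matrix from only the powers that are wanted and solve the least-squares problem directly with the backslash operator. The following is a minimal sketch of that idea, not the asker's method; the data is synthetic, and t, y, and a_true are made-up names and values standing in for the contents of testdata.mat.

```octave
% Synthetic stand-in for the measured step response (assumption, not the real data)
t = linspace(0, 1, 50)';
a_true = [0.5; -0.3; 0.1];      % hypothetical coefficients for t^2, t^3, t^4
exps = 2:4;                     % lowest power is 2, as in the model above

% Column k of X holds t.^exps(k); bsxfun broadcasts the column against the row
X = bsxfun(@power, t, exps);
y = X * a_true;

% Linear least squares: minimizes norm(X*a - y) without an iterative optimizer
a_fit = X \ y;
disp(a_fit');
```

With noise-free synthetic data the recovered coefficients should agree with a_true up to rounding; with real measurements the result is the least-squares fit for the chosen powers.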

Here is the main code:

clear;

%Overall Constants
max_power = 7;

%Loads in data
%data = load('TestData.txt');
load testdata.mat

%Sets data into variables
indep_x = data(:,1); Y = data(:,2);

%number of data points
m = length(Y);

%X is a matrix with the independent variable
exps = [2:max_power];
X_prime = repmat(indep_x, 1, max_power-1); %Repeats columns of the indep var
X = bsxfun(@power, X_prime, exps);

%Initializes theta to rand vals
init_theta = rand(max_power-1,1);

%Sets up options for fminunc
options = optimset( 'MaxIter', 400, 'Algorithm', 'quasi-newton');

%fminunc minimizes the output of the cost function by changing the theta parameters
[theta, cost] = fminunc(@(t)(costFunction(t, X, Y)), init_theta, options)

%Model output for the fitted theta
Y_line = X * theta;

figure;
hold on; plot(indep_x, Y, 'or');
hold on; plot(indep_x, Y_line, 'bx');

Here is costFunction:

function [J, Grad] = costFunction (theta, X, Y)
   %# of training examples

   m = length(Y);

    %Initialize Cost and Grad-Vector
    J = 0;
    Grad = zeros(size(theta));

    %Produces an output based on the current values of theta
    model_output = X * theta;

    %Computes the squared error for each example then adds them to get the total error
    squared_error = (model_output - Y).^2;
    J = (1/(2*m)) * sum(squared_error);

    %Computes the gradients for each theta t
    for t = 1:size(theta, 1)
        Grad(t) = (1/m) * sum((model_output-Y) .* X(:, t));
    end

endfunction
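The gradient loop in costFunction can also be written as a single matrix product, and either form can be checked against finite differences. The sketch below assumes a small made-up problem (t, X, Y, and theta here are illustrative, not the real data) and uses anonymous functions so it runs as a standalone script.

```octave
% Small made-up problem (assumption: stands in for the real X, Y, theta)
t = linspace(0, 1, 20)';
X = bsxfun(@power, t, 2:4);
Y = 1 - exp(-3 * t);
theta = [0.1; 0.1; 0.1];
m = length(Y);

% Cost and gradient in vectorized form
cost = @(th) sum((X * th - Y) .^ 2) / (2 * m);
grad = @(th) (X' * (X * th - Y)) / m;

% Loop form as in costFunction, for comparison
grad_loop = zeros(size(theta));
for k = 1:numel(theta)
    grad_loop(k) = (1 / m) * sum((X * theta - Y) .* X(:, k));
end

% Central finite differences as an independent check of the gradient
h = 1e-6;
grad_fd = zeros(size(theta));
for k = 1:numel(theta)
    e = zeros(size(theta));
    e(k) = h;
    grad_fd(k) = (cost(theta + e) - cost(theta - e)) / (2 * h);
end

fprintf('max |vectorized - loop| = %g\n', max(abs(grad(theta) - grad_loop)));
fprintf('max |vectorized - finite diff| = %g\n', max(abs(grad(theta) - grad_fd)));
```

Note that the options in the main code do not set 'GradObj'. As far as I know, fminunc only uses the second output of the cost function when that option is 'on', for example optimset('GradObj', 'on', 'MaxIter', 400); otherwise it estimates the gradient numerically.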

Any help or advice would be greatly appreciated.

score 1 · Accepted Answer

Try adding regularization to your costFunction:

function [J, Grad] = costFunction (theta, X, Y, lambda)
    m = length(Y);

    %Initialize Cost and Grad-Vector
    J = 0;
    Grad = zeros(size(theta));

    %Produces an output based on the current values of theta
    model_output = X * theta;

    %Computes the squared error for each example then adds them to get the total error
    squared_error = (model_output - Y).^2;
    J = (1/(2*m)) * sum(squared_error);
    % Regularization
    J = J + lambda*sum(theta(2:end).^2)/(2*m);


    %Computes the gradients for each theta t
    regularizator = lambda*theta/m;
    % overwrite 1st element i.e the one corresponding to theta zero
    regularizator(1) = 0;
    for t = 1:size(theta, 1)
        Grad(t) = (1/m) * sum((model_output-Y) .* X(:, t)) + regularizator(t);
    end

endfunction

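The regularization term lambda penalizes large values of theta, which restrains how aggressively the model fits the data. Start with lambda = 1. The larger lambda is, the more strongly the coefficients are held back. If the behavior you describe persists, increase lambda. If lambda gets high, you may need to increase the number of iterations. You might also consider normalizing your data, and using some heuristic for initializing theta: setting all theta to 0.1 may work better than random values. If nothing else, it will give better reproducibility from one training run to the next.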
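The normalization and fixed-initialization suggestions can be sketched as follows. This is only an illustration under stated assumptions: the data is made up, and scaling each column by its maximum absolute value is one possible choice of normalization, not something the answer specifies. The names col_scale, X_scaled, and theta_scaled are hypothetical.

```octave
% Made-up data standing in for testdata.mat (assumption)
t = linspace(0, 5, 100)';
Y = 1 - exp(-t);
exps = 2:7;
X = bsxfun(@power, t, exps);
m = length(Y);

% Scale each column so its largest absolute value is 1
col_scale = max(abs(X), [], 1);
X_scaled = bsxfun(@rdivide, X, col_scale);
fprintf('cond(X) = %g, cond(X_scaled) = %g\n', cond(X), cond(X_scaled));

% Fixed starting point instead of rand, so repeated runs begin identically
init_theta = 0.1 * ones(numel(exps), 1);

% Same squared-error cost as costFunction, on the scaled columns
cost = @(th) sum((X_scaled * th - Y) .^ 2) / (2 * m);
options = optimset('MaxIter', 400);
[theta_scaled, final_cost] = fminunc(cost, init_theta, options);

% Map the coefficients back to the unscaled powers of t
theta = theta_scaled ./ col_scale';
Y_line = X * theta;
```

The printed condition numbers give a rough sense of how much the scaling helps; high powers of an unscaled variable span many orders of magnitude, which is a plausible reason the higher-order models in the question behave badly.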
