
I am trying to use a modified version of Phil Brierley's (www.philbrierley.com) neural network backpropagation code. It works perfectly when I apply it to the XOR problem, but when I try a problem of the form output = x1^2 + x2^2 (output = sum of squares of the inputs), the results are not accurate. I have scaled the inputs and outputs to lie between -1 and 1. Each time I run the same program I get different results (I understand this is due to the random weight initialization), but the results differ wildly. I have tried varying the learning rate, but the results still converge to similarly inaccurate values.
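(For reference, the scaling is a plain min-max mapping from the raw range onto [c, d] = [-1, 1]; a quick sanity check of the same formula with the input limits used below:)

% min-max scaling from [-10, 10] to [-1, 1]: maps -10 -> -1, 0 -> 0, 10 -> 1
x  = [-10 0 10];
xs = ((1 - (-1))*(x - (-10)) ./ (10 - (-10))) + (-1)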

The code is given below:

%---------------------------------------------------------
% MATLAB neural network backprop code
% by Phil Brierley
%--------------------------------------------------------
clear; clc; close all;


%user specified values
hidden_neurons = 4;
epochs = 20000;

input = [];
for i = -10:2.5:10
    for j = -10:2.5:10
        input = [input; i j];
    end
end

output  = (input(:,1).^2 + input(:,2).^2);
output1 = output;

% Input range [m1, m2], output range [m3, m4], and target scaling interval [c, d]
m1  = -10;   m2  = 10;
m3  = 0;     m4  = 250;
c = -1;      d   = 1;

%Scale input and output
for i = 1:size(input,2)
    I = input(:,i);
    scaledI = ((d-c)*(I-m1) ./ (m2-m1)) + c;
    input(:,i) = scaledI;
end
for i = 1:size(output,2)
    I = output(:,i);
    scaledI = ((d-c)*(I-m3) ./ (m4-m3)) + c;
    output(:,i) = scaledI;
end

train_inp = input;
train_out = output;

%read how many patterns and add bias
patterns  = size(train_inp,1);
train_inp = [train_inp ones(patterns,1)];

%read how many inputs and initialize learning rate
inputs = size(train_inp,2);
hlr    = 0.1;

%set initial random weights
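%(randn draws zero-mean Gaussian noise, so after the -0.5 shift and /10 scaling
% the initial weights are small and slightly negative on average)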
weight_input_hidden = (randn(inputs,hidden_neurons) - 0.5)/10;
weight_hidden_output = (randn(1,hidden_neurons) - 0.5)/10;

%Training
err = zeros(1,epochs);

for iter = 1:epochs

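    %two learning rates: blr = alr/10 is used for the hidden-output weights,
    %alr for the input-hidden weights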
    alr = hlr;
    blr = alr / 10;

    %loop through the patterns, selecting randomly
    for j = 1:patterns

        %select a random pattern
        patnum = round((rand * patterns) + 0.5);
        if patnum > patterns
            patnum = patterns;
        elseif patnum < 1
            patnum = 1;
        end

        %set the current pattern
        this_pat = train_inp(patnum,:);
        act      = train_out(patnum,1);

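        %forward pass: tanh on the hidden layer, linear output unit
        %(no squashing on the output, so predictions are unbounded)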
        %calculate the current error for this pattern
        hval  = (tanh(this_pat*weight_input_hidden))';
        pred  = hval'*weight_hidden_output';
        error = pred - act;

        % adjust weight hidden - output
        delta_HO = error.*blr.*hval;
        weight_hidden_output = weight_hidden_output - delta_HO';

        % adjust the weights input - hidden
        delta_IH = alr.*error.*weight_hidden_output'.*(1-(hval.^2))*this_pat;
        weight_input_hidden = weight_input_hidden - delta_IH';

    end
    % -- another epoch finished

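    %(err is the Euclidean norm of the per-pattern errors,
    % i.e. the square root of the summed squared error)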
    %compute overall network error at end of each epoch
    pred      = weight_hidden_output*tanh(train_inp*weight_input_hidden)';
    error     = pred' - train_out;
    err(iter) = sqrt(sum(error.^2));

    %stop if error is small
    if err(iter) < 0.001
        fprintf('converged at epoch: %d\n',iter);
        break
    end

end

%Output after training
pred  = weight_hidden_output*tanh(train_inp*weight_input_hidden)';
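%invert the min-max scaling: map predictions from [c,d] = [-1,1] back to [m3,m4] = [0,250]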
Y     = m3 + (m4-m3)*(pred-c)./(d-c);

% Testing on a new set of inputs
input_test  = [6 -3.1; 0.5 1; -2 3; 3 -2; -4 5; 0.5 4; 6 1.5];
output_test = (input_test(:,1).^2 + input_test(:,2).^2);
input1      = input_test;

%Scale input
for i = 1:size(input1,2)
    I = input1(:,i);
    scaledI = ((d-c)*(I-m1) ./ (m2-m1)) + c;
    input1(:,i) = scaledI;
end

%Predict output
train_inp1 = input1;
patterns   = size(train_inp1,1);
bias       = ones(patterns,1);
train_inp1 = [train_inp1 bias];
pred1      = weight_hidden_output*tanh(train_inp1*weight_input_hidden)';

%Rescale
Y1  = m3 + (m4-m3)*(pred1-c)./(d-c);

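%side-by-side comparison: column 1 = analytical output, column 2 = network prediction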
analy_numer = [output_test Y1']
plot(err)

Here is an example of the problematic output I get:

State after 20000 epochs:

analy_numer =

45.6100   46.3174
1.2500   -2.9457
13.0000   11.9958
13.0000    9.7097
41.0000   44.9447
16.2500   17.1100
38.2500   43.9815

If I run it again, I get different results. As can be seen, for small input values I get completely wrong answers (impossible negative values), and for the other values the accuracy is still poor.

Can someone tell me what I am doing wrong and how to correct it?

Thanks,
Raman
