我正在训练一个神经网络(在 C++ 中,没有任何额外的库),以学习一个随机摆动函数:
f(x)=0.2+0.4x2+0.3sin(15x)+0.05cos(50x)
在 Python 中绘制为:
lim = 500
for i in range(lim):
x.append(i)
p = 2*3.14*i/lim
y.append(0.2+0.4*(p*p)+0.3*p*math.sin(15*p)+0.05*math.cos(50*p))
plt.plot(x,y)
对应的曲线为:
同一个神经网络已经成功地用一个隐藏层(5 个神经元) tanh activation很好地逼近了正弦函数。但是,我无法理解 wiggly 函数出了什么问题。尽管均方误差似乎有所下降。(**为了可见性,误差已按比例放大 100):
我怀疑正常化。我是这样做的:
生成的训练数据为:
int numTrainingSets = 100;
double MAXX = -9999999999999999;
for (int i = 0; i < numTrainingSets; i++)
{
double p = (2*PI*(double)i/numTrainingSets);
training_inputs[i][0] = p; //INSERTING DATA INTO i'th EXAMPLE, 0th INPUT (Single input)
training_outputs[i][0] = 0.2+0.4*pow(p, 2)+0.3*p*sin(15*p)+0.05*cos(50*p); //Single output
///FINDING NORMALIZING FACTOR (IN INPUT AND OUTPUT DATA)
for(int m=0; m<numInputs; ++m)
if(MAXX < training_inputs[i][m])
MAXX = training_inputs[i][m]; //FINDING MAXIMUM VALUE IN INPUT DATA
for(int m=0; m<numOutputs; ++m)
if(MAXX < training_outputs[i][m])
MAXX = training_outputs[i][m]; //FINDING MAXIMUM VALUE IN OUTPUT DATA
///NORMALIZE BOTH INPUT & OUTPUT DATA USING THIS MAXIMUM VALUE
////DO THIS FOR INPUT TRAINING DATA
for(int m=0; m<numInputs; ++m)
training_inputs[i][m] /= MAXX;
////DO THIS FOR OUTPUT TRAINING DATA
for(int m=0; m<numOutputs; ++m)
training_outputs[i][m] /= MAXX;
}
这就是模型训练的内容。验证/测试数据生成如下:
int numTestSets = 500;
for (int i = 0; i < numTestSets; i++)
{
//NORMALIZING TEST DATA USING THE SAME "MAXX" VALUE
double p = (2*PI*i/numTestSets)/MAXX;
x.push_back(p); //FORMS THE X-AXIS FOR PLOTTING
///Actual Result
double res = 0.2+0.4*pow(p, 2)+0.3*p*sin(15*p)+0.05*cos(50*p);
y1.push_back(res); //FORMS THE GREEN CURVE FOR PLOTTING
///Predicted Value
double temp[1];
temp[0] = p;
y2.push_back(MAXX*predict(temp)); //FORMS THE RED CURVE FOR PLOTTING, scaled up to de-normalize
}
这正常化对吗?如果是,可能会出什么问题?如果没有,应该怎么做?