neural-network - Cross Entropy, Softmax and the derivative term in Backpropagation

Question

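I'm currently interested in using the Cross Entropy error when running the BackPropagation algorithm for classification, with the Softmax activation function in my output layer.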

From what I gather, with Cross Entropy and Softmax you can drop the derivative term, so that the output error looks like this:

Error = targetOutput[i] - layerOutput[i]

This is different from the Mean Squared Error, which looks like this:

Error = Derivative(layerOutput[i]) * (targetOutput[i] - layerOutput[i])

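So, is it only when the output layer uses the Softmax activation function for classification with Cross Entropy that you can drop the derivative term? For example, if I were to do regression with the Cross Entropy error (using, say, the TANH activation function), I would still need to keep the derivative term, correct?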

I haven't been able to find an explicit answer to this, and I haven't tried to work out the math myself either (I'm rusty).

score 1 · Accepted Answer

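You do not use the derivative term in the output layer because there you get the 'real' error (the difference between your output and your target). With Softmax and Cross Entropy in particular, the derivative of the Softmax cancels against the derivative of the loss, so only target minus output remains. In the hidden layers you have to calculate an approximate error using backpropagation.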
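A quick numerical check can make this concrete. The sketch below is only an illustration, not code from the question: the helper names (`softmax`, `cross_entropy`, `mse_tanh`, `numerical_grad`) and the sample values of `z` and `t` are made up, and the losses are assumed to be `-sum(t * log(y))` and `0.5 * sum((t - y)**2)`. It compares finite-difference gradients with respect to the output layer's pre-activations against the two `Error` formulas from the question.

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def cross_entropy(z, t):
    # Assumed loss: cross entropy on Softmax outputs, z are pre-activations.
    return -np.sum(t * np.log(softmax(z)))

def mse_tanh(z, t):
    # Assumed loss: half squared error on TANH outputs.
    return 0.5 * np.sum((t - np.tanh(z)) ** 2)

def numerical_grad(f, z, eps=1e-6):
    # Central finite differences, one coordinate at a time.
    g = np.zeros_like(z)
    for i in range(z.size):
        step = np.zeros_like(z)
        step[i] = eps
        g[i] = (f(z + step) - f(z - step)) / (2 * eps)
    return g

z = np.array([0.5, -1.2, 2.0])   # hypothetical output-layer pre-activations
t = np.array([0.0, 0.0, 1.0])    # hypothetical one-hot target

# Softmax + Cross Entropy: the negative gradient is just target - output.
y = softmax(z)
grad_ce = numerical_grad(lambda v: cross_entropy(v, t), z)
ce_matches_plain_error = np.allclose(-grad_ce, t - y, atol=1e-6)

# TANH + squared error: the derivative term (1 - tanh^2) has to stay.
out = np.tanh(z)
grad_mse = numerical_grad(lambda v: mse_tanh(v, t), z)
mse_matches_with_derivative = np.allclose(-grad_mse, (1 - out ** 2) * (t - out), atol=1e-6)
mse_matches_plain_error = np.allclose(-grad_mse, t - out, atol=1e-6)

print(ce_matches_plain_error, mse_matches_with_derivative, mse_matches_plain_error)
```

Under these assumptions the derivative term disappears only for the Softmax and Cross Entropy pairing; for the TANH output with squared error, leaving it out gives a different (wrong) gradient.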

What we are doing there is an approximation: we take the derivative of the next layer's error with respect to the weights of the current layer, instead of using the error of the current layer itself (which is unknown).
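As a rough sketch of that hidden-layer step, assume a tiny network with one TANH hidden layer and a Softmax output trained with Cross Entropy. The layer sizes, the weights `W1` and `W2`, the input `x` and the target `t` are all invented for illustration. The output error is propagated back through `W2` and multiplied by the hidden activation's derivative, and the result is checked against finite differences.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

def forward(W1, W2, x):
    h = np.tanh(W1 @ x)       # hidden layer activations
    y = softmax(W2 @ h)       # output layer probabilities
    return h, y

def loss(W1, W2, x, t):
    _, y = forward(W1, W2, x)
    return -np.sum(t * np.log(y))   # cross entropy

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 3))        # hypothetical input -> hidden weights
W2 = rng.normal(size=(2, 4))        # hypothetical hidden -> output weights
x = np.array([0.2, -0.7, 1.5])
t = np.array([1.0, 0.0])

h, y = forward(W1, W2, x)

# Output layer: no derivative term for Softmax + Cross Entropy.
delta_out = t - y
# Hidden layer: error passed back through W2, times the TANH derivative.
delta_hidden = (W2.T @ delta_out) * (1.0 - h ** 2)
# Negative gradient of the loss with respect to W1.
neg_grad_W1 = np.outer(delta_hidden, x)

# Finite-difference check of the same quantity.
eps = 1e-6
numeric = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        step = np.zeros_like(W1)
        step[i, j] = eps
        numeric[i, j] = (loss(W1 + step, W2, x, t) - loss(W1 - step, W2, x, t)) / (2 * eps)

hidden_delta_ok = np.allclose(-numeric, neg_grad_W1, atol=1e-6)
print(hidden_delta_ok)
```

Note that the derivative term is still needed for the hidden layer in this sketch; only the output layer's delta simplifies to target minus output.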

Best regards,
