
I am very new to neural networks and want to know why all the RNN examples, especially char-rnns, use the cross-entropy loss as their loss function. I have googled but can't seem to find any discussion of the function in this context. I have been asked to motivate its use and to look at its advantages and disadvantages, so any papers or material I could read would be greatly appreciated.


1 Answer


Many sequence-to-sequence RNNs, and char-rnn in particular, produce their output one item at a time, in other words by solving a classification problem at each time step.
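
To make "classification at each time step" concrete, here is a minimal NumPy sketch with made-up logits and target indices: each step emits a logit vector over the character vocabulary, and the per-step cross-entropy is the negative log-probability assigned to the true next character.

```python
import numpy as np

# Made-up example: logits over a 4-character vocabulary,
# one row per time step of the RNN.
logits = np.array([[2.0, 0.5, -1.0, 0.1],    # time step 0
                   [0.2, 1.5,  0.3, -0.5]])  # time step 1
targets = np.array([0, 1])  # index of the true next character at each step

# Softmax turns each row of logits into a probability distribution.
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

# Cross-entropy per step = -log(probability of the true character),
# averaged over the time steps to give the sequence loss.
loss = -np.mean(np.log(probs[np.arange(len(targets)), targets]))
print(loss)
```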

Cross-entropy loss is the standard choice for classification, no matter whether it's a convolutional neural network (example), a recurrent neural network (example) or an ordinary feed-forward neural network (example). If you were to write an RNN that solves a regression problem, you'd use a different loss function, such as the L2 loss.

All of the examples above use the tf.nn.softmax_cross_entropy_with_logits loss.
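
As a rough sketch of how that call is typically wired up (the numbers are made up, and the snippet assumes TF 2.x eager execution):

```python
import tensorflow as tf

# Made-up per-step logits over a 4-character vocabulary and the
# corresponding one-hot targets, flattened to [steps, vocab_size].
logits = tf.constant([[2.0, 0.5, -1.0, 0.1],
                      [0.2, 1.5,  0.3, -0.5]])
labels = tf.one_hot([0, 1], depth=4)

# Per-step cross-entropy computed directly from raw logits
# (no explicit softmax), then averaged into a scalar training loss.
per_step = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
loss = tf.reduce_mean(per_step)
print(loss.numpy())
```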

Answered 2017-10-31T14:43:55.527