
I am trying to implement a self-written loss function. My pipeline is as follows:

x -> {constant computation} = x_feature -> machine learning training -> y_feature -> {constant computation} = y_produced

These "constant computations" are necessary to measure the difference between the desired output and the produced output.

So if I take the L2 norm of y_produced and y_original, how should I incorporate this loss into the original loss?

Note that y_produced has different dimensions than y_feature.
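
For concreteness, a minimal sketch of such a pipeline, assuming PyTorch; the shapes and the fixed matrices A and B below are hypothetical stand-ins for the "constant computations", not the asker's actual transforms:

import torch

torch.manual_seed(0)

# Hypothetical fixed ("constant") transforms; shapes are illustrative only.
A = torch.randn(16, 8)            # constant computation: x -> x_feature
B = torch.randn(4, 10)            # constant computation: y_feature -> y_produced

model = torch.nn.Linear(16, 10)   # the learnable part: x_feature -> y_feature

def forward(x):                   # x: (batch, 8)
    x_feature = x @ A.T           # (batch, 16), fixed transform
    y_feature = model(x_feature)  # (batch, 10), learnable
    y_produced = y_feature @ B.T  # (batch, 4), a different size than y_feature
    return y_produced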


1 Answer


As long as you are using differentiable operations, there is no difference between "constant transformations" and "learnable" ones. There is no such distinction; consider even the linear layer of a neural net:

f(x) = sigmoid( W * x + b )

Is it constant or learnable? W and b are trained, but "sigmoid" is not, yet the gradient flows the same way regardless of whether something is a variable. In particular, the gradient with respect to x is the same for

g(x) = sigmoid( A * x + c )

where A and c are constants.
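
A quick way to see this, assuming PyTorch (an illustrative sketch, not code from the question): leave A and c as plain tensors without requires_grad, and the gradient with respect to x is still computed as usual:

import torch

x = torch.randn(5, requires_grad=True)
A = torch.randn(3, 5)          # constant: no requires_grad
c = torch.randn(3)             # constant: no requires_grad

g = torch.sigmoid(A @ x + c)   # the "constant" layer g(x)
g.sum().backward()             # gradient still flows back to x
print(x.grad)                  # well-defined, not None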

The only problem you will encounter is using non-differentiable operations, such as argmax, sorting, indexing, or sampling. These operations do not have a well-defined gradient, so you cannot directly use first-order optimisers with them. As long as you stick to differentiable operations, the problem described does not really exist: there is no difference between "constant transformations" and any other transformations, regardless of changes of size and so on.
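
Putting it together, a hedged sketch of training against the L2 loss taken after the constant post-processing; all names, shapes, and the transforms A and B here are assumptions for illustration:

import torch

torch.manual_seed(0)
A = torch.randn(16, 8)                    # constant: x -> x_feature
B = torch.randn(4, 10)                    # constant: y_feature -> y_produced
model = torch.nn.Linear(16, 10)           # learnable middle part
opt = torch.optim.SGD(model.parameters(), lr=1e-2)

x = torch.randn(32, 8)                    # a batch of inputs
y_original = torch.randn(32, 4)           # targets in y_produced space

for step in range(100):
    y_produced = model(x @ A.T) @ B.T     # constant -> learnable -> constant
    loss = ((y_produced - y_original) ** 2).mean()   # L2 loss on y_produced
    opt.zero_grad()
    loss.backward()                       # gradient passes through B unchanged
    opt.step()

Note that the loss is defined on y_produced even though the network's direct output is y_feature of a different size; as long as the y_feature -> y_produced map is differentiable, that size mismatch is irrelevant.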

Answered 2017-07-06T23:02:21.723