caffe - Does caffe apply the regularization parameter to the biases?

Question

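I have several questions about how regularization and biases work in caffe.

First, biases exist in the network by default, right? Or do I need to ask caffe to add them?

Second, when caffe computes the loss value, it does not take regularization into account. Is that right? I mean the loss contains only the loss function value. As far as I understand, regularization is only considered in the gradient computation. Is that correct?

Third, when caffe computes the gradient, does it also include the bias values in the regularization? Or does it only include the network weights?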

Thanks in advance,

Afshin

score 1 · Accepted Answer

For your 3 questions, my answers are:

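Yes. Biases do exist in the network by default. For example, in ConvolutionParameter and InnerProductParameter in caffe.proto, the default value of bias_term is true, which means that convolution/innerproduct layers in the network have a bias by default.
Yes. The loss value produced by the loss layer does not include the regularization term. Regularization is only taken into account after net_->ForwardBackward() has been called, namely in the ApplyUpdate() function, where the network parameters are updated.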
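The second point can be illustrated with a minimal Python sketch. This is not Caffe code: the function names, the one-parameter squared-error model, and the numbers are all made up for illustration. It only assumes what is described above, namely that the reported loss is the data term alone and that the L2 decay term is added to the gradient at update time.

```python
def data_loss_and_grad(w, x, y):
    """Squared-error loss 0.5 * (w*x - y)^2 and its gradient w.r.t. w."""
    err = w * x - y
    return 0.5 * err * err, err * x


def sgd_step(w, x, y, lr, weight_decay):
    """One toy SGD step (hypothetical helper, not a Caffe API)."""
    # Forward/backward phase: the reported loss is the data term only.
    loss, grad = data_loss_and_grad(w, x, y)
    # Update phase: the L2 decay term is added to the gradient here,
    # so it changes the new parameter value but not the reported loss.
    grad = grad + weight_decay * w
    return loss, w - lr * grad


loss_plain, w_plain = sgd_step(2.0, 1.0, 1.0, lr=0.5, weight_decay=0.0)
loss_decay, w_decay = sgd_step(2.0, 1.0, 1.0, lr=0.5, weight_decay=0.1)
print(loss_plain, loss_decay)  # same reported loss in both cases
print(w_plain, w_decay)        # different updated parameter
```

With and without decay the reported loss is identical; only the updated parameter differs.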

Take a convolution layer in the network as an example:

layer {
  name: "SomeLayer"
  type: "Convolution"
  bottom: "data"
  top: "conv"
  # for weights
  param {
    lr_mult: 1
    decay_mult: 1.0  # regularization coefficient for the weights
                     # (1.0 is the default; written out here for clarity)
  }
  # for bias
  param {
    lr_mult: 2
    decay_mult: 1.0  # regularization coefficient for the bias
                     # (1.0 is the default; written out here for clarity)
  }
  ...  # remaining fields omitted
}

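For the third question, the answer is: when caffe computes the gradient, the solver includes the bias values in the regularization only when 2 variables are both greater than zero: the second decay_mult in the layer above (the one in the bias param block) and weight_decay in solver.prototxt.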

Details can be found in the function void SGDSolver::Regularize().
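As a rough sketch of that rule in plain Python (not the actual C++ implementation): the helper name regularized_gradient and the sample values are hypothetical, and only the L2 case is shown. It assumes the per-parameter decay is the product of the solver's weight_decay and the parameter's decay_mult, and that regularization is skipped when that product is zero.

```python
def regularized_gradient(data, diff, weight_decay, decay_mult):
    """Return the gradient after adding an L2 decay term.

    data:         current parameter values (weights or biases)
    diff:         gradient of the data loss w.r.t. those parameters
    weight_decay: global coefficient, as set in solver.prototxt
    decay_mult:   per-parameter multiplier, as set in the layer's param block
    """
    local_decay = weight_decay * decay_mult
    if local_decay == 0:
        # Either factor being zero switches regularization off.
        return list(diff)
    # L2 case: add local_decay * parameter to each gradient entry.
    return [g + local_decay * w for w, g in zip(data, diff)]


bias = [0.5, -1.0]
bias_grad = [0.1, 0.2]

# Bias is regularized: both weight_decay and decay_mult are positive.
print(regularized_gradient(bias, bias_grad, weight_decay=0.0005, decay_mult=1.0))

# Bias is left alone: decay_mult of the bias param block is 0.
print(regularized_gradient(bias, bias_grad, weight_decay=0.0005, decay_mult=0.0))
```

So to exclude the biases from regularization, setting decay_mult: 0 in the second param block is enough, whatever weight_decay is in the solver.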

Hope this helps.
