
I am trying to understand the basics of Caffe, in particular how to use it with Python.

My understanding is that the model definition (say a given neural net architecture) must be included in the '.prototxt' file.

And that when you train the model on data using the '.prototxt', you save the weights/model parameters to a '.caffemodel' file.

Also, there is a difference between the '.prototxt' file used for training (which includes learning rate and regularization parameters) and the one used for testing/deployment, which does not include them.

Questions:

  1. Is it correct that the '.prototxt' is the basis for training and that the '.caffemodel' is the result of training (the weights), obtained by using the '.prototxt' on the training data?
  2. Is it correct that there is one '.prototxt' for training and one for testing, and that there are only slight differences between them (learning rate and regularization factors for training), but that the NN architecture (assuming you use neural nets) is the same?

Apologies for such basic questions and possibly some very incorrect assumptions. I am doing some online research, and the lines above summarize my understanding to date.


2 Answers


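Let's look at one of the examples provided with BVLC/caffe: bvlc_reference_caffenet.
You'll notice that there are actually 3 '.prototxt' files: train_val.prototxt, deploy.prototxt and solver.prototxt.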

The net architectures represented by train_val.prototxt and deploy.prototxt should be largely the same. There are a few main differences between the two:

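  • Input data: during training, one usually trains and validates on a predefined set of inputs. Therefore, train_val usually contains an explicit input layer, e.g., an "HDF5Data" layer or a "Data" layer. deploy, on the other hand, usually does not know in advance what inputs it will get; it only contains a statement: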

    input: "data"
    input_shape {
      dim: 10
      dim: 3
      dim: 227
      dim: 227
    }
    

    It declares the input the net expects and its dimensions.
    Alternatively, you can place an "Input" layer:

    layer {
      name: "input"
      type: "Input"
      top: "data"
      input_param { shape { dim: 10 dim: 3 dim: 227 dim: 227 } }
    }
    
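  • Input labels: during training we supply the net with the "ground truth" expected outputs; this information is obviously not available during deploy.
  • Loss layer: during training, a loss layer must be defined. This layer tells the solver in which direction it should adjust the parameters at each iteration. The loss compares the net's current prediction to the expected "ground truth". The gradient of the loss is back-propagated to the rest of the net, and this is what drives the learning process. During deploy there is no loss and no back-propagation.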
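To make the role of the loss layer concrete, here is a minimal pure-Python sketch of what a softmax loss computes for a single sample. It is not Caffe's code: the function names and the example scores are made up for illustration, and it only assumes the standard softmax cross-entropy definition.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss_and_grad(scores, label):
    """Compare the prediction with the ground-truth label.

    Returns the loss and its gradient with respect to the scores.
    This gradient is what would be back-propagated during training.
    """
    probs = softmax(scores)
    loss = -math.log(probs[label])
    grad = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
    return loss, grad


# Hypothetical scores for a 3-class problem; the true class is 0.
loss, grad = softmax_loss_and_grad([2.0, 1.0, 0.1], label=0)
print("loss:", loss)
print("gradient w.r.t. scores:", grad)
```

At deploy time only the `softmax` part is relevant: there is no label, so there is nothing to compare against and no gradient to propagate.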

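In Caffe, you supply a train_val.prototxt describing the net, the train/val datasets and the loss. In addition, you supply a solver.prototxt describing the meta-parameters for training. The output of the training process is a .caffemodel binary file containing the trained parameters of the net.
Once the net is trained, you can use the deploy.prototxt together with the .caffemodel parameters to predict outputs for new and unseen inputs.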
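Since the question is about Python, the two steps might look roughly like this with pycaffe. This is only a sketch: the file names are placeholders, `snapshot_path` is a hypothetical helper that reflects how Caffe names its snapshots, and the input blob is assumed to be called "data" as in the deploy snippet above.

```python
def snapshot_path(snapshot_prefix, iteration):
    """Hypothetical helper: name of the snapshot Caffe writes at a given iteration."""
    return "{}_iter_{}.caffemodel".format(snapshot_prefix, iteration)


def train(solver_file):
    """Training: the solver file points at train_val.prototxt."""
    import caffe  # requires a working pycaffe installation

    solver = caffe.get_solver(solver_file)
    solver.solve()  # runs until max_iter, writing .caffemodel snapshots


def predict(deploy_file, weights_file, batch):
    """Deployment: architecture from deploy.prototxt, weights from .caffemodel."""
    import caffe  # requires a working pycaffe installation

    net = caffe.Net(deploy_file, weights_file, caffe.TEST)
    # Assumes `batch` is a NumPy array matching the declared input shape.
    net.blobs["data"].data[...] = batch
    return net.forward()  # dict mapping output blob names to arrays


if __name__ == "__main__":
    # Placeholder names; adjust to your own files.
    print(snapshot_path("caffenet_train", 450000))
```

A typical session would call `train("solver.prototxt")` once, then pass the resulting snapshot file to `predict` along with deploy.prototxt.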

answered 2016-01-24T07:46:58.950

Yes, but there are different kinds of .prototxt files, for example

https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_train_test.prototxt

This one is used for training and testing the network.

For command-line training, you can use a solver file, which is also a .prototxt file, for example

https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_solver.prototxt
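To give a feel for what such a solver file contains, here is a small Python sketch that renders one as text and builds the matching training command. The field names are standard Caffe solver parameters, but the values and file names are illustrative assumptions, not a copy of the linked file, and `render_solver` is a hypothetical helper.

```python
# Fields whose values are protobuf enums and must not be quoted.
ENUM_FIELDS = {"solver_mode"}


def render_solver(params):
    """Render a dict of solver settings in prototxt 'key: value' form."""
    lines = []
    for key, value in params.items():
        if isinstance(value, str) and key not in ENUM_FIELDS:
            value = '"{}"'.format(value)  # strings are quoted in prototxt
        lines.append("{}: {}".format(key, value))
    return "\n".join(lines) + "\n"


# Illustrative values only.
solver_text = render_solver({
    "net": "train_val.prototxt",   # which network definition to train
    "base_lr": 0.01,               # learning rate
    "lr_policy": "step",           # how the learning rate decays
    "gamma": 0.1,
    "stepsize": 100000,
    "momentum": 0.9,
    "weight_decay": 0.0005,        # regularization
    "max_iter": 450000,
    "snapshot": 10000,             # write a .caffemodel every N iterations
    "snapshot_prefix": "caffenet_train",
    "solver_mode": "CPU",
})
print(solver_text)

# Command line that would start training from a saved solver file.
command = ["caffe", "train", "--solver=solver.prototxt"]
print(" ".join(command))
```

Note that the learning rate and regularization settings live here, in the solver file, rather than in the network definition.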

answered 2016-01-24T07:54:19.223