python - deep learning - a number of naive questions about caffe

Question

I am trying to understand the basics of Caffe, in particular how to use it with Python.

My understanding is that the model definition (say a given neural net architecture) must be included in the '.prototxt' file.

And that when you train the model on data using the '.prototxt', you save the weights/model parameters to a '.caffemodel' file.

Also, there is a difference between the '.prototxt' file used for training (which includes learning rate and regularization parameters) and the one used for testing/deployment, which does not include them.

Questions:

Is it correct that the '.prototxt' is the basis for training and that the '.caffemodel' is the result of training (the weights), obtained by using the '.prototxt' on the training data?
Is it correct that there is one '.prototxt' for training and one for testing, and that there are only slight differences between them (learning rate and regularization factors for training), but that the NN architecture (assuming you use neural nets) is the same?

Apologies for such basic questions and possibly some very incorrect assumptions. I am doing some online research, and the lines above summarize my understanding to date.

score 12 · Accepted Answer

Let's take a look at one of the examples provided with BVLC/caffe: bvlc_reference_caffenet.
You'll notice that there are actually 3 '.prototxt' files:

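train_val.prototxt: this file describes the net architecture for the training phase.
deploy.prototxt: this file describes the net architecture for test time ("deployment").
solver.prototxt: this file is very small and contains the "meta parameters" for training, for example the learning rate policy, regularization, etc.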
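To give a feel for how small a solver file is, here is a minimal Python sketch that renders one as text. The field names (`net`, `base_lr`, `lr_policy`, `weight_decay`, and so on) are real solver fields, but the values, the paths, and the `render_solver` helper are only illustrative assumptions and are not copied from the bvlc_reference_caffenet example.

```python
def render_solver(params):
    """Render a dict of solver settings as prototxt-style `key: value` lines."""
    lines = []
    for key, value in params.items():
        if isinstance(value, str) and key != "solver_mode":
            # String fields (paths, policy names) are quoted in prototxt.
            lines.append(f'{key}: "{value}"')
        else:
            # Numbers and enum values such as GPU/CPU are written bare.
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


# Illustrative values only; tune them for your own problem.
solver_text = render_solver({
    "net": "train_val.prototxt",          # hypothetical path to the training net
    "base_lr": 0.01,                      # initial learning rate
    "lr_policy": "step",                  # learning rate policy
    "gamma": 0.1,                         # factor applied at each step
    "stepsize": 100000,                   # iterations between steps
    "momentum": 0.9,
    "weight_decay": 0.0005,               # regularization strength
    "max_iter": 450000,
    "snapshot": 10000,                    # save weights every N iterations
    "snapshot_prefix": "snapshots/net",   # hypothetical output prefix
    "solver_mode": "GPU",
})
print(solver_text)
```

Note that the solver only points at the net definition through `net`; the architecture itself lives in train_val.prototxt.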

The net architectures represented by train_val.prototxt and deploy.prototxt should be mostly similar. There are a few main differences between the two:

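Input data: during training one usually uses a predefined set of inputs for training/validation. Therefore, train_val usually contains an explicit input layer, e.g., an "HDF5Data" layer or a "Data" layer. On the other hand, deploy usually does not know in advance what inputs it will get, so it only contains a statement: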
```
input: "data"
input_shape {
  dim: 10
  dim: 3
  dim: 227
  dim: 227
}
```
that declares what input the net expects and what its dimensions should be.
Alternatively, one can put an "Input" layer:
```
layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param { shape { dim: 10 dim: 3 dim: 227 dim: 227 } }
}
```
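Input labels: during training we supply the net with the "ground truth" expected outputs. This information is obviously not available during deploy.
Loss layers: during training one must define a loss layer. This layer tells the solver in what direction it should tune the parameters at each iteration. The loss compares the net's current prediction to the expected "ground truth". The gradient of the loss is back-propagated to the rest of the net, and this is what drives the learning process. During deploy there is no loss and no back-propagation.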
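To make the role of the loss concrete, here is a small plain-Python sketch, assuming a softmax cross-entropy loss (the kind of loss Caffe's "SoftmaxWithLoss" layer computes). Caffe implements its layers in C++, so this is only an illustration of the idea; the function names are made up for the example.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (shifted for numerical stability)."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_loss_and_grad(scores, label):
    """Compare the prediction with the ground-truth label.

    Returns the loss and its gradient with respect to the scores; this
    gradient is what would be back-propagated into the rest of the net.
    """
    probs = softmax(scores)
    loss = -math.log(probs[label])
    # For softmax cross-entropy the gradient is (probabilities - one_hot(label)).
    grad = [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]
    return loss, grad


# Three class scores from a hypothetical net; the ground-truth class is 0.
loss, grad = softmax_loss_and_grad([2.0, 0.5, -1.0], label=0)
print(loss, grad)
```

The gradient entry for the correct class is negative and the others are positive, so a gradient-descent step raises the score of the correct class and lowers the rest. At deploy time only the forward pass that produces the scores is needed.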

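In Caffe, you supply a train_val.prototxt describing the net, the train/val datasets, and the loss. In addition, you supply a solver.prototxt describing the meta parameters for training. The output of the training process is a .caffemodel binary file containing the trained parameters of the net.
Once the net has been trained, you can use the deploy.prototxt together with the .caffemodel parameters to predict outputs for new and unseen inputs.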
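Since the question is about Python, here is a rough sketch of how the two steps might look with pycaffe. It uses `caffe.get_solver`, `solver.solve()`, `caffe.Net(...)`, and `net.forward()`; the file names are placeholders, the input blob is assumed to be called "data" as in the deploy snippet above, and the `caffe_module` argument is a hypothetical convenience that lets the functions be exercised without a Caffe installation.

```python
def train(solver_path, caffe_module=None):
    """Run training as configured by a solver file and return the solver."""
    if caffe_module is None:
        import caffe as caffe_module  # requires pycaffe to be installed
    solver = caffe_module.get_solver(solver_path)
    # Iterates up to max_iter; weights are saved per the solver's snapshot settings.
    solver.solve()
    return solver


def predict(deploy_path, weights_path, batch, caffe_module=None):
    """Load the deploy net with trained weights and run one forward pass."""
    if caffe_module is None:
        import caffe as caffe_module  # requires pycaffe to be installed
    net = caffe_module.Net(deploy_path, weights_path, caffe_module.TEST)
    # `batch` must match the input shape declared in deploy.prototxt,
    # e.g. a float array of shape (10, 3, 227, 227).
    net.blobs["data"].data[...] = batch
    return net.forward()  # dict mapping output blob names to arrays


# Example usage (placeholder file names, needs a working Caffe build):
#   train("solver.prototxt")
#   outputs = predict("deploy.prototxt", "net_iter_10000.caffemodel", batch)
```

Note that `train` only ever sees the solver file, which in turn references train_val.prototxt, while `predict` only needs deploy.prototxt and the .caffemodel.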

score 0 · Accepted Answer

Yes, but there are different types of .prototxt files, for example

https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_train_test.prototxt

This one is used for both training and testing the network.

For command-line training, you can use a solver file, which is also a .prototxt file, for example

https://github.com/BVLC/caffe/blob/master/examples/mnist/lenet_solver.prototxt
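As a small illustration, the command-line call can be assembled from Python like this. It assumes the `caffe` binary is on your PATH (in a source build it typically lives under the build tools directory), and the `caffe_train_command` helper is made up for this sketch.

```python
def caffe_train_command(solver_path):
    """Build the argument list for `caffe train` with the given solver file."""
    return ["caffe", "train", f"--solver={solver_path}"]


cmd = caffe_train_command("examples/mnist/lenet_solver.prototxt")
print(" ".join(cmd))
# prints: caffe train --solver=examples/mnist/lenet_solver.prototxt

# To actually launch training (needs a working Caffe build and the MNIST data):
#   import subprocess
#   subprocess.run(cmd, check=True)
```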
