
I have been doing research on face alignment (facial landmark detection) and want to build further work on the open-source Mnemonic Descent Method. Starting from its code, I modified how the samples are loaded, but a few other problems have genuinely confused me for a while.

First, the model is as follows:

  patches = _extract_patches_module.extract_patches(images, tf.constant(patch_shape), inits+dx)
  patches = tf.stop_gradient(patches)
  patches = tf.reshape(patches, (batch_size, num_patches * patch_shape[0], patch_shape[1], num_channels))
  endpoints['patches'] = patches

  with tf.variable_scope('convnet', reuse=step>0):
      net = conv_model(patches)
      ims = net['concat']

  ims = tf.reshape(ims, (batch_size, -1))

  with tf.variable_scope('rnn', reuse=step>0) as scope:
      hidden_state = slim.ops.fc(tf.concat(1, [ims, hidden_state]), 512, activation=tf.tanh)
      prediction = slim.ops.fc(hidden_state, num_patches * 2, scope='pred', activation=None)
      endpoints['prediction'] = prediction

The conv_model is:

  with tf.op_scope([inputs], scope, 'mdm_conv'):
    with scopes.arg_scope([ops.conv2d, ops.fc], is_training=is_training):
      with scopes.arg_scope([ops.conv2d], activation=tf.nn.relu, padding='VALID'):
        net['conv_1'] = ops.conv2d(inputs, 32, [3, 3], scope='conv_1')
        net['pool_1'] = ops.max_pool(net['conv_1'], [2, 2])
        net['conv_2'] = ops.conv2d(net['pool_1'], 32, [3, 3], scope='conv_2')
        net['pool_2'] = ops.max_pool(net['conv_2'], [2, 2])

        crop_size = net['pool_2'].get_shape().as_list()[1:3]
        net['conv_2_cropped'] = utils.get_central_crop(net['conv_2'], box=crop_size)
        net['concat'] = tf.concat(3, [net['conv_2_cropped'], net['pool_2']])
        return net

The initial learning rate is 1e-3 and the batch size is 60.

The first problem is that during training the loss stays almost unchanged, i.e. it generally does not decrease, even after more than 10,000 steps. It looks like this:

  2017-06-01 19:46:01.120850: step 3060, loss = 0.8852 (15.7 examples/sec; 3.830 sec/batch)
  2017-06-01 19:46:37.776494: step 3070, loss = 0.7375 (18.2 examples/sec; 3.291 sec/batch)
  2017-06-01 19:47:09.242257: step 3080, loss = 0.8160 (16.5 examples/sec; 3.635 sec/batch)
  2017-06-01 19:47:46.441860: step 3090, loss = 0.7973 (17.1 examples/sec; 3.501 sec/batch)
  2017-06-01 19:48:19.793012: step 3100, loss = 0.7228 (18.2 examples/sec; 3.292 sec/batch)
  2017-06-01 19:48:56.614480: step 3110, loss = 0.8687 (21.8 examples/sec; 2.750 sec/batch)
  2017-06-01 19:49:29.904451: step 3120, loss = 0.8662 (19.8 examples/sec; 3.024 sec/batch)
  2017-06-01 19:50:06.186441: step 3130, loss = 0.7927 (22.7 examples/sec; 2.648 sec/batch)
  2017-06-01 19:50:40.794964: step 3140, loss = 0.7585 (16.2 examples/sec; 3.711 sec/batch)
  2017-06-01 19:51:18.612637: step 3150, loss = 0.8264 (17.9 examples/sec; 3.348 sec/batch)
  2017-06-01 19:51:52.905742: step 3160, loss = 0.7504 (17.2 examples/sec; 3.498 sec/batch)
  2017-06-01 19:52:29.895365: step 3170, loss = 0.7569 (16.6 examples/sec; 3.615 sec/batch)
  2017-06-01 19:53:03.509374: step 3180, loss = 0.6869 (16.3 examples/sec; 3.692 sec/batch)
  2017-06-01 19:53:40.798535: step 3190, loss = 0.7592 (18.9 examples/sec; 3.180 sec/batch)
  2017-06-01 19:54:14.063566: step 3200, loss = 0.7689 (19.1 examples/sec; 3.136 sec/batch)
  2017-06-01 19:54:50.741630: step 3210, loss = 0.7345 (19.7 examples/sec; 3.040 sec/batch)
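To make "almost unchanged" concrete, one can fit a straight line to the logged loss values; a near-zero slope confirms the loss is flat over this window (the values below are copied from the log above; NumPy is only used for the fit):

```python
import numpy as np

# Loss values copied from the training log above (steps 3060-3210).
steps = np.arange(3060, 3220, 10)
losses = np.array([0.8852, 0.7375, 0.8160, 0.7973, 0.7228, 0.8687,
                   0.8662, 0.7927, 0.7585, 0.8264, 0.7504, 0.7569,
                   0.6869, 0.7592, 0.7689, 0.7345])

# Least-squares slope of loss vs. step.
slope = np.polyfit(steps, losses, 1)[0]
print(slope)  # roughly -6e-4 per step, i.e. essentially flat
```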

The loss function is:

def normalized_rmse(pred, gt_truth):
    # Landmarks 36 and 45 are the outer eye corners in the 68-point
    # annotation scheme, so `norm` is the inter-ocular distance.
    norm = tf.sqrt(1e-12 + tf.reduce_sum(((gt_truth[:, 36, :] - gt_truth[:, 45, :])**2), 1))

    # Mean point-to-point error, normalized by the inter-ocular distance.
    return tf.reduce_sum(tf.sqrt(1e-12 + tf.reduce_sum(tf.square(pred - gt_truth), 2)), 1) / (norm * 68)
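For reference, the same loss can be re-implemented in plain NumPy to sanity-check its scale on toy data (`normalized_rmse_np` is my illustrative re-implementation, not part of the MDM code):

```python
import numpy as np

def normalized_rmse_np(pred, gt):
    # Inter-ocular distance: landmarks 36 and 45 are the outer
    # eye corners in the 68-point annotation scheme.
    norm = np.sqrt(np.sum((gt[:, 36, :] - gt[:, 45, :]) ** 2, axis=1))
    # Mean point-to-point error, normalized per face.
    return np.sum(np.sqrt(np.sum((pred - gt) ** 2, axis=2)), axis=1) / (norm * 68)

# Sanity check: shifting every landmark by half the inter-ocular
# distance should give a normalized error of exactly 0.5.
gt = np.zeros((1, 68, 2))
gt[0, 36] = [0.0, 0.0]
gt[0, 45] = [1.0, 0.0]          # inter-ocular distance = 1
pred = gt + np.array([0.5, 0.0])
print(normalized_rmse_np(pred, gt))  # [0.5]
```

So a loss stuck around 0.7-0.9 means the average landmark error is most of an inter-ocular distance, i.e. the predictions are far from the ground truth.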

Actually, there are 3,000+ images in the training set. After augmentation, each image derived from the same source image differs in some respects, so the samples in each batch are different.

However, when I train the model with only a single image, it converges after about 1,000 steps, i.e. the loss clearly decreases and approaches zero. This is what really confuses me now...

Then I used TensorBoard to visualize the results:

[Figure: loss curve]

[Figure: gradients of the weights and biases]

The results also show that the loss generally does not decrease. They also reveal the second problem: the gradients of the biases in the conv model change noticeably, while the gradients of the weights stay almost constant! Even in the single-image case, where the model converges after 1,000 steps, the corresponding weight gradients in the conv model still stay unchanged...
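One TensorFlow-independent way to check whether a parameter truly receives zero gradient is a finite-difference probe: perturb one entry of the parameter and see whether the loss moves at all. A minimal sketch of the idea on a toy quadratic loss (all names here are illustrative, not from the MDM code):

```python
import numpy as np

def finite_diff_grad(loss_fn, params, eps=1e-5):
    """Numerically estimate d(loss)/d(params), one entry at a time."""
    grad = np.zeros_like(params)
    for i in range(params.size):
        p_plus = params.copy();  p_plus.flat[i] += eps
        p_minus = params.copy(); p_minus.flat[i] -= eps
        grad.flat[i] = (loss_fn(p_plus) - loss_fn(p_minus)) / (2 * eps)
    return grad

# Toy example: loss = sum((w * x - y)^2) for fixed x, y.
x, y = np.array([1.0, 2.0]), np.array([2.0, 4.0])
loss = lambda w: np.sum((w * x - y) ** 2)
w = np.array([1.0, 1.0])
print(finite_diff_grad(loss, w))  # analytic gradient 2*x*(w*x - y) = [-2, -8]
```

If such a probe on the actual weights shows the loss does move, the forward path is fine and the flat gradients point to how the graph is wired (e.g. what is inside the `tf.stop_gradient` boundary); if the loss does not move, the weights are genuinely disconnected from the loss.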

I am new to TensorFlow. I have tried my best to solve these problems but failed in the end... so I sincerely hope you can help me. Many thanks!

