我的理解是 host_call 和 host_call_fn() 将统计信息从 TPU 传输到主机。但是,关于如何为任何非标量生成摘要的说明不是很清楚。
例如,我尝试修改官方的 mnist_tpu.py 以生成训练期间产生的梯度的摘要。model_fn() 是添加更改的地方:
...
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
if FLAGS.use_tpu:
optimizer = tf.contrib.tpu.CrossShardOptimizer(optimizer)
grads = optimizer.compute_gradients(loss)
train_op = optimizer.apply_gradients(grads, global_step)
if not FLAGS.skip_host_call:
def host_call_fn(gs, loss, lr, grads):
gs = gs[0]
with summary.create_file_write(FLAGS.model_dir).as_default():
summary.scalar('loss', loss[0], step=gs)
summary.scalar('learning_rate', lr[0], step=gs)
for index, grad in enumerate(grads):
summary.histogram('{}-grad'.format(grads[index][1].name),
grads[index])
return summary.all_summary_ops()
gs_t = tf.reshape(global_step, [1])
loss_t = tf.reshape(loss, [1])
lr_t = tf.reshape(learning_rate, [1])
grads_t = grads
host_call = (host_call_fn, [gs_t, loss_t, lr_t, grads_t])
return tf.contrib.tpu.TPUEstimatorSpec(
mode=mode,
loss=loss,
train_op=train_op
)
....
不幸的是,上面的添加似乎并没有像在基于 CPU 的训练期间生成直方图那样发挥作用。知道如何在非标量张量上正确生成直方图吗?