I wrote the following code:

tf.contrib.slim.learning.train(
    ...
    ...
    session_wrapper=tensorflow.python.debug.LocalCLIDebugWrapperSession,
    ...)

When I run it, it reports:

......
......
2018-02-14 01:03:25.229477: I tensorflow/core/debug/debug_graph_utils.cc:229] For debugging, tfdbg is changing the parallel_iterations attribute of the Enter/RefEnter node "lstm/lstm_1/while/Enter_2" on device "/job:localhost/replica:0/task:0/device:CPU:0" from 32 to 1. (This does not affect subsequent non-debug runs.)
Traceback (most recent call last):
  File "train_getimageid_ngch.py", line 147, in <module>
    tf.app.run()
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train_getimageid_ngch.py", line 143, in main
    saver=saver)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 775, in train
    sv.stop(threads, close_summary_writer=True)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 296, in stop_on_exception
    yield
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 494, in run
    self.run_loop()
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 1068, in run_loop
    global_step=self._sv.global_step)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/training/saver.py", line 1549, in save
    global_step = training_util.global_step(sess, global_step)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/training/training_util.py", line 67, in global_step
    return int(sess.run(global_step_tensor))
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/debug/wrappers/framework.py", line 543, in run
    run_end_resp = self.on_run_end(run_end_req)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/debug/wrappers/local_cli_wrapper.py", line 321, in on_run_end
    self._dump_root, partition_graphs=partition_graphs)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/debug_data.py", line 495, in __init__
    self._load_all_device_dumps(partition_graphs, validate)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/debug_data.py", line 517, in _load_all_device_dumps
    self._load_partition_graphs(partition_graphs, validate)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/debug_data.py", line 798, in _load_partition_graphs
    self._validate_dump_with_graphs(debug_graph.device_name)
  File "/home/ngaimanchow/tensorflow_virtualenv/local/lib/python2.7/site-packages/tensorflow/python/debug/lib/debug_data.py", line 843, in _validate_dump_with_graphs
    "device %s." % (datum.node_name, device_name))
ValueError: Node name 'TFRecordReaderV2' is not found in partition graphs of device /job:localhost/replica:0/task:0/device:CPU:0.

Even in cases where the tf debugger does start, trying to print the value of a tensor (the pt command) also reports an error:

Error occurred during handling of command: "print_tensor" exception. KeyError /device:CPU:0

Can tf_debug actually support slim? If so, how do I fix this? If not, is there another way to debug or print the values of tensors when using tf.contrib.slim.learning.train?
