
I have a working TF installation, and slim itself works fine.

However, my application crashes when I try to run a slim training loop.

Minimal code:

import tensorflow as tf
import tensorflow.contrib.slim as slim


# Load data.
...


graph = tf.Graph()
with graph.as_default():

    # Build model
    ...

    # Add losses
    ...

    # Create training operation and start the actual training loop.
    train_op = ...

    # Start training loop

    slim.learning.train(
        train_op,
        logdir=FLAGS.logdir,
        save_summaries_secs=FLAGS.save_summaries_secs,
        save_interval_secs=FLAGS.save_interval_secs,
        master=FLAGS.master,
        is_chief=(FLAGS.task == 0),
        startup_delay_steps=(FLAGS.task * 20),
        log_every_n_steps=FLAGS.log_every_n_steps)

When I run it, I get:

E tensorflow/core/common_runtime/session.cc:69] Not found: No session factory registered for the given session options: {target: "local" config: } Registered factories are {DIRECT_SESSION, GRPC_SESSION}.
Traceback (most recent call last):
File "tensorflow/tensorflow/contrib/my_package/python/my_package/train.py", line 467, in <module>
    app.run()
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
File "tensorflow/tensorflow/contrib/my_package/python/my_package/train.py", line 462, in main
    log_every_n_steps=FLAGS.log_every_n_steps)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 776, in train
    master, start_standard_services=False, config=session_config) as sess:
File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 973, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 801, in stop
    stop_grace_period_secs=self._stop_grace_secs)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 386, in join
    six.reraise(*self._exc_info_to_raise)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 962, in managed_session
    start_standard_services=start_standard_services)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 719, in prepare_or_wait_for_session
    init_feed_dict=self._init_feed_dict, init_fn=self._init_fn)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/training/session_manager.py", line 256, in prepare_session
    config=config)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/training/session_manager.py", line 161, in _restore_checkpoint
    sess = session.Session(self._target, graph=self._graph, config=config)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1187, in __init__
    super(Session, self).__init__(target, graph, config=config)
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 552, in __init__
    self._session = tf_session.TF_NewDeprecatedSession(opts, status)
File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
File "$HOME/.local/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 469, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.NotFoundError: No session factory registered for the given session options: {target: "local" config: } Registered factories are {DIRECT_SESSION, GRPC_SESSION}.

By contrast, the same model trains fine when train_op is run "manually":

with tf.Session(graph=graph) as sess:
    tf.global_variables_initializer().run()

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    for step in xrange(FLAGS.max_steps):
        _, summaries = sess.run([train_op, summary_op])
        ...
    coord.request_stop()
    coord.join(threads)

Does anyone know where to start debugging?

Thank you, Philip

1 Answer

It looks like this line is causing the problem:

    master=FLAGS.master,

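From the error message, it looks like Slim is trying to create the session as sess = tf.Session("local"), which is not a valid session target. Try passing the flag --master="" when running your script, or explicitly setting master="" when calling slim.learning.train().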
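
If the flag value cannot easily be changed everywhere the script is launched, one option is to sanitize it before it reaches slim.learning.train(). The sketch below is only an illustration: normalize_master is a hypothetical helper, not part of TensorFlow or Slim, and treating "local" as "use the in-process session" is an assumption about what the flag was meant to do. It relies on the fact that an empty target selects the in-process session and a grpc:// target selects a distributed one.

```python
def normalize_master(master):
    """Map a --master flag value to a session target string.

    Hypothetical helper (not a TensorFlow API). Assumes that "local"
    was meant to select the in-process session, which tf.Session
    expects to be spelled as the empty string.
    """
    if master is None or master in ("", "local"):
        # Empty target: run in-process (the direct session).
        return ""
    if master.startswith("grpc://"):
        # Distributed target, e.g. "grpc://host:port".
        return master
    raise ValueError("Unsupported session target: %r" % (master,))


# Intended use in the training script from the question (not run here):
#
#     slim.learning.train(
#         train_op,
#         logdir=FLAGS.logdir,
#         master=normalize_master(FLAGS.master),
#         ...)
```

Failing early with a ValueError for any other value gives a shorter message than the NotFoundError raised deep inside the Supervisor.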

answered 2017-01-23T15:40:30.323