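I have successfully trained my https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/census model/experiment both locally and in the cloud. I am also able to deploy my sample in the cloud and run predictions.
But how do I run my predictions locally, rather than in the cloud?
I am a newbie. I have tried a few naive approaches, and all of them failed; see the 3 specific approaches below.
Any hints or reference snippets are welcome.
:-)
M.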
** Update to the original post regarding approach #1 **
If I include the single line
c = tf.contrib.learn.DNNLinearCombinedClassifier(model_dir=job_dir)
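I get an error; see error #a below.
If I naively edit the call to include the missing arguments, the constructor works, but a subsequent call to predict fails with error #b; see below. I made wide_columns and deep_columns in model.py global and changed the line above to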
c = tf.contrib.learn.DNNLinearCombinedClassifier(model_dir=job_dir, linear_feature_columns=model.wide_columns, dnn_feature_columns=model.deep_columns)
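My PyCharm debugger confirms that model.wide_columns and model.deep_columns are instantiated and non-empty at the time of the call.
This now results in an "empty" classifier. I do not believe DNNLinearCombinedClassifier picks up any model content from my job_dir. I would include screenshots of inspecting the classifier, both as instantiated in model.py build_estimator() (where I also assigned it to a variable c and set a breakpoint) and the c above in task.py, but github does not allow me to, due to my lack of reputation. The difference is clear, though: for example, c->params->dnn_hidden_units is empty for the restored classifier, whereas the original classifier was instantiated with [100, 70, 48, 34].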
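Error #b points at dnn_hidden_units being absent from the constructor call. A likely reading is that the estimator rebuilds its graph from the constructor arguments and takes only variable values from the checkpoint in model_dir, so the hyperparameters have to be passed again. The following is a minimal sketch under that assumption, not the sample's own code: the helper name build_classifier_kwargs is hypothetical, and the default hidden units are simply the values observed on the original classifier above.

```python
def build_classifier_kwargs(model_dir, wide_columns, deep_columns,
                            hidden_units=(100, 70, 48, 34)):
    """Collect constructor arguments shared by training and local prediction.

    Hypothetical helper: keeping the arguments in one place makes it harder
    for the training-time and prediction-time classifiers to drift apart.
    """
    if deep_columns and not hidden_units:
        # Mirrors the condition reported in error #b.
        raise ValueError(
            "hidden_units must be given when deep_columns is non-empty")
    return {
        "model_dir": model_dir,
        "linear_feature_columns": list(wide_columns),
        "dnn_feature_columns": list(deep_columns),
        "dnn_hidden_units": list(hidden_units),
    }
```

With TensorFlow 1.x this could be used as c = tf.contrib.learn.DNNLinearCombinedClassifier(**build_classifier_kwargs(job_dir, model.wide_columns, model.deep_columns)), provided the columns match the ones used for training.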
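I include an ls -R of job_dir (named output); see #c below.
I run rm -rf output before each run, so job_dir is clean.
Obviously I am making a mistake somewhere, but for lack of insight I cannot see where. Any further suggestions are appreciated.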
:-)
M.
---------------------- Console output (update) ----------------------
a.
Starting Census: Please lauch tensorboard to see results:
tensorboard --logdir=$MODEL_DIR
2017-05-30 12:14:10.570030: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 12:14:10.570042: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 12:14:10.570046: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
File "<..>/trainer/task.py", line 199, in <module>
c = tf.contrib.learn.DNNLinearCombinedClassifier(model_dir=job_dir)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 335, in new_func
return func(*args, **kwargs)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py", line 597, in __init__
raise ValueError("Either linear_feature_columns or dnn_feature_columns "
ValueError: Either linear_feature_columns or dnn_feature_columns must be defined.
Process finished with exit code 1
b.
Starting Census: Please lauch tensorboard to see results:
tensorboard --logdir=$MODEL_DIR
2017-05-30 12:31:47.967638: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 12:31:47.967650: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-05-30 12:31:47.967653: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
Traceback (most recent call last):
File "<..>/repository/git/13cx/subject-matter/google-cloud/1705cloudml/170530local-save/trainer/task.py", line 206, in <module>
p = c.predict(input_fn=eval2_input_fn)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 335, in new_func
return func(*args, **kwargs)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 335, in new_func
return func(*args, **kwargs)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py", line 660, in predict
as_iterable=as_iterable)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 335, in new_func
return func(*args, **kwargs)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py", line 695, in predict_classes
as_iterable=as_iterable)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/python/util/deprecation.py", line 281, in new_func
return func(*args, **kwargs)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 565, in predict
as_iterable=as_iterable)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 857, in _infer_model
infer_ops = self._get_predict_ops(features)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1188, in _get_predict_ops
return self._call_model_fn(features, labels, model_fn_lib.ModeKeys.INFER)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/estimator.py", line 1103, in _call_model_fn
model_fn_results = self._model_fn(features, labels, **kwargs)
File "<..>/.local/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py", line 201, in _dnn_linear_combined_model_fn
"dnn_hidden_units must be defined when dnn_feature_columns is "
ValueError: dnn_hidden_units must be defined when dnn_feature_columns is specified.
Process finished with exit code 1
c.
$ ls -R output/
output/:
checkpoint graph.pbtxt model.ckpt-2.data-00000-of-00001
eval model.ckpt-1000.data-00000-of-00001 model.ckpt-2.index
events.out.tfevents.1496140978.yarc-mainlinux model.ckpt-1000.index model.ckpt-2.meta
export model.ckpt-1000.meta
output/eval:
events.out.tfevents.1496140982.yarc-mainlinux events.out.tfevents.1496140987.yarc-mainlinux
output/export:
Servo
output/export/Servo:
1496140989
output/export/Servo/1496140989:
saved_model.pb variables
output/export/Servo/1496140989/variables:
variables.data-00000-of-00001 variables.index
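Listing #c shows that training also wrote a SavedModel (saved_model.pb plus variables) into a timestamped directory under output/export/Servo. A small helper for locating the newest such directory might look like the sketch below. It assumes the export subdirectories are named with purely numeric timestamps, as in the listing; the function name latest_export_dir is hypothetical.

```python
import os


def latest_export_dir(export_base):
    """Return the path of the newest timestamped export under export_base."""
    candidates = [
        name for name in os.listdir(export_base)
        # Keep only directories whose names are purely numeric timestamps.
        if name.isdigit() and os.path.isdir(os.path.join(export_base, name))
    ]
    if not candidates:
        raise IOError("no timestamped export found under %s" % export_base)
    # Compare numerically so that timestamps of different lengths sort correctly.
    return os.path.join(export_base, max(candidates, key=int))
```

For the layout above, latest_export_dir('output/export/Servo') would return the 1496140989 directory, which could then be handed to a SavedModel loader (for example tf.saved_model.loader.load in TensorFlow 1.x) for local prediction.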
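----------** Original post **----------
-------- Things I have tried ------------
See the code at the bottom, referenced as 1, 2, 3.
1. Re-instantiate DNNLinearCombinedClassifier with the model_dir argument pointing to where the model is stored. The plan was to call the classifier's predict method. I cannot get the classifier to reflect the saved model.
2. Restore the model via saver.restore(). This works, but I do not understand how to proceed from there, presumably because of my limited understanding of TensorFlow.
3. Produce some test data for use with approach 1. Evaluating the tensor never returns. How can I evaluate an input batch so that I can view it as a matrix?
--------- Accompanying code -----------------
(This code is simply appended to the end of trainer/task.py)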
# last original line from task.py:
learn_runner.run(generate_experiment_fn(**arguments), job_dir)
# my stuff:
# 1. restore the classifier from model dir, fails
# c = tf.contrib.learn.DNNLinearCombinedClassifier(model_dir=job_dir)
# 2. restore model, works ok, but then how?
sess = tf.Session()
saver = tf.train.import_meta_graph('output/model.ckpt-1000.meta')
saver.restore(sess, tf.train.latest_checkpoint('./output/'))
# NOTE: tf.global_variables_initializer() is deliberately not run here; running it after restore() would overwrite the restored values.
print("Sanity check, a variable instance {}".format(
    sess.run('dnn/input_from_feature_columns/education_embedding/weights/part_0:0')))
sess.close()
# 3. produce some test input (we're for simplicity reusing the eval set), apparently works, but an evaluation hangs forever
eval2_input_fn = model.generate_input_fn(
    arguments['eval_files'],
    batch_size=arguments['eval_batch_size'],
    shuffle=False
)
# 3a. inspecting some input, the evaluation never ends.
input = eval2_input_fn()
print("input: {}".format(input))
with tf.Session() as sess:
    evalinput = input[1].eval()
    print("evalinput: {}".format(evalinput))
print("\nDone")