
I have been using BERT large uncased for binary text classification, training the model on Google Colab. To save the estimator, I used the following serving input function:

def serving_input_receiver_fn():
  with tf.variable_scope("foo"):
    feature_spec = {
      "input_ids": tf.FixedLenFeature([128], tf.int64),
      "input_mask": tf.FixedLenFeature([128], tf.int64),
      "segment_ids": tf.FixedLenFeature([128], tf.int64),
      "label_ids": tf.FixedLenFeature([], tf.int64),
    }
    serialized_tf_example = tf.placeholder(dtype=tf.string, shape=None,
                                       name='input_example_tensor')

    receiver_tensors = {'examples': serialized_tf_example}
    features = tf.parse_example(serialized_tf_example, feature_spec)
    return tf.estimator.export.ServingInputReceiver(features, receiver_tensors)

and saved the estimator as follows:

estimator._export_to_tpu = False
estimator.export_saved_model(export_dir_base = "/bert_0.3/",serving_input_receiver_fn = serving_input_receiver_fn)

This saves the estimator, but when I try to test it with saved_model_cli, it does not work. It throws errors such as:

ValueError: Type <class 'bytes'> for value b'\n\x12\n\x10\n\x08sentence\x12\x04\n\x02\n\x00' is not supported for tf.train.Feature.


The command is:

saved_model_cli run --dir '/bert_0.3/1564572852' --tag_set serve --signature_def serving_default --input_examples '"examples"=[{"input_ids":[b"\n\x12\n\x10\n\x08sentence\x12\x04\n\x02\n\x00"],"input_mask":[b"\n-\n+\n\x08sentence\x12\x1f\n\x1d\n\x1bThis API is a little tricky"],"segment_ids":[None],"label_ids":["label_1"]}]'


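It does not accept a string directly, which is why I split the input into a dict manually. After converting it to a dict, I realized that it only accepts bytes_list, so I converted the strings to bytes.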


Stack trace:

  File "/usr/local/bin/saved_model_cli", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/tools/saved_model_cli.py", line 990, in main
    args.func(args)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/tools/saved_model_cli.py", line 724, in run
    init_tpu=args.init_tpu, tf_debug=args.tf_debug)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/tools/saved_model_cli.py", line 420, in run_saved_model_with_feed_dict
    loader.load(sess, tag_set.split(','), saved_model_dir)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/saved_model/loader_impl.py", line 269, in load
    return loader.load(sess, tags, import_scope, **saver_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/saved_model/loader_impl.py", line 423, in load
    self.restore_variables(sess, saver, import_scope)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/saved_model/loader_impl.py", line 377, in restore_variables
    saver.restore(sess, self._variables_path)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/training/saver.py", line 1290, in restore
    {self.saver_def.filename_tensor_name: save_path})
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/client/session.py", line 956, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/client/session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/client/session.py", line 1359, in _do_run
    run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow_core/python/client/session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
     [[node save_2/RestoreV2 (defined at /lib/python2.7/dist-packages/tensorflow_core/python/framework/ops.py:1748) ]]

I would appreciate it if anyone could tell me what is going wrong. I suspect the serving_input_receiver_fn may also be at fault.

The maximum sequence length is 128.

Thanks.

Edit 1

If I use tf.placeholder() instead of tf.FixedLenFeature:



1 Answer


You can inspect the saved model with the following command:

saved_model_cli show --all --dir <path to/bert_0.3/1564572852>

This shows the data type, shape, and name of each input and output.
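For the export in the question, the signature takes serialized tf.Example protos under the key examples, and the feature spec declares int64 features. The documented --input_examples format is <input_key>=[{feature_name: [values]}], so the values would need to be plain integer lists of the right length rather than bytes or None. Below is a minimal pure-Python sketch of building such an argument. The helper names pad_to_length and build_input_examples_arg are hypothetical, and the token ids are placeholder values; real ids would come from the BERT tokenizer.

```python
MAX_SEQ_LENGTH = 128  # matches FixedLenFeature([128], ...) in the question


def pad_to_length(values, length=MAX_SEQ_LENGTH, pad_value=0):
    """Right-pad (or truncate) a list of ints to exactly `length` items."""
    return (list(values) + [pad_value] * length)[:length]


def build_input_examples_arg(token_ids, label_id=0, input_key="examples"):
    """Format one example as an --input_examples value made of plain ints."""
    example = {
        "input_ids": pad_to_length(token_ids),
        # 1 marks a real token, 0 marks padding.
        "input_mask": pad_to_length([1] * len(token_ids)),
        # Single-sentence input: every position belongs to segment 0.
        "segment_ids": [0] * MAX_SEQ_LENGTH,
        # The feature spec declares label_ids without a default value,
        # so a dummy label is still required at parse time.
        "label_ids": [label_id],
    }
    return "{}=[{!r}]".format(input_key, example)


# Hypothetical token ids, for illustration only.
arg = build_input_examples_arg([101, 2023, 102])
print(arg[:60] + "...")
```

The resulting string can then be passed in double quotes on the shell, for example --input_examples "$ARG" after assigning it to a shell variable.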

Try using tf.placeholder() instead of tf.FixedLenFeature in serving_input_receiver_fn(), as shown below:

input_ids = tf.placeholder(tf.int32, [None, 128], name='input_ids')
input_mask = tf.placeholder(tf.int32, [None, 128], name='input_mask')
segment_ids = tf.placeholder(tf.int32, [None, 128], name='segment_ids')
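Put together, a complete serving function built on these placeholders might look like the sketch below. It is written against tf.compat.v1.placeholder so that it also builds in graph mode on TensorFlow 2.x; on TensorFlow 1.x this is the same op as tf.placeholder. The helper build_bert_placeholders and the include_label_ids flag are hypothetical names. The flag rests on the assumption that your model_fn reads features["label_ids"] even in predict mode, as the reference BERT run_classifier.py does; if yours does not, leave it off.

```python
import tensorflow as tf

MAX_SEQ_LENGTH = 128  # maximum sequence length stated in the question


def build_bert_placeholders(max_seq_length=MAX_SEQ_LENGTH,
                            include_label_ids=False):
    """Create one int32 placeholder per BERT input, batch dimension open."""
    names = ("input_ids", "input_mask", "segment_ids")
    features = {
        name: tf.compat.v1.placeholder(
            tf.int32, [None, max_seq_length], name=name)
        for name in names
    }
    if include_label_ids:
        # Assumption: only needed when model_fn reads features["label_ids"].
        features["label_ids"] = tf.compat.v1.placeholder(
            tf.int32, [None], name="label_ids")
    return features


def serving_input_receiver_fn():
    """Serve raw tensors: the placeholders are both receivers and features."""
    features = build_bert_placeholders()
    return tf.estimator.export.ServingInputReceiver(features, features)
```

With an export like this, the signature takes raw integer tensors instead of serialized examples, so saved_model_cli run would be fed through --inputs or --input_exprs (for instance an np.zeros((1, 128), dtype=np.int32) expression per input) rather than --input_examples.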

If you still get an error when using the saved model, please share a screenshot of the error.

For more details, see the following GitHub repository: https://github.com/bigboNed3/bert_serving/tree/44d33920da6888cf91cb72e6c7b27c7b0c7d8815

Hope this helps. Thanks.

Answered 2019-11-07T08:25:14.683