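I'm training a model with TensorFlow on Google Cloud's AI Platform. Training itself runs fine, but I can't save the finished model to my Cloud Storage bucket in SavedModel format. I know the bucket is set up correctly, because I download my training data from the same bucket at the start of training. This is the code I use to save the model: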
import os

SAVE_PATH = os.path.join("gs://", 'machine-learning-ebay', 'job-dir')
linear_model.save(SAVE_PATH)
Here "machine-learning-ebay" is the bucket and "job-dir" is a folder in that bucket.
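For reference, here is a minimal sketch of what that path should evaluate to. It assumes POSIX path semantics (the training container appears to be Linux, judging by the paths in the traceback), so it uses `posixpath.join` in place of `os.path.join` to give the same result on any machine. It only checks the string, not access to the bucket.

```python
import posixpath

# Assumption: the job runs on Linux, where os.path is posixpath.
# "gs://" already ends with "/", so join() does not add another separator.
SAVE_PATH = posixpath.join("gs://", "machine-learning-ebay", "job-dir")

print(SAVE_PATH)
```

So the path string itself looks well-formed, which suggests the problem is not in how the path is built.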
I get the following error on the job details page in Google Cloud:
Traceback (most recent call last):
[...]
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py", line 1219, in save
    file_prefix_tensor, object_graph_tensor, options)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/training/tracking/util.py", line 1164, in _save_cached_when_graph_building
    save_op = saver.save(file_prefix, options=options)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/training/saving/functional_saver.py", line 300, in save
    return save_fn()
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/training/saving/functional_saver.py", line 287, in save_fn
    sharded_prefixes, file_prefix, delete_old_dirs=True)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 504, in merge_v2_checkpoints
    delete_old_dirs=delete_old_dirs, name=name, ctx=_ctx)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 528, in merge_v2_checkpoints_eager_fallback
    attrs=_attrs, ctx=ctx, name=name)
  File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.NotFoundError: Error executing an HTTP request: HTTP response code 404 with body '{
"error": {
"code": 404,
"message": "No such object: machine-learning-ebay/job-dir/variables/variables_temp/part-00000-of-00001.data-00000-of-00001",
"errors": [
{
"message": "No such object: machine-learning-ebay/job-dir/variables/variables_temp/part-00000-of-00001.data-00000-of-00001",
"domain": "global",
"reason": "notFound"
}
]
}
}
Any help is greatly appreciated; the deadline for this project is today.