python - Translation model prediction: TypeError: Object of type 'EagerTensor' is not JSON serializable

Question

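I followed the translation colab notebook tutorial recommended by Google's tensor2tensor repository.

After exporting the model and uploading it to Google's AI Platform engine for online prediction, I am unable to make requests to the model.

I believe the input to the translation model is a tensor of the source text, but I get the error TypeError: Object of type 'EagerTensor' is not JSON serializable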


def encode(input_str, output_str=None):
  """Input str to features dict, ready for inference"""
  inputs = encoders["inputs"].encode(input_str) + [1]  # add EOS id
  batch_inputs = tf.reshape(inputs, [1, -1, 1])  # Make it 3D.
  return {"inputs": batch_inputs}

enfr_problem = problems.problem(PROBLEM)
encoders = enfr_problem.feature_encoders(DATA_DIR)

encoded_inputs = encode("Some text")
model_output = predict_json('project_name','model_name', encoded_inputs,'version_1')["outputs"]

I tried converting the tensor to numpy, but still no luck. Could someone point me in the right direction?

score 3 · Accepted Answer

The problem is that TensorFlow returns an EagerTensor when you do the following:

inputs = encoders["inputs"].encode(input_str) + [1]  # add EOS id
batch_inputs = tf.reshape(inputs, [1, -1, 1])

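An EagerTensor cannot be converted to JSON. Unfortunately, a numpy array (3D or otherwise) cannot be serialized to JSON either. However, a numpy array can easily be converted to a list. An example: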

import json
import numpy as np
import tensorflow as tf

a = np.array([1, 2, 3])
b = np.array([1, 2, 3])
c = tf.multiply(a, b)

print(c)  # -> tf.Tensor([1 4 9], shape=(3,), dtype=int64)
print(c.numpy())  # -> [1 4 9]
print(c.numpy().tolist())  # -> [1, 4, 9]

with open("example.json", "w") as f:
    # The next two calls are commented out because each one raises before anything is written:
    # json.dump(c, f)  # TypeError: Object of type EagerTensor is not JSON serializable
    # json.dump(c.numpy(), f)  # TypeError: Object of type ndarray is not JSON serializable
    json.dump(c.numpy().tolist(), f)  # works!

I cannot give an example for your exact case because your code snippet is not complete enough. But

return {"inputs": batch_inputs.numpy().tolist()}

should do the job.
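Since the goal is only a JSON payload, the tensor can also be skipped altogether. The following is a minimal sketch, not the tutorial's own method: `encode_as_list` and the token ids are hypothetical stand-ins for the output of `encoders["inputs"].encode(...)`, and the nested lists are assumed to be an acceptable substitute for the `[1, -1, 1]` reshape.

```python
import json

def encode_as_list(token_ids, eos_id=1):
    """Return a JSON-serializable features dict shaped [1, length, 1]."""
    ids = list(token_ids) + [eos_id]  # append the EOS id, as in the question
    # Nested lists mirror tf.reshape(inputs, [1, -1, 1]): batch x length x 1.
    return {"inputs": [[[int(i)] for i in ids]]}

# Hypothetical token ids standing in for encoders["inputs"].encode("Some text").
features = encode_as_list([72, 105, 33])
payload = json.dumps(features)
print(payload)  # -> {"inputs": [[[72], [105], [33], [1]]]}
```

Because the result contains only plain Python lists and ints, `json.dumps` accepts it without any conversion step.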

score 1 · Accepted Answer

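If you want to save the tensor data in a dict to a JSON file, a simple solution is to recurse through your dictionary and use the right function to convert your data into something JSON serializable (for example a string, if it is only for saving strings). If that is what you really want to do (i.e. save your data), I am sure TensorFlow must have a way to save your data as a pickle file.

The following code recursively converts the contents of a dict to strings, but you should be able to modify it easily for your use case (numpify, jsonify, etc.). My use case was saving data in a human-readable format (rather than just torch.save):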

#%%

def _to_json_dict_with_strings(dictionary):
    """
    Convert a dict to a dict whose leaves are all strings. It recursively converts
    every value that is not itself a dictionary to a string.

    Use case:
        - saving a dictionary of tensors (convert the tensors to strings!)
        - saving arguments from a script (e.g. argparse) so they print nicely
    """
    # Base case: anything that is not a dict is a leaf and becomes its str() form.
    if not isinstance(dictionary, dict):
        return str(dictionary)
    d = {k: _to_json_dict_with_strings(v) for k, v in dictionary.items()}
    return d

def to_json(dic):
    # Accept either a dict or an object with attributes (e.g. an argparse.Namespace).
    if isinstance(dic, dict):
        dic = dict(dic)
    else:
        dic = dic.__dict__
    return _to_json_dict_with_strings(dic)

def save_to_json_pretty(dic, path, mode='w', indent=4, sort_keys=True):
    import json

    with open(path, mode) as f:
        json.dump(to_json(dic), f, indent=indent, sort_keys=sort_keys)

def my_pprint(dic):
    """
    Pretty-print a (possibly nested) dict after converting its leaves to strings.

    Note: this is not the same as pprint.
    """
    import json

    # make all leaf values strings recursively with their native str function
    dic = to_json(dic)
    # pretty print
    pretty_dic = json.dumps(dic, indent=4, sort_keys=True)
    print(pretty_dic)

import torch
# import json  # results in non-serializable errors for torch.Tensors
from pprint import pprint

dic = {'x': torch.randn(1, 3), 'rec': {'y': torch.randn(1, 3)}}

my_pprint(dic)
pprint(dic)

Output:

{
    "rec": {
        "y": "tensor([[-0.3137,  0.3138,  1.2894]])"
    },
    "x": "tensor([[-1.5909,  0.0516, -1.5445]])"
}
{'rec': {'y': tensor([[-0.3137,  0.3138,  1.2894]])},
 'x': tensor([[-1.5909,  0.0516, -1.5445]])}
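If the numeric values are needed rather than their string representations, one possible variant is a custom `json.JSONEncoder`. This is only a sketch under stated assumptions: `TensorEncoder` is a hypothetical name, it relies on the leaf object exposing `tolist()` (as NumPy arrays and `torch.Tensor` do) or `numpy()` (as TensorFlow eager tensors do), and `array.array` from the standard library stands in for a tensor so that the example runs without TensorFlow or PyTorch installed.

```python
import json
from array import array

class TensorEncoder(json.JSONEncoder):
    """Hypothetical encoder: turns array-like leaves into (nested) lists."""

    def default(self, obj):
        # NumPy arrays and torch.Tensor expose tolist().
        if hasattr(obj, "tolist"):
            return obj.tolist()
        # TensorFlow eager tensors expose numpy(), whose result has tolist().
        if hasattr(obj, "numpy"):
            return obj.numpy().tolist()
        # Anything else falls through to the standard TypeError.
        return super().default(obj)

# array.array is a stand-in for a tensor; swap in real tensors as needed.
data = {"x": array("i", [1, 4, 9]), "rec": {"y": array("i", [2, 3])}}
text = json.dumps(data, cls=TensorEncoder, sort_keys=True)
print(text)  # -> {"rec": {"y": [2, 3]}, "x": [1, 4, 9]}
```

Unlike the string-based approach, the values survive a round trip through `json.loads` as lists of numbers, at the cost of losing the tensor type and dtype.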
