python - How do I JSON serialize an object from the Google Natural Language API? (No __dict__ attribute)

Question

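I'm using the Google Natural Language API in a project that tags text with sentiment analysis. I want to store my NL results as JSON. If I make a direct HTTP request to Google, a JSON response is returned.

However, when I use the provided Python library, an object is returned instead, and that object is not directly JSON serializable.

Here is a sample of my code: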

import os
import sys
import oauth2client.client
from google.cloud.gapic.language.v1beta2 import enums, language_service_client
from google.cloud.proto.language.v1beta2 import language_service_pb2

class LanguageReader:
    # class that parses, stores and reports language data from text

    def __init__(self, content=None):

        try:
            # attempts to authenticate credentials from env variable
            oauth2client.client.GoogleCredentials.get_application_default()
        except oauth2client.client.ApplicationDefaultCredentialsError:
            print("=== ERROR: Google credentials could not be authenticated! ===")
            # .get() avoids a KeyError when the variable is not set at all
            print("Current environment variable for this process is: {}".format(os.environ.get('GOOGLE_APPLICATION_CREDENTIALS')))
            print("Run:")
            print("   $ export GOOGLE_APPLICATION_CREDENTIALS=/YOUR_PATH_HERE/YOUR_JSON_KEY_HERE.json")
            print("to set the authentication credentials manually")
            sys.exit()

        self.language_client = language_service_client.LanguageServiceClient()
        self.document = language_service_pb2.Document()
        self.document.type = enums.Document.Type.PLAIN_TEXT
        self.encoding = enums.EncodingType.UTF32

        self.results = None

        if content is not None:
            self.read_content(content)

    def read_content(self, content):
        self.document.content = content
        # call the API once and keep the response
        self.results = self.language_client.analyze_sentiment(self.document, self.encoding)

Now, if you run:

sample_text = "I love R&B music. Marvin Gaye is the best. 'What's Going On' is one of my favorite songs. It was so sad when Marvin Gaye died."
resp = LanguageReader(sample_text).results
print(resp)

You get:

document_sentiment {
  magnitude: 2.40000009537
  score: 0.40000000596
}
language: "en"
sentences {
  text {
    content: "I love R&B music."
  }
  sentiment {
    magnitude: 0.800000011921
    score: 0.800000011921
  }
}
sentences {
  text {
    content: "Marvin Gaye is the best."
    begin_offset: 18
  }
  sentiment {
    magnitude: 0.800000011921
    score: 0.800000011921
  }
}
sentences {
  text {
    content: "\'What\'s Going On\' is one of my favorite songs."
    begin_offset: 43
  }
  sentiment {
    magnitude: 0.40000000596
    score: 0.40000000596
  }
}
sentences {
  text {
    content: "It was so sad when Marvin Gaye died."
    begin_offset: 90
  }
  sentiment {
    magnitude: 0.20000000298
    score: -0.20000000298
  }
}

This is not JSON. It is an instance of google.cloud.proto.language.v1beta2.language_service_pb2.AnalyzeSentimentResponse. It has no __dict__ attribute, so it cannot be serialized with json.dumps().
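
To make the failure concrete, here is a minimal sketch that reproduces it without the Google library. SlotsOnly is a hypothetical stand-in class (not the real response type); it only mimics an object that has no __dict__, so both plain json.dumps() and the common default=lambda o: o.__dict__ trick fail.

```python
import json


class SlotsOnly(object):
    """Hypothetical stand-in for an object that has no __dict__."""
    __slots__ = ("score", "magnitude")

    def __init__(self, score, magnitude):
        self.score = score
        self.magnitude = magnitude


obj = SlotsOnly(0.4, 2.4)
has_dict = hasattr(obj, "__dict__")  # False, because of __slots__

# json.dumps() does not know how to encode arbitrary objects.
try:
    json.dumps(obj)
    plain_error = None
except TypeError as exc:
    plain_error = exc

# The usual fallback of dumping __dict__ fails too, since there is none.
try:
    json.dumps(obj, default=lambda o: o.__dict__)
    dict_error = None
except AttributeError as exc:
    dict_error = exc

print(has_dict, type(plain_error).__name__, type(dict_error).__name__)
```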

How can I either request the response as JSON or serialize the object to JSON?

score 6 · Accepted Answer

Edit: @Zach pointed out Google's protobuf data interchange format. The preferred option seems to be the functions in google.protobuf.json_format:

from google.protobuf.json_format import MessageToDict, MessageToJson 

self.dict = MessageToDict(self.results)
self.json = MessageToJson(self.results)

From the docstring:

MessageToJson(message, including_default_value_fields=False, preserving_proto_field_name=False)
    Converts protobuf message to JSON format.

    Args:
      message: The protocol buffers message instance to serialize.
      including_default_value_fields: If True, singular primitive fields,
          repeated fields, and map fields will always be serialized.  If
          False, only serialize non-empty fields.  Singular message fields
          and oneof fields are not affected by this option.
      preserving_proto_field_name: If True, use the original proto field
          names as defined in the .proto file. If False, convert the field
          names to lowerCamelCase.

    Returns:
      A string containing the JSON formatted protocol buffer message.
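
As a rough illustration of the two calls and of the preserving_proto_field_name flag, here is a small sketch that runs without API credentials. It assumes the protobuf package is installed and uses SourceContext, a message type bundled with protobuf, purely as a stand-in for the sentiment response. The file name is made up. Keyword arguments other than preserving_proto_field_name may differ between protobuf versions, so the sketch avoids them.

```python
import json

from google.protobuf import source_context_pb2
from google.protobuf.json_format import MessageToDict, MessageToJson

# Stand-in message (assumption): any protobuf message should behave the same way.
message = source_context_pb2.SourceContext(file_name="example.proto")

# Default behaviour: field names are converted to lowerCamelCase.
camel = MessageToDict(message)

# Keep the names exactly as written in the .proto file.
snake = MessageToDict(message, preserving_proto_field_name=True)

# MessageToJson returns a JSON string, which json.loads can read back.
as_json = MessageToJson(message)
round_trip = json.loads(as_json)

print(camel)
print(snake)
print(as_json)
```

With the sentiment response from the question, the same default conversion would presumably turn document_sentiment into documentSentiment, so pass preserving_proto_field_name=True if you want the stored keys to match the printed output.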
