6

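I am querying Stackdriver for GCP logs using Python 3. Unfortunately, the log entries that contain the important data come back as "NoneType" instead of "dict" or "str". The resulting "entry.payload" is of type "None", and "entry.payload_pb" has the data I want, but it is garbled.

Is there a way to get Stackdriver to return this data in a clean format, or a way to parse it? If not, is there a better way to query this data than what I am doing, one that yields clean data?

My code looks something like this: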

#!/usr/bin/python3

from google.cloud.logging import Client, ASCENDING, DESCENDING
from google.oauth2.service_account import Credentials

projectName = 'my_project'
myFilter = 'logName="projects/' + projectName + '/logs/compute.googleapis.com%2Factivity_log"'

client = Client(project = projectName)
entries = client.list_entries(order_by=DESCENDING, page_size = 500, filter_ = myFilter)
for entry in entries:
    if isinstance(entry.payload, (dict, str)):
        print(entry.payload)
    elif entry.payload is None:
        # isinstance() requires a type as its second argument, so compare with "is None"
        print(entry.payload_pb)

The "entry.payload_pb" data always starts like this:

type_url: "type.googleapis.com/google.cloud.audit.AuditLog"
 value: "\032;\n9gcp-user@my-project.iam.gserviceaccount.com"I\n\r129.105.16.28\0228
4

5 Answers

1

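LogEntry.proto_payload is an Any message that encodes some other protocol buffer message. The type of that message is given by type_url, and the message body is serialized into the value field. Once you have determined the type, you can deserialize it with something like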

from google.cloud.audit import AuditLog
...

audit_log = AuditLog()
audit_log.ParseFromString(entry.payload_pb.value)

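The AuditLog message is available at https://github.com/googleapis/googleapis/blob/master/google/cloud/audit/audit_log.proto, and the corresponding Python definitions can be built with the protoc compiler.

Note that some fields of the AuditLog message can themselves contain other Any messages. https://cloud.google.com/logging/docs/audit/api/ has more details.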
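The "determine the type" step can be sketched without any protobuf dependency: the fully qualified message name is whatever follows the last slash in type_url. The helper name below is hypothetical, and the sample URL is the one shown in the question's output.

```python
def message_name_from_type_url(type_url: str) -> str:
    """Return the fully qualified proto message name encoded in an Any type_url."""
    # The message name is the path segment after the last '/'.
    return type_url.rsplit("/", 1)[-1]


# Sample value taken from the question's payload_pb output.
sample = "type.googleapis.com/google.cloud.audit.AuditLog"
name = message_name_from_type_url(sample)
print(name)
```

Once the name is known, you can pick the matching generated class and call ParseFromString on the value field, as in the snippet above.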

answered 2018-05-18T19:13:02.837
1

In case anyone has the same issue that I had, here's how I solved it:

1) Download and install protobuf. I did this on a Mac with brew (brew install protobuf)
2) Download and install grpcio. I used pip install grpcio
3) Download the "Google APIs" to a known directory. I used /tmp, and ran this command: git clone https://github.com/googleapis/googleapis
4) Change directories to the root directory of the repository you just downloaded in Step 3
5) Use protoc to build the Python module. This command worked for me
protoc -I=/tmp/googleapis/ --python_out=/tmp/ /tmp/googleapis/google/cloud/audit/audit_log.proto
6) Your audit_log_pb2.py file should exist in /tmp/audit_log_pb2.py
7) Place this file in the proper path OR in the same directory as your script.
8) Add this line to the imports in your script:
import audit_log_pb2
9) After I did this, the entry.payload portion of the Protobuf entry was consistently populated with dicts.

PLEASE NOTE: You should verify what version of protoc you are using with the following command protoc --version. You really want to use protoc 3.x, because the file we are building from is from version 3 of the spec. The Ubuntu package I installed on a Linux box was version 2, and this was kind of frustrating. Also, although this file was built for Python 2.x, it seems to work fine with Python 3.x.
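For anyone who would rather script step 5, here is a minimal sketch that wraps the same protoc invocation in Python. The function names are hypothetical, and the default paths simply mirror the /tmp locations used in the steps above.

```python
import shutil
import subprocess


def build_protoc_command(repo_root: str, out_dir: str) -> list:
    """Build the protoc invocation from step 5 as an argument list."""
    proto = f"{repo_root.rstrip('/')}/google/cloud/audit/audit_log.proto"
    return [
        "protoc",
        f"-I={repo_root}",
        f"--python_out={out_dir}",
        proto,
    ]


def compile_audit_log(repo_root: str = "/tmp/googleapis", out_dir: str = "/tmp") -> None:
    """Run protoc; raise if the compiler is not on PATH or exits with an error."""
    if shutil.which("protoc") is None:
        raise RuntimeError("protoc not found on PATH")
    subprocess.run(build_protoc_command(repo_root, out_dir), check=True)


# Show the command that would be run, without executing it.
print(" ".join(build_protoc_command("/tmp/googleapis", "/tmp")))
```

Note that protoc typically mirrors the proto file's path relative to the -I root underneath the output directory, so if the generated module is not where you expect, look in the matching subdirectory.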

answered 2018-05-25T17:32:29.427
1

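I had actually missed this, but you can disable gRPC and make the API return a dict (JSON) payload by setting the GOOGLE_CLOUD_DISABLE_GRPC environment variable to a non-empty string, e.g. GOOGLE_CLOUD_DISABLE_GRPC=true

This populates payload instead of payload_pb, which is easier than compiling protocol buffers that may be out of date!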
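A minimal sketch of doing this from inside the script rather than the shell. The function name is hypothetical, and setting the variable before the import is a conservative assumption, since the library may read it at import time.

```python
import os

# Assumption: set this before google.cloud.logging is imported.
# Any non-empty string disables gRPC.
os.environ["GOOGLE_CLOUD_DISABLE_GRPC"] = "true"


def print_payloads(project: str, log_filter: str) -> None:
    """List entries over the JSON/HTTP transport and print their payloads."""
    # Imported lazily so the environment variable above is already in place.
    from google.cloud.logging import Client

    client = Client(project=project)
    for entry in client.list_entries(filter_=log_filter):
        print(entry.payload)
```

Calling print_payloads('my_project', myFilter) with the filter from the question should then print dict payloads, assuming credentials are configured.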

answered 2018-08-28T00:38:01.080
1

It looks like something in the Python library related to parsing protobuf for logging is broken. I found two old issues:

  1. https://github.com/GoogleCloudPlatform/google-cloud-python/issues/3218
  2. https://github.com/GoogleCloudPlatform/google-cloud-python/issues/2674

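These seem to have been fixed a while ago, but I believe the problem has been reintroduced. I have opened a ticket with Google support for this issue, and they are looking into it.

As a workaround, you have two options:

  1. You can create an export (sink) to BigQuery, so that you can query the logs easily. The problem with this approach is that it does not export older data collected before you created the export.
  2. You can use the gcloud command, in particular

    gcloud logging read

It is very powerful (it supports filters and timestamps), but its output format is YAML. You can install and use the PyYAML library to convert the logs to dicts.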
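As a variation on the second option, here is a sketch that asks gcloud for JSON via its --format=json flag and parses it with the standard library instead of PyYAML. This swaps JSON in for the YAML approach described above. The function names are hypothetical, and the sample string is made up purely to exercise the parser.

```python
import json
import subprocess


def build_read_command(log_filter: str, project: str, limit: int = 10) -> list:
    """Build a 'gcloud logging read' invocation that emits JSON."""
    return [
        "gcloud", "logging", "read", log_filter,
        f"--project={project}",
        f"--limit={limit}",
        "--format=json",
    ]


def parse_entries(raw_json: str) -> list:
    """Parse gcloud's JSON output (an array of log entries) into a list of dicts."""
    return json.loads(raw_json)


def read_entries(log_filter: str, project: str, limit: int = 10) -> list:
    """Run gcloud and return the parsed entries; requires gcloud on PATH."""
    result = subprocess.run(
        build_read_command(log_filter, project, limit),
        check=True, capture_output=True, text=True,
    )
    return parse_entries(result.stdout)


# Made-up sample standing in for real gcloud output.
sample = '[{"logName": "projects/my_project/logs/example", "protoPayload": {"methodName": "example.method"}}]'
print(parse_entries(sample)[0]["protoPayload"]["methodName"])
```

Only read_entries needs gcloud installed and authenticated. The other two functions can be tested on their own.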

answered 2018-05-12T00:43:49.353
0

I followed @rhinestone-cowguy's answer, but thought an example of usage would help people who find this answer. To use the compiled (proto) code:

from google.cloud import logging
import audit_log_pb2

client = logging.Client()
PROJECT_IDS = ["one-project", "another-project"]

for entry in client.list_entries(projects=PROJECT_IDS):  # API call(s)

    # The proto payload is an Any message.
    audit_log = audit_log_pb2.AuditLog()
    entry.payload.Unpack(audit_log)
    print(audit_log)

Usage of Any messages is documented in Python Generated Code.

answered 2019-08-06T10:54:01.360