python - Capture all stdout/stderr within structlog to generate JSON logs

Question

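I'm currently trying to get away from print() and start centralized log collection with the ELK stack, using the structlog module to generate structured JSON log lines. This works perfectly well for modules I wrote myself: I have a loggingHelper module that I can import and then use with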

logger = Logger()

in other modules and scripts. Here is the class from the loggingHelper module:

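import getpass
import inspect
import logging
import os
import sys

import structlog
from structlog.stdlib import LoggerFactory


class Logger: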
    """
    Wrapper Class to import within other modules and scripts
    All the config and log binding (script
    """
    def __init__(self):
        self.__log = None
        logging.basicConfig(level=logging.DEBUG, format='%(message)s')
        structlog.configure(logger_factory=LoggerFactory(),
                            processors=[structlog.stdlib.add_log_level,
                            structlog.processors.TimeStamper(fmt="iso"),
                            structlog.processors.JSONRenderer()])
        logger = structlog.get_logger()
        main_script = os.path.basename(sys.argv[0]) if sys.argv[0] else None
        frame = inspect.stack()[1]
        log_invocation = os.path.basename(frame[0].f_code.co_filename)
        user = getpass.getuser()

        """
        Who executed the __main__, what was the executed __main__ file, 
        where did the log event happen?
        """
        self.__log = logger.bind(executedScript = main_script,
                                 logBirth = log_invocation,
                                 executingUser = user)

    def info(self, msg, **kwargs):
        self.__log.info(msg, **kwargs)

    def debug(self, msg, **kwargs):
        self.__log.debug(msg, **kwargs)

    def error(self, msg, **kwargs):
        self.__log.error(msg, **kwargs)

    def warn(self, msg, **kwargs):
        self.__log.warning(msg, **kwargs)

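This produces nicely formatted output (one JSON object per line) that filebeat can read and forward to Elasticsearch. However, third-party libraries completely wreck the well-formatted log: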

{"executingUser": "xyz", "logBirth": "efood.py", "executedScript": "logAlot.py", "context": "SELECT displayname FROM point_of_sale WHERE name = '123'", "level": "debug", "timestamp": "2019-03-15T12:52:42.792398Z", "message": "querying local"}
{"executingUser": "xyz", "logBirth": "efood.py", "executedScript": "logAlot.py", "level": "debug", "timestamp": "2019-03-15T12:52:42.807922Z", "message": "query successful: got 0 rows"}
building service object
auth version used is: v4
Traceback (most recent call last):
  File "logAlot.py", line 26, in <module>
    ef.EfoodDataControllerMerchantCenter().get_displayname(123)
  File "/home/xyz/src/toolkit/commons/connectors/efood.py", line 1126, in get_displayname
    return efc.select_from_local(q)['displayname'].values[0]
IndexError: index 0 is out of bounds for axis 0 with size 0

As you can see, both the info-level and the error-level messages from the third-party library (googleapiclient) are printed without going through the log processors.

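What would be the best (and most Pythonic) way to capture and format everything that happens during the execution of a script, using the loggingHelper module I wrote? Is this even best practice?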

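Edit: Currently the logger itself writes to stdout, which is then redirected to a file in crontab using >> and 2>&1. If I wanted to redirect everything that third-party library logging writes to stdout/stderr back into my logger, that seems like bad practice to me, because it would lead to a loop, right? So my goal is not redirecting but capturing everything in my logging processor. I changed the title accordingly.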

Also, here is a rough overview of what I'm trying to achieve. I'm very open to general criticism and to suggestions that deviate from it.

score 1 · Accepted Answer

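First things first: you should NOT do any logger configuration (logging.basicConfig, logging.config.dictConfig, etc.) in your class initializer. The logging configuration should be done once, and only once, at process startup. The whole point of the logging module is to completely decouple logging calls from logging configuration.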
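To illustrate the "configure once at startup" point, here is a minimal stdlib-only sketch. The names `configure_logging`, `do_work` and `toolkit.example` are hypothetical and not taken from the question's code; the guard on existing handlers is just one simple way to make a repeated call harmless.

```python
import logging
import sys


def configure_logging(level=logging.DEBUG):
    """Configure the root logger. Intended to be called once, at process startup."""
    root = logging.getLogger()
    if root.handlers:
        # The root logger already has handlers, so a second call changes nothing.
        return
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(message)s"))
    root.addHandler(handler)
    root.setLevel(level)


# Library-style code only obtains a logger; it never configures anything.
log = logging.getLogger("toolkit.example")  # hypothetical module name


def do_work():
    log.info("working")


if __name__ == "__main__":
    configure_logging()  # the entry point is the only place that configures logging
    do_work()
```

With this split, importing the module from several scripts never stacks up handlers, and each entry-point script decides where the log lines go.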

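Second point: I'm no structlog expert (and that's an understatement; this is actually the first time I've heard of this package), but the result you get is what your code snippet should be expected to produce: only your own code uses structlog, while all other libraries (stdlib or third-party) still use the stdlib logger and emit plain-text logs.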
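One detail in the sample output is worth separating out: the Traceback comes from an uncaught exception, which the interpreter prints to stderr on its own, without involving any logger. A minimal sketch of routing it through logging with `sys.excepthook`, assuming that is what you want; the logger name `uncaught` and both function names are hypothetical.

```python
import logging
import sys

log = logging.getLogger("uncaught")  # hypothetical logger name


def log_uncaught_exception(exc_type, exc_value, exc_traceback):
    """Send an uncaught exception through logging instead of raw stderr."""
    if issubclass(exc_type, KeyboardInterrupt):
        # Keep Ctrl-C behaving as usual.
        sys.__excepthook__(exc_type, exc_value, exc_traceback)
        return
    log.error("uncaught exception", exc_info=(exc_type, exc_value, exc_traceback))


def install_excepthook():
    """Replace the default hook; call this once at process startup."""
    sys.excepthook = log_uncaught_exception
```

Whatever formatter is attached to the handlers then decides how the traceback is rendered, so it ends up in the same format as every other record.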

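From what I've seen in the structlog docs, it seems to provide a way to wrap the stdlib's loggers using structlog.stdlib.LoggerFactory and to add specific formatters to get more consistent output. I haven't tested this (yet), and the official docs are a bit sparse and lack usable practical examples (at least I couldn't find any), but this article seems to have a more explicit example (to be adapted to your own context and use case, of course).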
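The underlying idea can be sketched with the standard library alone: attach a single JSON formatter to the root logger, so that records from any library that uses stdlib logging are rendered as one JSON object per line. This is deliberately not structlog's own mechanism (its documentation describes `structlog.stdlib.ProcessorFormatter` for that purpose). `JsonFormatter` and `install_json_root_handler` are hypothetical names, and the key names are only chosen to resemble the question's output.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render any stdlib LogRecord as a single JSON line."""

    def format(self, record):
        event = {
            "logger": record.name,
            "level": record.levelname.lower(),
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the traceback inside the JSON object, not as loose lines.
            event["exception"] = self.formatException(record.exc_info)
        return json.dumps(event)


def install_json_root_handler(stream=sys.stderr):
    """Attach one JSON handler to the root logger; call once at startup."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(logging.DEBUG)
    return handler
```

Because third-party loggers propagate to the root logger by default, a record emitted through `logging.getLogger("googleapiclient.discovery")` would pass through this formatter without the library knowing about it. Output written with plain print() is not affected.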

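Caveat: as I said, I've never used structlog (this is the first time I've heard of the library), so I may have misunderstood something, and you will of course have to experiment to find out how to configure the whole thing so that it works as expected.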

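As a side note: on Unix-like systems, stdout is supposed to carry the program's output (by which I mean the "expected output", i.e. the program's actual results), whereas all error, reporting and debugging messages belong on stderr. Unless you have compelling reasons not to, you should try to stick to this convention (at least for command-line tools, so that you can chain and pipe them the Unix way).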
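A small sketch of that convention, with diagnostics going to stderr through logging and the actual result going to stdout. The function `run`, the logger name `cli_example` and the messages are hypothetical; the streams are parameters only so the behaviour is easy to check.

```python
import logging
import sys


def run(out=None, err=None):
    """Write the program's result to `out` and its diagnostics to `err`."""
    out = out if out is not None else sys.stdout
    err = err if err is not None else sys.stderr

    log = logging.getLogger("cli_example")  # hypothetical logger name
    handler = logging.StreamHandler(err)
    log.addHandler(handler)
    log.setLevel(logging.DEBUG)
    try:
        log.debug("querying local")  # diagnostic message: goes to stderr
        out.write("displayname\n")   # actual result: goes to stdout
    finally:
        log.removeHandler(handler)


if __name__ == "__main__":
    run()
```

A crontab entry could then append only stderr to the log file (2>>) while stdout stays free for piping into another tool.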

score 1 · Accepted Answer

Configuring the `logging` module

As you have already figured out, structlog requires the logging functionality that already exists in Python to be configured.

http://www.structlog.org/en/stable/standard-library.html

logging.basicConfig supports the stream and filename options:

https://docs.python.org/3/library/logging.html#logging.basicConfig

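You can specify a filename, and the logger will open a handle to that file and direct all of its output there. Depending on how you are set up, this could be the file you normally redirect to: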

import logging

logging.basicConfig(level=logging.DEBUG, format='%(message)s', filename='output.txt')

Or you can pass a StringIO object to basicConfig, read from it later, and then send the contents to whatever output destination you like:

import logging
import io

stream = io.StringIO()

logging.basicConfig(level=logging.DEBUG, format='%(message)s', stream=stream)
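Reading the buffer back could look like the sketch below. The helper name `capture_to_buffer` and the logger name `some.library` are hypothetical. Note that basicConfig does nothing when the root logger already has handlers; `force=True` (available from Python 3.8) removes the existing ones first and is used here on that assumption.

```python
import io
import logging


def capture_to_buffer():
    """Point the root logger at an in-memory buffer and return that buffer."""
    stream = io.StringIO()
    # force=True (Python 3.8+) replaces handlers left by an earlier configuration.
    logging.basicConfig(
        level=logging.DEBUG, format="%(message)s", stream=stream, force=True
    )
    return stream


if __name__ == "__main__":
    buffer = capture_to_buffer()
    logging.getLogger("some.library").info("building service object")
    # getvalue() returns everything written to the buffer so far.
    for line in buffer.getvalue().splitlines():
        print(line)
```

The buffer grows for as long as the process runs, so this suits short scripts or tests better than a long-lived service.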

You can read more about StringIO here:

https://docs.python.org/3/library/io.html#io.TextIOBase

As @bruno pointed out in his answer, don't do this in __init__, because you may end up calling this piece of code several times in the same process.
