
I built an API endpoint with Flask in which data is collected from and combined across other APIs. To do this efficiently I use multiprocessing. To keep track of everything, I want to log all steps with Google Stackdriver.

For some reason I keep getting errors when using Google Stackdriver inside my multiprocessing setup. The error, and the warning that follows it, that I get in my MWE are:

Pickling client objects is explicitly not supported.
Clients have non-trivial state that is local and unpickleable.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\...\Anaconda3\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "C:\Users\...\Anaconda3\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input

Minimal working example (Flask/API left out for simplicity):

project_name = 'budget_service'
message = 'This is a test'
labels = {
    'deployment': 'develop',
    'severity': 'info'
}

# Import libs
from google.cloud import logging
import multiprocessing as mp

# Initialize logging
logging_client = logging.Client()
logger = logging_client.logger(project_name)

# Function to write log
def writeLog(logger):
    logger.log_text(
        text = message,
        labels = labels
    )
    print('logger succeeded')

def testFunction():
    print('test')

# Run without mp
writeLog(logger)

# Run with mp
print(__name__)
if __name__ == '__main__':       
    try:
        print('mp started')

        # Initialize
        manager = mp.Manager()
        return_dict = manager.dict()
        jobs = []

        # Set up workers
        worker_log1 = mp.Process(name='testFunction', target=testFunction, args=[])
        worker_log2 = mp.Process(name='writeLog', target=writeLog, args=[logger])

        # Store in jobs
        jobs.append(worker_log1)
        jobs.append(worker_log2)


        # Start workers
        worker_log1.start()
        worker_log2.start()

        for job in jobs:
            job.join()

        print('mp succeeded')

    except Exception as err:
         print(err)

Why can't multiprocessing be combined with Google Stackdriver? What should I adjust (what am I not understanding well) to make this work?


1 Answer


As of today (04.2019), Stackdriver logging still does not support multiprocessing. The solutions are:

  • Make sure your processes are started with the spawn start method rather than fork (the default on *nix), which prevents anything from being shared
  • Avoid sharing logging objects explicitly, by configuring them separately inside each process

With Google client libraries, fork-based multiprocessing is generally a bad idea; Stackdriver is not the only one that causes problems.
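A minimal sketch of that pattern applied to your MWE: force the spawn start method, pass only picklable data (strings, dicts) to the worker, and build the logger inside the worker. To keep the sketch runnable without credentials, it uses the standard-library `logging` module as a stand-in for the Stackdriver client; the comments mark where the real `google.cloud.logging` calls would go.

```python
import logging
import multiprocessing as mp

def write_log(project_name, message, labels):
    # Create the logging client *inside* the worker process, so no
    # unpicklable client object ever crosses the process boundary.
    # Real code would be:
    #   from google.cloud import logging as gcl
    #   logger = gcl.Client().logger(project_name)
    #   logger.log_text(message, labels=labels)
    logger = logging.getLogger(project_name)
    logger.warning('%s %s', message, labels)
    return True

if __name__ == '__main__':
    mp.set_start_method('spawn')  # default on Windows; explicit on *nix
    worker = mp.Process(
        target=write_log,
        args=('budget_service', 'This is a test', {'severity': 'info'}),
    )
    worker.start()
    worker.join()
    print('worker exitcode:', worker.exitcode)
```

Because only plain strings and a dict are pickled to the child, the `Pickling client objects is explicitly not supported` error disappears; each process owns its own client.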

answered 2019-04-27T23:41:58.253