python - Consecutive Ansible Runner calls get mixed up when they finish too quickly

Question

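I built a piece of software on top of the official Ansible Runner library. It receives multiple remote calls, each of which runs 1 or M playbooks 1 or N times... The Ansible run configuration is sequential, although that should be unrelated to the separate calls (if I understand correctly, it only orders the tasks within a single playbook run).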

So, I run the playbooks with Ansible Runner's run_async():

runner_async_thread, runner_object = ansible_runner.run_async(
                **{k: v for k, v in kwargs.items() if v is not None})

and keep looping on the async thread's is_alive() method while checking other conditions:

while runner_async_thread.is_alive():
    ...

If an exception is raised, or once the thread has finished, I just check the status result and return.
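The polling loop described above can be sketched with the standard library alone. This is a minimal illustration, not the asker's actual code: a plain threading.Thread stands in for the thread returned by run_async(), and the helper name wait_for_thread and the timeout used as the "other condition" are assumptions.

```python
import threading
import time


def wait_for_thread(thread, timeout_s=5.0, poll_interval_s=0.05):
    """Poll thread.is_alive() until the thread finishes or timeout_s elapses.

    Returns True if the thread finished, False if the timeout was hit first.
    """
    deadline = time.monotonic() + timeout_s
    while thread.is_alive():
        # Example of an "other condition" checked on every iteration.
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
    return True


# Hypothetical stand-in for the thread returned by ansible_runner.run_async().
worker = threading.Thread(target=time.sleep, args=(0.2,), daemon=True)
worker.start()
finished = wait_for_thread(worker)
```

Sleeping between checks keeps the loop from spinning at full CPU while the run is in progress.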

The problem is that when the system receives many calls at the same time, it gets mixed up, and I get errors like this:

The offending line appears to be:


{"username": "operator", "password": "!", "target": "sever_003_linux"}05_linux"}
                                                                      ^ here
We could be wrong, but this one looks like it might be an issue with
unbalanced quotes. If starting a value with a quote, make sure the
line ends with the same set of quotes. For instance this arbitrary
example:

    foo: "bad" "wolf"

Could be written as:

    foo: '"bad" "wolf"'

The error is clearly this:

    {"username": "new_user", "target": "sever_003_linux"}05_linux"}

I checked (the logs and the env/extravars file), but the command that was sent is correct:

{"username": "new_user", "target": "sever_003_linux"}

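So it looks as if some memory region is being overwritten without being cleared first, perhaps because 2 runners are running at the same time (which seems possible) and the code is not thread-safe? Do you have any ideas on how to fix this or prevent it from happening?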
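To make the symptom concrete, here is one way a trailing fragment like the one in the error can arise: a shorter payload written over a longer one without the old content being truncated. This only illustrates the pattern and is an assumption, not a confirmed diagnosis of what Ansible Runner does internally. The first payload (with sever_005_linux) is hypothetical.

```python
import json
import os
import tempfile

# Hypothetical payloads: a longer one from an earlier call, a shorter one from a later call.
first = '{"username": "operator", "password": "!", "target": "sever_005_linux"}'
second = '{"username": "new_user", "target": "sever_003_linux"}'

fd, path = tempfile.mkstemp(suffix=".json")
os.close(fd)
try:
    with open(path, "w") as fh:   # first writer: truncates the file, then writes
        fh.write(first)
    with open(path, "r+") as fh:  # second writer: overwrites from offset 0 without truncating
        fh.write(second)
    with open(path) as fh:
        content = fh.read()
finally:
    os.remove(path)

# content is now the second payload followed by the leftover tail of the first.
try:
    json.loads(content)
    parsed_ok = True
except json.JSONDecodeError:
    parsed_ok = False
```

The resulting text begins with the valid second payload and ends with stale characters from the first, so it no longer parses, which resembles the shape of the reported error.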

The code itself works, and the same calls succeed when I add some delay between them, but I don't think that is an ideal solution...

I have played around with the Ansible configuration, but with no luck.

ansible 2.9.6
python version = 3.8.10 (default, Jun  2 2021, 10:49:15) [GCC 9.4.0]

score 0 · Accepted Answer

I found that other people had reported this problem in this Jira story: https://jira.opencord.org/browse/CORD-922

Ansible, when used via its API, is not thread-safe.

They also suggest a way to work around it:

To be safe and avoid such issues, we will wrap the invocation of Ansible in a process by calling fork() before use.

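However, in my case I have to return the result of the operation so that it can be reported. So I declared a shared queue to communicate with the child process, and forked the main process.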

import ansible_runner
from multiprocessing import Queue
import os

#...

def playbook_run(self, playbook_name, parameters):
    #...
    runner_async_thread, runner_object = ansible_runner.run_async(
        **{k: v for k, v in kwargs.items() if v is not None})
    while runner_async_thread.is_alive():
        ...  # check other conditions while the run is in progress
    return run_result


# The lines below live inside the method that handles each incoming call.
shared_queue = Queue()
process_pid = os.fork()
if process_pid == 0:  # the forked child process will independently run & report
    run_result = self.playbook_run(playbook_name,
                                   parameters)
    shared_queue.put(run_result)
    shared_queue.close()
    shared_queue.join_thread()
    os._exit(0)
else:  # the parent process will wait until it gets the report
    run_result = shared_queue.get()
    return run_result

And, assuming the lack of thread safety was the cause, the problem is solved.
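The same process-per-call idea can also be sketched with multiprocessing.Process instead of a raw os.fork(). This is a minimal sketch under stated assumptions, not the code from the answer: fake_playbook_run is a hypothetical stand-in for the real Ansible Runner call, run_isolated and _child are illustrative names, and the "fork" start method is Unix-only.

```python
import multiprocessing as mp


def fake_playbook_run(playbook_name, parameters):
    # Hypothetical stand-in for the function that calls ansible_runner.
    return {"playbook": playbook_name,
            "target": parameters["target"],
            "status": "successful"}


def _child(queue, playbook_name, parameters):
    # Runs in the child process; always reports something back to the parent.
    try:
        queue.put(("ok", fake_playbook_run(playbook_name, parameters)))
    except Exception as exc:
        queue.put(("error", repr(exc)))


def run_isolated(playbook_name, parameters):
    ctx = mp.get_context("fork")  # Unix-only; mirrors the os.fork() approach
    queue = ctx.Queue()
    proc = ctx.Process(target=_child, args=(queue, playbook_name, parameters))
    proc.start()
    status, payload = queue.get()  # read before join() so a large result cannot block the child
    proc.join()
    if status == "error":
        raise RuntimeError(payload)
    return payload


result = run_isolated("demo.yml", {"target": "sever_003_linux"})
```

Because each call runs in its own process, two runs no longer share in-process state. Note that queue.get() would block forever if the child died before reporting, so a real implementation may want a timeout there.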

Since I don't think it had been reported before, I opened an issue on the Ansible Runner developers' GitHub: https://github.com/ansible/ansible-runner/issues/808
