概括
我正在尝试了解 Prefect docker 代理。为此,我试图在我的本地机器上配置一个最小设置。我已经设法让 docker 代理连接到本地服务器,看起来它正在运行流程。但是,似乎在流程完成后,代理无法在服务器上更新流程的状态,因为它无法连接到服务器后端。
细节
这是我的流程:
import prefect
from prefect import task, Flow
from prefect.run_configs import DockerRun
@task
def say_hello():
logger = prefect.context.get("logger")
logger.info("Hello, docker!")
with Flow("docker-hello-flow") as flow:
flow.run_config = DockerRun()
say_hello()
# Register the flow under the "tutorial" project
flow.register(project_name="tutorial")
我将后端配置为使用本地核心服务器:
prefect backend server
然后我启动服务器:
prefect server start -d
我连接到服务器 UIlocalhost:8080
并确认它正在运行。
在 UI 中,我创建了项目tutorial
。
然后我注册流程:
:; python src/hello_docker.py
Flow URL: http://localhost:8080/default/flow/fea8211e-c243-40c8-a01e-f63ab2afcc77
└── ID: 0a7a6cc4-1e7b-4e71-a900-90dffb4362a9
└── Project: tutorial
└── Labels: ['parami']
然后,我确认流程按预期显示在 UI 中。请注意,我的机器名称parami
因此是 label parami
。
然后我启动一个指定标签的本地 docker 代理parami
。
:; prefect agent docker start -l parami --log-level DEBUG --show-flow-logs
然后我通过 UI 运行流程。流程运行称为enigmatic-axolotl
。
docker代理的日志如下:
[2021-11-16 22:47:17,076] DEBUG - agent | No ready flow runs found.
[2021-11-16 22:47:17,076] DEBUG - agent | Sleeping flow run poller for 2.0 seconds...
[2021-11-16 22:47:18,506] DEBUG - agent | {'status': 'Pulling from prefecthq/prefect', 'id': '0.15.7'}
[2021-11-16 22:47:18,509] DEBUG - agent | {'status': 'Digest: sha256:e3f6dece4c8d5d7b289cb6e017d3a3b0617d3084df57c2bb96999a2b3c2470f0'}
[2021-11-16 22:47:18,510] DEBUG - agent | {'status': 'Status: Image is up to date for prefecthq/prefect:0.15.7'}
[2021-11-16 22:47:18,513] INFO - agent | Successfully pulled image prefecthq/prefect:0.15.7
[2021-11-16 22:47:18,513] DEBUG - agent | Creating Docker container prefecthq/prefect:0.15.7
[2021-11-16 22:47:18,578] DEBUG - agent | Starting Docker container with ID 68fea87be44f0332568b2a01e391c5e1822f58f43438df9bb81e228a8edc9625 and name 'enigmatic-axolotl'
[2021-11-16 22:47:18,976] DEBUG - agent | Docker container 68fea87be44f0332568b2a01e391c5e1822f58f43438df9bb81e228a8edc9625 started
[2021-11-16 22:47:18,977] INFO - agent | Completed deployment of flow run b1df1042-96a7-4a09-aee0-820468eccf87
[2021-11-16 22:47:19,076] DEBUG - agent | Querying for ready flow runs...
[2021-11-16 22:47:19,101] DEBUG - agent | No ready flow runs found.
[2021-11-16 22:47:19,101] DEBUG - agent | Sleeping flow run poller for 4.0 seconds...
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 175, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 96, in create_connection
raise err
File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 86, in create_connection
sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 706, in urlopen
chunked=chunked,
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 394, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 239, in request
super(HTTPConnection, self).request(method, url, body=body, headers=headers)
File "/usr/local/lib/python3.7/http/client.py", line 1281, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1327, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1276, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/local/lib/python3.7/http/client.py", line 1036, in _send_output
self.send(msg)
File "/usr/local/lib/python3.7/http/client.py", line 976, in send
self.connect()
File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 205, in connect
conn = self._new_conn()
File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 187, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7f2e7172e690>: Failed to establish a new connection: [Errno 111] Connection refused
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 756, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 574, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='host.docker.internal', port=4200): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2e7172e690>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/prefect", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.7/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/prefect/cli/execute.py", line 53, in flow_run
result = client.graphql(query)
File "/usr/local/lib/python3.7/site-packages/prefect/client/client.py", line 554, in graphql
retry_on_api_error=retry_on_api_error,
File "/usr/local/lib/python3.7/site-packages/prefect/client/client.py", line 458, in post
retry_on_api_error=retry_on_api_error,
File "/usr/local/lib/python3.7/site-packages/prefect/client/client.py", line 738, in _request
session=session, method=method, url=url, params=params, headers=headers
File "/usr/local/lib/python3.7/site-packages/prefect/client/client.py", line 606, in _send_request
timeout=prefect.context.config.cloud.request_timeout,
File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 590, in post
return self.request('POST', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 542, in request
resp = self.send(prep, **send_kwargs)
File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 655, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 516, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='host.docker.internal', port=4200): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f2e7172e690>: Failed to establish a new connection: [Errno 111] Connection refused'))
[2021-11-16 22:47:23,101] DEBUG - agent | Querying for ready flow runs...
[2021-11-16 22:47:23,181] DEBUG - agent | No ready flow runs found.
[2021-11-16 22:47:23,181] DEBUG - agent | Sleeping flow run poller for 8.0 seconds...
因此,代理成功地让流enigmatic-axolotl
从服务器运行,并且似乎完成了执行。我的理解是,它随后尝试连接到服务器以更新流运行的状态。但是,它没有这样做,因为它无法连接到host.docker.internal:4200
.
我想知道是否host.docker.internal
是一个有效的主机,所以我用选项重新启动了代理-a http://localhost:4200
。代理成功连接到服务器localhost:4200
(在日志中报告这样做)但是,当我再次运行流程时,我得到与以前相同的错误;也就是说,它无法连接到host.docker.internal:4200
.
最后,我用-a http://0.0.0.0:4200
. 再次,它成功连接到服务器。然后我重新运行流程,它再次失败。但是,这一次它试图连接到0.0.0.0:4200
:
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='0.0.0.0', port=4200): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f70738948d0>: Failed to establish a new connection: [Errno 111] Connection refused'))
我错过了什么?我假设我需要设置一些配置才能完成这项工作。