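My team and I are setting up pipelines in GCP, and we are trying to learn by running the notebook tutorial https://www.tensorflow.org/tfx/tutorials/tfx/cloud-ai-platform-pipelines. However, when we reach the step that creates the pipeline, we get the error below. Please help!
We ran: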
!tfx pipeline create \
--pipeline-path=kubeflow_dag_runner.py \
--endpoint={ENDPOINT} \
--build-target-image={CUSTOM_TFX_IMAGE}
We got:
2021-02-09 08:21:49.170213: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/cuda/lib64:/usr/local/nccl2/lib:/usr/local/cuda/extras/CUPTI/lib64
2021-02-09 08:21:49.170263: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
CLI
Creating pipeline
Detected Kubeflow.
Use --engine flag if you intend to use a different orchestrator.
Reading build spec from build.yaml
Target image gcr.io/ts-ntnu-v2021-stm-ep3j/tfx-pipeline is not used. If the build spec is provided, update the target image in the build spec file build.yaml.
[Skaffold] Generating tags...
[Skaffold] - gcr.io/ts-ntnu-v2021-stm-ep3j/tfx-pipeline -> gcr.io/ts-ntnu-v2021-stm-ep3j/tfx-pipeline:latest
[Skaffold] Checking cache...
[Skaffold] - gcr.io/ts-ntnu-v2021-stm-ep3j/tfx-pipeline: Not found. Building
[Skaffold] Building [gcr.io/ts-ntnu-v2021-stm-ep3j/tfx-pipeline]...
[Skaffold] Sending build context to Docker daemon 2.056MB
[Skaffold] Step 1/4 : FROM tensorflow/tfx:0.26.1
[Skaffold] ---> 6dd91a0791af
[Skaffold] Step 2/4 : WORKDIR /pipeline
[Skaffold] ---> Using cache
[Skaffold] ---> 7882f4facc06
[Skaffold] Step 3/4 : COPY ./ ./
[Skaffold] ---> 2dbfe44eb3f1
[Skaffold] Step 4/4 : ENV PYTHONPATH="/pipeline:${PYTHONPATH}"
[Skaffold] ---> Running in b6bbdb97a2df
[Skaffold] Removing intermediate container b6bbdb97a2df
[Skaffold] ---> d7d56f13fe6d
[Skaffold] Successfully built d7d56f13fe6d
[Skaffold] Successfully tagged gcr.io/ts-ntnu-v2021-stm-ep3j/tfx-pipeline:latest
[Skaffold] The push refers to repository [gcr.io/ts-ntnu-v2021-stm-ep3j/tfx-pipeline]
[Skaffold] 06e11ce4eea3: Preparing
[Skaffold] ab1902317977: Preparing
[Skaffold] 1a67ae26cf47: Preparing
[Skaffold] 25e69afdb83b: Preparing
[Skaffold] 2bd41d6594e3: Preparing
[Skaffold] 8e486d328b86: Preparing
[Skaffold] 8f42d0a1a747: Preparing
[Skaffold] 4058ae03fa32: Preparing
[Skaffold] e3437c61d457: Preparing
[Skaffold] 84ff92691f90: Preparing
[Skaffold] 54b00d861a7a: Preparing
[Skaffold] c547358928ab: Preparing
[Skaffold] 84ff92691f90: Preparing
[Skaffold] c4e66be694ce: Preparing
[Skaffold] 47cc65c6dd57: Preparing
[Skaffold] 8e486d328b86: Waiting
[Skaffold] 8f42d0a1a747: Waiting
[Skaffold] 4058ae03fa32: Waiting
[Skaffold] e3437c61d457: Waiting
[Skaffold] 84ff92691f90: Waiting
[Skaffold] 54b00d861a7a: Waiting
[Skaffold] c547358928ab: Waiting
[Skaffold] 47cc65c6dd57: Waiting
[Skaffold] c4e66be694ce: Waiting
[Skaffold] Build Failed. No push access to specified image repository. Trying running with `--default-repo` flag.
No container image is built.
Traceback (most recent call last):
  File "/opt/conda/bin/tfx", line 10, in <module>
    sys.exit(cli_group())
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/decorators.py", line 73, in new_func
    return ctx.invoke(f, obj, *args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/jupyter/.local/lib/python3.7/site-packages/tfx/tools/cli/commands/pipeline.py", line 117, in create_pipeline
    handler_factory.create_handler(ctx.flags_dict).create_pipeline()
  File "/home/jupyter/.local/lib/python3.7/site-packages/tfx/tools/cli/handler/kubeflow_handler.py", line 75, in create_pipeline
    skaffold_cmd)
  File "/home/jupyter/.local/lib/python3.7/site-packages/tfx/tools/cli/handler/kubeflow_handler.py", line 291, in _build_pipeline_image
    skaffold_cmd=skaffold_cmd).build()
  File "/home/jupyter/.local/lib/python3.7/site-packages/tfx/tools/cli/container_builder/builder.py", line 92, in build
    image_sha = skaffold_cli.build(self._buildspec)
  File "/home/jupyter/.local/lib/python3.7/site-packages/tfx/tools/cli/container_builder/skaffold_cli.py", line 61, in build
    spec.filename))
RuntimeError: skaffold failed to build an image with build.yaml.
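The traceback itself only says that skaffold failed; the informative part appears to be the `[Skaffold] Build Failed` line further up, which reports missing push access to the image repository. As a minimal sketch for pulling that reason out of a long captured output (the helper name `find_skaffold_failure` and the `sample` string are our own illustration, not part of TFX or Skaffold), something like this could be used:

```python
import re


def find_skaffold_failure(log_text):
    """Return the text that follows '[Skaffold] Build Failed.' in the log, or None if absent."""
    for line in log_text.splitlines():
        # Hypothetical pattern, based only on the message format seen in the output above.
        match = re.search(r"\[Skaffold\]\s+Build Failed\.\s*(.*)", line)
        if match:
            return match.group(1).strip()
    return None


# A short excerpt of the output above, used here as sample input.
sample = (
    "[Skaffold] c4e66be694ce: Waiting\n"
    "[Skaffold] Build Failed. No push access to specified image repository. "
    "Trying running with `--default-repo` flag.\n"
    "No container image is built.\n"
)

print(find_skaffold_failure(sample))
```

This assumes the failure reason always sits on a single line starting with `[Skaffold] Build Failed.`, which is all we have observed so far.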
- - - - UPDATE - - - - -
In case it helps, I found this in my logs: