I have a JSON configuration for my pipeline in Pachyderm:

{
    "pipeline": {
        "name": "mopng-beneficiary-v2"
    },
    "input": {
        "pfs": {
            "repo": "mopng_beneficiary_v2",
            "glob": "/*"
        }
    },
    "transform": {
        "cmd": ["python3", "/pclean_phlc9h6grzqdhm6sc0zrxjne_UdOgg.py /pfs/mopng_beneficiary_v2/euoEQHIwIQTe1wXtg46fFYok.csv /pfs/mopng_beneficiary_v2//Users/aviralsrivastava/Downloads/5Feb18_master_ujjwala_latlong_dist_dno_so_v7.csv /pfs/mopng_beneficiary_v2//Users/aviralsrivastava/Downloads/ppac_master_v3_mmi_enriched_with_sanity_check.csv /pfs/mopng_beneficiary_v2/Qc.csv"],
        "image": "mopng-beneficiary-v2-image"
    }
}

My Dockerfile is as follows:

FROM ubuntu:14.04

# Install opencv and matplotlib.
RUN apt-get update \
    && apt-get upgrade -y \
    && apt-get install -y unzip wget build-essential \
        cmake git pkg-config libswscale-dev \
        python3-dev python3-numpy python3-tk \
        libtbb2 libtbb-dev libjpeg-dev \
        libpng-dev libtiff-dev libjasper-dev \
        bpython python3-pip libfreetype6-dev \
    && apt-get clean \
    && rm -rf /var/lib/apt

RUN sudo pip3 install matplotlib
RUN sudo pip3 install pandas

# Add our own code.
ADD pclean.py /pclean.py

However, when I run the command to create the pipeline:

pachctl create-pipeline -f https://raw.githubusercontent.com/avisrivastava254084/learning-pachyderm/master/pipeline.json

These files exist in pfs:

pachctl put-file mopng_beneficiary_v2 master -f /Users/aviralsrivastava/Downloads/pclean_phlc9h6grzqdhm6sc0zrxjne_UdOgg.py
pachctl put-file mopng_beneficiary_v2 master -f /Users/aviralsrivastava/Downloads/5Feb18_master_ujjwala_latlong_dist_dno_so_v7.csv
pachctl put-file mopng_beneficiary_v2 master -f /Users/aviralsrivastava/Downloads/ppac_master_v3_mmi_enriched_with_sanity_check.csv
pachctl put-file mopng_beneficiary_v2 master -f /Users/aviralsrivastava/Downloads/euoEQHIwIQTe1wXtg46fFYok.csv

Notably, this is what I get from the logs command (pachctl get-logs --pipeline=mopng-beneficiary-v2):

container "user" in pod "pipeline-mopng-beneficiary-v2-v1-lnbjh" is waiting to start: trying and failing to pull image

1 Answer

As Matthew L Daniel commented, the image name looks odd because it has no prefix. By default, Pachyderm pulls Docker images from Docker Hub, and Docker Hub prefixes each image with the user who owns it (e.g. maths/mopng-beneficiary-v2-image).
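
To make the prefix point concrete, here is a minimal sketch that adds a Docker Hub user to the image field of a pipeline spec. The helper name qualify_image is made up for illustration, and "maths" is only the example user from above, not a confirmed account; substitute whichever Docker Hub user actually owns the image.

```python
import json


def qualify_image(spec: dict, registry_user: str) -> dict:
    """Return a copy of the spec whose image carries a Docker Hub user prefix.

    Hypothetical helper for illustration; it is not part of Pachyderm.
    """
    updated = json.loads(json.dumps(spec))  # deep copy via a JSON round trip
    image = updated["transform"]["image"]
    if "/" not in image:  # leave already-prefixed images untouched
        updated["transform"]["image"] = f"{registry_user}/{image}"
    return updated


# Trimmed-down version of the pipeline spec from the question.
spec = {
    "pipeline": {"name": "mopng-beneficiary-v2"},
    "transform": {"image": "mopng-beneficiary-v2-image"},
}

# "maths" is an assumed example user, taken from the answer text.
fixed = qualify_image(spec, "maths")
print(fixed["transform"]["image"])  # maths/mopng-beneficiary-v2-image
```

The prefixed name only helps if an image has actually been pushed to Docker Hub under that exact name, so the locally built image would need to be tagged and pushed the same way.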

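Also, I think you may need to rename your input repo so that it differs more from the pipeline name. Pachyderm normalizes repo names to satisfy Kubernetes naming requirements, and mopng-beneficiary-v2 and mopng_beneficiary_v2 may normalize to the same repo name (you may be seeing an error like repo already exists). Try renaming the input repo to mopng_beneficiary_input or similar.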
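
The suspected collision can be sketched as follows. The normalization rule here is an assumption chosen for illustration (lowercase, then replace anything outside a-z, 0-9 and "-" with "-", which matches what Kubernetes allows in names); it is not taken from Pachyderm's source, and k8s_style_name is a made-up helper.

```python
import re


def k8s_style_name(name: str) -> str:
    """Hypothetical normalization, NOT Pachyderm's actual implementation.

    Kubernetes names allow only lowercase alphanumerics and '-',
    so every other character is replaced with '-'.
    """
    return re.sub(r"[^a-z0-9-]", "-", name.lower())


pipeline_name = "mopng-beneficiary-v2"    # the pipeline's output repo shares this name
input_repo = "mopng_beneficiary_v2"       # current input repo
renamed_repo = "mopng_beneficiary_input"  # suggested replacement

# Under the assumed rule, the current names collapse to the same string.
print(k8s_style_name(pipeline_name) == k8s_style_name(input_repo))    # True
# The renamed input repo stays distinct.
print(k8s_style_name(pipeline_name) == k8s_style_name(renamed_repo))  # False
```

If the input repo is renamed, the "repo" field under "input" and the /pfs/... paths in "cmd" would need to change to match.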

answered 2019-09-25T15:23:57.743