Hey, I'm developing a package that generates TFX pipelines for training GPT-2 (see https://github.com/steven-mi/tfx-gpt2).

I'd like to know how to deploy my pipeline to a local Kubeflow installation. Is there an in-depth guide for doing this?

1 Answer

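I was working on this a few months ago, but got pulled away onto other things. I used the recipe below (not a script) to get KFP, TFX, and JupyterLab running on a Google Cloud VM, and IIRC I was able to deploy a TFX pipeline and run it. I used microk8s for the Kubernetes cluster. So it's a work in progress, but for what it's worth, maybe it will help: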

sudo apt-get remove docker docker-engine docker.io containerd runc
sudo apt-get update
sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io
sudo groupadd docker
sudo usermod -aG docker ${USER}

# K8s 1.14 is currently recommended for KFP
sudo snap install microk8s --channel=1.14 --classic
sudo snap alias microk8s.kubectl kubectl
sudo usermod -a -G microk8s $USER

(exit and log back in)

docker run -d -p 5000:5000 --restart=always --name registry registry:2

microk8s.enable dns dashboard storage
microk8s.enable kubeflow
export PIPELINE_VERSION=0.2.5
# Quote the kustomize URLs so the shell never treats the "?" as a glob character
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/base/crds?ref=$PIPELINE_VERSION"
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
kubectl apply -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION"
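
The `kubectl apply` steps above return before the KFP pods are actually up, so it can help to check pod readiness before moving on. The following is a minimal sketch, not part of the original recipe: it assumes the pods live in the `kubeflow` namespace (as the later `-n kubeflow` command suggests), and the helper names `not_ready_pods` and `kubeflow_pods` are hypothetical.

```python
import json
import subprocess


def not_ready_pods(pod_list):
    """Return names of pods that are neither fully ready nor completed.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    pending = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") == "Succeeded":
            continue  # completed one-off pods are fine
        containers = status.get("containerStatuses", [])
        # A pod without container statuses has not started yet.
        if not containers or not all(c.get("ready") for c in containers):
            pending.append(pod["metadata"]["name"])
    return pending


def kubeflow_pods(namespace="kubeflow"):
    """Fetch the pod list through kubectl (needs a working cluster)."""
    out = subprocess.check_output(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        universal_newlines=True,  # text mode; also works on Python 3.6
    )
    return json.loads(out)
```

On the VM, `not_ready_pods(kubeflow_pods())` returns an empty list once everything is up; any names it returns are pods worth inspecting with `kubectl describe pod`.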

sudo apt-get install python3-pip
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3.6 1
sudo update-alternatives  --set python /usr/bin/python3.6
sudo update-alternatives --install /usr/bin/pip pip /usr/bin/pip3 1
sudo update-alternatives  --set pip /usr/bin/pip3
pip install --upgrade pip

export PATH=$PATH:~/.local/bin
pip install notebook
pip install jupyterlab

<Make public IP address static>

jupyter notebook --generate-config
Set a password (Optional):
python
from notebook.auth import passwd; passwd()
(remember the password, and save the generated password)

vi ~/.jupyter/jupyter_notebook_config.py
Enable:
    c.NotebookApp.ip = '*'
    c.NotebookApp.open_browser = False
    c.NotebookApp.port = 3389 # for Pantheon (normally 8888)
    c.NotebookApp.password = 'sha:generated password above'
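
The value on the `c.NotebookApp.password` line is a salted hash string, not the plain password. As far as I understand it, older `notebook` releases emit it in the form `algorithm:salt:hexdigest` (for example starting with `sha1:`); other releases may use a different scheme. The stdlib-only sketch below only illustrates that layout. The helper names `legacy_passwd` and `check_passwd` are hypothetical, and for the real config you should paste whatever `passwd()` printed.

```python
import hashlib
import secrets


def legacy_passwd(passphrase, algorithm="sha1", salt_len=12):
    """Build an 'algorithm:salt:hexdigest' string (illustrative layout only)."""
    salt = secrets.token_hex(salt_len // 2)  # yields salt_len hex characters
    digest = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    return ":".join((algorithm, salt, digest))


def check_passwd(hashed, passphrase):
    """Verify a passphrase against a string built by legacy_passwd."""
    algorithm, salt, digest = hashed.split(":", 2)
    expected = hashlib.new(algorithm, (passphrase + salt).encode("utf-8")).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return secrets.compare_digest(expected, digest)
```

The point of the salt is that the same password hashes to a different string each time, so the config file never contains anything directly reusable as a login.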

pip install --no-cache-dir --upgrade tfx
git clone https://github.com/tensorflow/tfx.git
mkdir AIHub
cp tfx/docs/tutorials/tfx/template.ipynb AIHub
cd AIHub

(wait about 5-15 minutes)
kubectl describe configmap inverse-proxy-config -n kubeflow | grep googleusercontent.com
jupyter lab &
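
The "wait about 5-15 minutes" step above can be automated by polling the same configmap until a hostname shows up. This is a rough sketch under the assumption that the hostname ends in `googleusercontent.com`, as the `grep` in the recipe implies; the function names, the timeout, and the polling interval are my own choices, not something from the recipe.

```python
import re
import subprocess
import time

# Matches a hostname ending in googleusercontent.com (same filter as the grep).
HOST_RE = re.compile(r"[\w.-]+\.googleusercontent\.com")


def extract_proxy_hostname(text):
    """Return the first googleusercontent.com hostname in `text`, or None."""
    match = HOST_RE.search(text)
    return match.group(0) if match else None


def wait_for_proxy(timeout_s=900, interval_s=30):
    """Poll the inverse-proxy configmap until a hostname appears or time runs out."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = subprocess.run(
            ["kubectl", "describe", "configmap", "inverse-proxy-config", "-n", "kubeflow"],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,  # the configmap may not exist yet
            universal_newlines=True,
        )
        host = extract_proxy_hostname(result.stdout)
        if host:
            return host
        time.sleep(interval_s)
    return None
```

Calling `wait_for_proxy()` on the VM returns the hostname once it is published, or `None` if the 15-minute default timeout passes first.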
Answered 2020-08-20T21:45:29.253