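I'm trying to run the Kubeflow samples (Pipelines, PyTorchJob, etc.), but the pods get stuck at ContainerCreating.

To debug this, I'd like to look at the dockershim and Docker logs. Are there log files for them?

The sample code is here: https://github.com/kubeflow/pipelines/tree/master/samples/core/helloworld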

$ kubectl create serviceaccount pipeline-runner
$ python3 hello_world.py
$ kubectl create -f hello_world.py.yaml
$ kubectl describe workflow.argoproj.io/my-first-pipeline-wgkg2
Name:         my-first-pipeline-wgkg2
Namespace:    default
Labels:       workflows.argoproj.io/phase=Running
Annotations:  pipelines.kubeflow.org/pipeline_spec: {"description": "A hello world pipeline.", "name": "My first pipeline"}
API Version:  argoproj.io/v1alpha1
Kind:         Workflow
Metadata:
  Creation Timestamp:  2020-04-22T00:11:41Z
  Generate Name:       my-first-pipeline-
  Generation:          3
  Resource Version:    23748
  Self Link:           /apis/argoproj.io/v1alpha1/namespaces/default/workflows/my-first-pipeline-wgkg2
  UID:                 50acafe4-2254-4dea-865d-7ec03496e523
Spec:
  Arguments:
  Entrypoint:            my-first-pipeline
  Service Account Name:  pipeline-runner
  Templates:
    Container:
      Args:
        echo "hello world"
      Command:
        sh
        -c
      Image:  library/bash:4.4.23
      Name:
      Resources:
    Inputs:
    Metadata:
    Name:  echo
    Outputs:
    Dag:
      Tasks:
        Arguments:
        Name:      echo
        Template:  echo
    Inputs:
    Metadata:
    Name:  my-first-pipeline
    Outputs:
Status:
  Finished At:  <nil>
  Nodes:
    my-first-pipeline-wgkg2:
      Children:
        my-first-pipeline-wgkg2-3423630397
      Display Name:   my-first-pipeline-wgkg2
      Finished At:    <nil>
      Id:             my-first-pipeline-wgkg2
      Name:           my-first-pipeline-wgkg2
      Phase:          Running
      Started At:     2020-04-22T00:11:41Z
      Template Name:  my-first-pipeline
      Type:           DAG
    my-first-pipeline-wgkg2-3423630397:
      Boundary ID:    my-first-pipeline-wgkg2
      Display Name:   echo
      Finished At:    <nil>
      Id:             my-first-pipeline-wgkg2-3423630397
      Message:        ContainerCreating
      Name:           my-first-pipeline-wgkg2.echo
      Phase:          Pending
      Started At:     2020-04-22T00:11:41Z
      Template Name:  echo
      Type:           Pod
  Phase:              Running
  Started At:         2020-04-22T00:11:41Z
Events:               <none>

The kubectl logs output is as follows:

$ kubectl logs my-first-pipeline-wgkg2-3423630397 -c wait
Error from server (BadRequest): container "wait" in pod "my-first-pipeline-wgkg2-3423630397" is waiting to start: ContainerCreating
$ kubectl logs my-first-pipeline-wgkg2-3423630397 -c main
Error from server (BadRequest): container "main" in pod "my-first-pipeline-wgkg2-3423630397" is waiting to start: ContainerCreating
2 Answers

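Once you are on the worker node, you can run docker ps to find the container ID, then view that container's logs with docker logs containerid.

Also check the kubelet logs on the worker node for any problems by running journalctl -u kubelet.service -f on that node.

answered 2020-04-22T02:49:51.183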
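
The node-side steps above can be sketched as two small shell helpers. This is a minimal sketch, not a definitive procedure: the function names pod_containers and node_logs are hypothetical, and it assumes the node uses Docker as its container runtime with kubelet and Docker running as systemd units (kubelet.service and docker.service). Under that setup dockershim runs inside the kubelet, so its messages should appear in the kubelet journal rather than in a separate file.

```shell
# Hypothetical helpers; run them on the worker node that hosts the pod.

# List every container (running or not) whose name contains the pod name.
# Assumption: the runtime is Docker, which embeds the pod name in container names.
pod_containers() {
  docker ps -a --filter "name=$1" --format '{{.ID}} {{.Names}} {{.Status}}'
}

# Print recent journal entries for a systemd unit (default window: 15 minutes).
# Assumption: kubelet and Docker are managed by systemd on this node.
node_logs() {
  journalctl -u "$1" --since "${2:-15 minutes ago}" --no-pager
}

# Example usage (left as comments so nothing runs when the file is sourced):
#   pod_containers my-first-pipeline-wgkg2-3423630397
#   docker logs <containerid>
#   node_logs kubelet.service | grep my-first-pipeline-wgkg2-3423630397
#   node_logs docker.service
```

If docker ps -a shows no container at all for the pod, the kubelet journal is likely the more useful place to look, since the failure would then be happening before any container was created.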

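Usually, the information you are looking for is in the Events section of the kubectl describe output. In your output, however, that section appears to be empty. Maybe your pod has not been scheduled yet?

answered 2020-07-11T06:00:08.993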
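
One thing that may be worth checking: the describe output in the question is for the Workflow object, not for the pod itself, so the pod's own events could still contain something. A minimal sketch of how to look, assuming the pod lives in the default namespace as shown in the question (the helper names pod_describe, pod_events and pod_node are hypothetical):

```shell
# Hypothetical helpers; the namespace argument is optional and defaults to "default".

# Describe the pod itself; its Events section is at the end of the output.
pod_describe() {
  kubectl describe pod "$1" --namespace "${2:-default}"
}

# List only the events that refer to this pod, oldest first.
pod_events() {
  kubectl get events --namespace "${2:-default}" \
    --field-selector "involvedObject.name=$1" \
    --sort-by=.lastTimestamp
}

# Show which node the pod was assigned to (the NODE column of the wide output).
pod_node() {
  kubectl get pod "$1" --namespace "${2:-default}" -o wide
}

# Example usage (left as comments so nothing runs when the file is sourced):
#   pod_describe my-first-pipeline-wgkg2-3423630397
#   pod_events my-first-pipeline-wgkg2-3423630397
#   pod_node my-first-pipeline-wgkg2-3423630397
```

A NODE value in the wide output means the pod has been scheduled, and that node is where the kubelet and container runtime logs for it would be found.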