I created a cluster on Azure Container Service with the DC/OS orchestrator and the following characteristics:
{
  "agentPoolProfiles": [
    {
      "count": 2,
      "dnsPrefix": "XXXagents",
      "fqdn": "XXXagents.westeurope.cloudapp.azure.com",
      "name": "agentpools",
      "vmSize": "Standard_D2_v2"
    }
  ],
  "customProfile": null,
  "diagnosticsProfile": {
    "vmDiagnostics": {
      "enabled": true,
      "storageUri": "https://5ygr5x3zcagpkdiag0.blob.core.windows.net/"
    }
  },
  "id": "/subscriptions/092d03c2-6d61-451b-9970-9edaaec1e45c/resourceGroups/XXX/providers/Microsoft.ContainerService/containerServices/findmecontainer",
  "linuxProfile": {
    "adminUsername": "stijn",
    "ssh": {
      "publicKeys": [
        {
          "keyData": "XXX"
        }
      ]
    }
  },
  "location": "westeurope",
  "masterProfile": {
    "count": 1,
    "dnsPrefix": "XXXmgmt",
    "fqdn": "XXXmgmt.westeurope.cloudapp.azure.com"
  },
  "name": "XXX",
  "orchestratorProfile": {
    "orchestratorType": "DCOS"
  },
  "provisioningState": "Succeeded",
  "resourceGroup": "XXX",
  "servicePrincipalProfile": null,
  "tags": null,
  "type": "Microsoft.ContainerService/ContainerServices",
  "windowsProfile": null
}
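For context, the agent pool's total compute can be worked out from the profile above. This is a rough sketch; it assumes the documented size of a Standard_D2_v2 VM (2 vCPUs, 7 GiB RAM):

```python
# Rough capacity check for the agent pool above.
# Assumption: Standard_D2_v2 provides 2 vCPUs and 7 GiB RAM (per Azure VM size docs).
vm_vcpus = 2
vm_mem_gib = 7
agent_count = 2  # "count" in agentPoolProfiles

total_vcpus = agent_count * vm_vcpus      # total CPUs Mesos can offer to frameworks
total_mem_gib = agent_count * vm_mem_gib  # total agent memory

print(total_vcpus, total_mem_gib)  # → 4 14
```

So the two agents together can offer at most 4 CPUs, which is worth keeping in mind when reading the Spark settings in the Marathon app below.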
Following https://github.com/jshenguru/dcos-zeppelin, I installed Zeppelin 0.7.0 and Spark 2.1.0, setting the following values in the options file (zeppelin-0.7.0.json):
{
  "volumes": null,
  "id": "/zeppelin",
  "cmd": "sed \"s#<value>8080</value>#<value>$PORT0</value>#\" < conf/zeppelin-site.xml.template > conf/zeppelin-site.xml && sed -i \"s#<value>-1</value>#<value>$PORT1</value>#\" conf/zeppelin-site.xml && SPARK_HOME_TGZ=$(ls ${MESOS_SANDBOX}/spark-*.tgz) SPARK_HOME=${SPARK_HOME_TGZ%.tgz} bin/zeppelin.sh start",
  "args": null,
  "user": null,
  "env": {
    "SPARK_MESOS_EXECUTOR_DOCKER_IMAGE": "mesosphere/spark:1.0.7-2.1.0-hadoop-2.7",
    "SPARK_CORES_MAX": "16",
    "SPARK_EXECUTOR_MEMORY": "20g",
    "ZEPPELIN_JAVA_OPTS": "-Dspark.mesos.coarse=true -Dspark.mesos.executor.home=/opt/spark/dist",
    "ZEPPELIN_INTP_JAVA_OPTS": "-Dspark.mesos.coarse=true -Dspark.mesos.executor.home=/opt/spark/dist"
  },
  "instances": 2,
  "cpus": 2,
  "mem": 4000,
  "disk": 30000,
  "gpus": 0,
  "executor": null,
  "constraints": null,
  "fetch": [
    {
      "uri": "https://downloads.mesosphere.io/spark/assets/spark-2.1.0-bin-2.7.tgz"
    }
  ],
  "storeUrls": null,
  "backoffSeconds": 1,
  "backoffFactor": 1.15,
  "maxLaunchDelaySeconds": 3600,
  "container": {
    "docker": {
      "image": "jshenguru/dcos-zeppelin:0.7.0",
      "forcePullImage": false,
      "privileged": false,
      "network": "HOST"
    }
  },
  ...
}
When I run a notebook in Zeppelin, almost no CPU is used to execute the Py(Spark) jobs (see the image in the appendix).
Can anyone explain what the problem might be?