0

设想

我创建了一个具有多个节点的池(基本映像是 Ubuntu Server 16.04),并提供以下启动命令: /bin/bash -c 'set -o pipefail; export DEBIAN_FRONTEND=noninteractive ; sudo -E apt update ; sudo -E apt upgrade -y ; sudo -E apt-get install -y --no-install-recommends apt-transport-https curl software-properties-common ; curl -fsSL "https://sks-keyservers.net/pks/lookup?op=get&search=0xee6d536cf7dc86e2d7d56f59a178ac6c6238f52e" | sudo -E apt-key add - ; sudo -E apt-add-repository "deb https://packages.docker.com/1.13/apt/repo/ ubuntu-$(lsb_release -cs) main" ; sudo -E apt-get update ; sudo -E apt-get install -y docker-engine ; sudo usermod -a -G docker $USER ; sudo -E service docker start ; journalctl -xe; wait'

该命令服务于安装 Docker 引擎的唯一目的。另请注意,我删除了该选项set -e,以便能够运行命令journalctl -xe并捕获以下错误。

错误

在创建上述池时,某些节点会导致启动任务失败。该行为似乎是随机的,因为并非总是一个节点失败,并且如前所述,其他节点也不会失败。该行为不取决于节点的大小(我尝试了 D2_v3 和 NC6)。

这是的输出journalctl -xe

Oct 12 09:19:40 7d8bb094c57c400582f6031d59f1630000000A systemd[1]: Listening on Docker Socket for the API.
-- Subject: Unit docker.socket has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit docker.socket has finished starting up.
-- 
-- The start-up result is done.
Oct 12 09:19:40 7d8bb094c57c400582f6031d59f1630000000A systemd[1]: Starting Docker Application Container Engine...
-- Subject: Unit docker.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit docker.service has begun starting up.
Oct 12 09:19:40 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:40.605332263Z" level=info msg="libcontainerd: new containerd process, pid: 24492"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.608293321Z" level=info msg="[graphdriver] using prior storage driver: aufs"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.626089049Z" level=info msg="Graph migration to content-addressability took 0.00 seconds"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.626378756Z" level=warning msg="Your kernel does not support swap memory limit"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.626558660Z" level=warning msg="Your kernel does not support cgroup rt period"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.626698864Z" level=warning msg="Your kernel does not support cgroup rt runtime"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.626834867Z" level=warning msg="Your kernel does not support cgroup blkio weight"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.626970070Z" level=warning msg="Your kernel does not support cgroup blkio weight_device"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.627384080Z" level=info msg="Loading containers: start."
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.630900065Z" level=info msg="Firewalld running: false"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.661877309Z" level=info msg="Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address"
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A kernel: IPv6: ADDRCONF(NETDEV_UP): docker0: link is not ready
Oct 12 09:19:41 7d8bb094c57c400582f6031d59f1630000000A dockerd[24473]: time="2017-10-12T09:19:41.996853856Z" level=info msg="Loading containers: done."
Oct 12 09:19:42 7d8bb094c57c400582f6031d59f1630000000A kernel: aufs au_opts_verify:1585:dockerd[24490]: dirperm1 breaks the protection by the permission bits on the lower branch
Oct 12 09:19:45 7d8bb094c57c400582f6031d59f1630000000A systemd[1]: docker.service: Main process exited, code=killed, status=11/SEGV
Oct 12 09:19:45 7d8bb094c57c400582f6031d59f1630000000A systemd[1]: Failed to start Docker Application Container Engine.
-- Subject: Unit docker.service has failed
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit docker.service has failed.
-- 
-- The result is failed.
Oct 12 09:19:45 7d8bb094c57c400582f6031d59f1630000000A systemd[1]: docker.service: Unit entered failed state.
Oct 12 09:19:45 7d8bb094c57c400582f6031d59f1630000000A systemd[1]: docker.service: Failed with result 'signal'.

在创建网络接口时似乎出了点问题,但我不确定是什么,尤其是如何修复它。

4

1 回答 1

0

更新答案,2017-10-18:

该问题已在latestCanonical UbuntuServer 16.04-LTS 的平台映像中得到解决,并且可以再次与 Go/Docker 一起使用。

原答案:

你的代码没有问题。Canonical UbuntuServer 16.04-LTS平台映像(此时也是)和 Go/Docker存在问题。201709190latest

修复问题时,将要部署的映像版本设置为201708151临时。

顺便说一句:如果您使用的是 Docker 和 Azure Batch,您应该查看提供此功能的Batch Shipyard 。(完全披露:我是此代码的贡献者。)

于 2017-10-12T15:16:33.987 回答