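I am running the Datadog agent as a container on my AWS CoreOS instances. To automate this, I wrote a systemd unit that enables and runs the dd-agent container on each instance. The Docker container itself runs without any problems, but no metrics are being sent to Datadog.
Here is my systemd unit file: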
[Unit]
Description=Sample Datadog Agent
After=docker.service
Requires=docker.service
[Service]
TimeoutStartSec=0
Restart=on-failure
Environment=API_KEY={my-api-key}
Environment=ENV=sample_env
ExecStartPre=-/usr/bin/docker kill datadog
ExecStartPre=-/usr/bin/docker rm -f datadog
ExecStartPre=-/usr/bin/docker pull datadog/docker-dd-agent:11.2.583
ExecStart=/usr/bin/docker run --name datadog \
-v /var/run/docker.sock:/var/run/docker.sock:ro \
-v /proc/:/host/proc/:ro \
-v /cgroup/:/host/sys/fs/cgroup:ro \
-e API_KEY=$API_KEY \
-e TAGS=$ENV \
datadog/docker-dd-agent:11.2.583
ExecStop=/usr/bin/docker stop datadog
[Install]
WantedBy=multi-user.target
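Edit - adding more information
Initially, when I ran this on a single CoreOS instance, I could see that instance's Docker-related metrics in the Datadog dashboard. I then enabled it on multiple CoreOS instances in AWS. Since then, none of the metrics for the CoreOS instances or the Docker containers are visible.
Edit - adding Docker logs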
2017-09-14 07:48:47,497 CRIT Supervisor running as root (no user in config file)
2017-09-14 07:48:47,528 INFO RPC interface 'supervisor' initialized
2017-09-14 07:48:47,528 CRIT Server 'unix_http_server' running without any HTTP authentication checking
2017-09-14 07:48:47,528 INFO supervisord started with pid 1
2017-09-14 07:48:48,530 INFO spawned: 'dogstatsd' with pid 11
2017-09-14 07:48:48,531 INFO spawned: 'go-metro' with pid 12
2017-09-14 07:48:48,532 INFO spawned: 'forwarder' with pid 13
2017-09-14 07:48:48,533 INFO spawned: 'collector' with pid 14
2017-09-14 07:48:48,539 INFO spawned: 'jmxfetch' with pid 15
2017-09-14 07:48:50,810 INFO success: go-metro entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2017-09-14 07:48:51,811 INFO success: jmxfetch entered RUNNING state, process has stayed up for > than 3 seconds (startsecs)
2017-09-14 07:48:53,419 INFO exited: jmxfetch (exit status 0; expected)
2017-09-14 07:48:53,780 INFO success: dogstatsd entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2017-09-14 07:48:53,780 INFO success: forwarder entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2017-09-14 07:48:53,780 INFO success: collector entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2017-09-14 07:48:53,780 INFO exited: go-metro (exit status 0; expected)
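To make the log above easier to read at a glance, here is a minimal sketch that reduces supervisord output to the last observed state of each program. It assumes the log lines have exactly the layout shown above; the function name `summarize_supervisor_log` is hypothetical and not part of dd-agent or supervisord.

```shell
#!/bin/sh
# Hypothetical helper: reads supervisord log lines on stdin and prints
# "<program> <last observed state>" for every program that was spawned.
# Assumes the field layout seen in the log above (date, time, level, event, name).
summarize_supervisor_log() {
    awk -v q="'" '
        # spawned: the program name is quoted, so strip the single quotes
        / INFO spawned: / { name = $5; gsub(q, "", name); state[name] = "spawned" }
        # success: the program entered the RUNNING state
        / INFO success: / { state[$5] = "running" }
        # exited: the program stopped (expected or not)
        / INFO exited: /  { state[$5] = "exited" }
        END { for (n in state) print n, state[n] }
    ' | sort
}
```

It can be fed from the container directly, for example `docker logs datadog 2>&1 | summarize_supervisor_log`. Note that this only reflects what supervisord reports about process state; it does not show whether the forwarder actually delivered anything to Datadog.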