I am running a Kubernetes cluster via Docker on a local Ubuntu (Trusty) machine.

Since I use Vagrant to create the Ubuntu VM, I had to slightly modify the docker run command from the official Kubernetes guide:

docker run -d \
    --volume=/:/rootfs:ro \
    --volume=/sys:/sys:ro \
    --volume=/var/lib/docker/:/var/lib/docker:rw \
    --volume=/var/lib/kubelet/:/var/lib/kubelet:rw \
    --volume=/var/run:/var/run:rw \
    --net=host \
    --privileged=true \
    --pid=host \
    gcr.io/google_containers/hyperkube:v1.3.0 \
    /hyperkube kubelet \
        --allow-privileged=true \
        --api-servers=http://localhost:8080 \
        --v=2 \
        --address= \
        --enable-server \
        --hostname-override= \
        --config=/etc/kubernetes/manifests-multi \
        --containerized \
        --cluster-dns= \

In addition, running a reverse proxy lets me access the cluster's services from a browser outside the VM:

docker run -d --net=host --privileged gcr.io/google_containers/hyperkube:v1.3.0 \
/hyperkube proxy --master= --v=2

These steps work fine, and eventually I can access the Kubernetes UI in my browser.

vagrant@trusty-vm:~$ kubectl cluster-info
Kubernetes master is running at http://localhost:8080
KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

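Now I would like to run Heapster in this Kubernetes cluster with an InfluxDB backend and a Grafana UI, as described in this guide. To do so, I cloned the Heapster repository and configured grafana-service.yaml to use an external IP by adding type: NodePort: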

apiVersion: v1
kind: Service
metadata:
  labels:
    kubernetes.io/cluster-service: 'true'
    kubernetes.io/name: monitoring-grafana
  name: monitoring-grafana
  namespace: kube-system
spec:
  # In a production setup, we recommend accessing Grafana through an external Loadbalancer
  # or through a public IP.
  type: NodePort
  ports:
  - port: 80
    targetPort: 3000
  selector:
    name: influxGrafana

Creating the services, RCs, etc.:

vagrant@trusty-vm:~/heapster$ kubectl create -f deploy/kube-config/influxdb/
You have exposed your service on an external port on all nodes in your
cluster.  If you want to expose this service to the external internet, you may
need to set up firewall rules for the service port(s) (tcp:30593) to serve traffic.

See http://releases.k8s.io/release-1.3/docs/user-guide/services-firewalls.md for more details.
service "monitoring-grafana" created
replicationcontroller "heapster" created
service "heapster" created
replicationcontroller "influxdb-grafana" created
service "monitoring-influxdb" created

vagrant@trusty-vm:~/heapster$ kubectl cluster-info
Kubernetes master is running at http://localhost:8080
Heapster is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/heapster
KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard
monitoring-grafana is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana

vagrant@trusty-vm:~/heapster$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                READY     STATUS              RESTARTS   AGE
kube-system   heapster-y2yci                      1/1       Running             0          32m
kube-system   influxdb-grafana-6udas              2/2       Running             0          32m
kube-system   k8s-master-                         4/4       Running             0          58m
kube-system   k8s-proxy-                          1/1       Running             0          58m
kube-system   kube-addon-manager-                 2/2       Running             0          57m
kube-system   kube-dns-v17-y4cwh                  3/3       Running             0          58m
kube-system   kubernetes-dashboard-v1.1.0-bnbnp   1/1       Running             0          58m

vagrant@trusty-vm:~/heapster$ kubectl get svc --all-namespaces
NAMESPACE     NAME                   CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
default       kubernetes                          <none>        443/TCP             18m
kube-system   heapster                            <none>        80/TCP              3s
kube-system   kube-dns                            <none>        53/UDP,53/TCP       18m
kube-system   kubernetes-dashboard                <none>        80/TCP              18m
kube-system   monitoring-grafana                  <nodes>       80/TCP              3s
kube-system   monitoring-influxdb                 <none>        8083/TCP,8086/TCP   16m

As you can see, everything seems to run smoothly, and I can also access Grafana's UI in my browser at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/.

However, after about 1 minute, both the Heapster and Grafana endpoints disappear from kubectl cluster-info.

vagrant@trusty-vm:~/heapster$ kubectl cluster-info
Kubernetes master is running at http://localhost:8080
KubeDNS is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at http://localhost:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

The API server now answers requests to the Grafana proxy URL with:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "endpoints \"monitoring-grafana\" not found",
  "reason": "NotFound",
  "details": {
    "name": "monitoring-grafana",
    "kind": "endpoints"
  },
  "code": 404
}

The pods are still up and running ...

vagrant@trusty-vm:~/heapster$ kubectl get pods --all-namespaces
NAMESPACE     NAME                                READY     STATUS              RESTARTS   AGE
kube-system   heapster-y2yci                      1/1       Running             0          32m
kube-system   influxdb-grafana-6udas              2/2       Running             0          32m
kube-system   k8s-master-                         4/4       Running             0          58m
kube-system   k8s-proxy-                          1/1       Running             0          58m
kube-system   kube-addon-manager-                 2/2       Running             0          57m
kube-system   kube-dns-v17-y4cwh                  3/3       Running             0          58m
kube-system   kubernetes-dashboard-v1.1.0-bnbnp   1/1       Running             0          58m

... but the Heapster and Grafana services have disappeared:

vagrant@trusty-vm:~/heapster$ kubectl get svc --all-namespaces
NAMESPACE     NAME                   CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
default       kubernetes                          <none>        443/TCP             19m
kube-system   kube-dns                            <none>        53/UDP,53/TCP       19m
kube-system   kubernetes-dashboard                <none>        80/TCP              19m
kube-system   monitoring-influxdb                 <none>        8083/TCP,8086/TCP   17m

While checking the output of kubectl cluster-info dump, I found the following errors:

I0713 09:31:09.088567       1 proxier.go:427] Adding new service "kube-system/monitoring-grafana:" at
E0713 09:31:09.273385       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
I0713 09:31:09.395280       1 proxier.go:427] Adding new service "kube-system/heapster:" at
E0713 09:31:09.466306       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
I0713 09:31:09.480468       1 proxier.go:502] Setting endpoints for "kube-system/monitoring-grafana:" to []
E0713 09:31:09.519698       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
I0713 09:31:09.532026       1 proxier.go:502] Setting endpoints for "kube-system/heapster:" to []
E0713 09:31:09.558527       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
E0713 09:31:17.249001       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:31:22.252280       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:31:27.257895       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:31:31.126035       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
E0713 09:31:32.264430       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:31:37.265109       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:31:42.269035       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:31:47.270950       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:31:52.272354       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:31:57.273424       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
E0713 09:32:01.153168       1 proxier.go:887] can't open "nodePort for kube-system/monitoring-grafana:" (:30593/tcp), skipping this nodePort: listen tcp :30593: bind: address alread$
E0713 09:32:02.276318       1 server.go:294] Starting health server failed: listen tcp bind: address already in use
I0713 09:32:06.105878       1 proxier.go:447] Removing service "kube-system/monitoring-grafana:"
I0713 09:32:07.175025       1 proxier.go:447] Removing service "kube-system/heapster:"
I0713 09:32:07.210270       1 proxier.go:517] Removing endpoints for "kube-system/monitoring-grafana:"
I0713 09:32:07.249824       1 proxier.go:517] Removing endpoints for "kube-system/heapster:"

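Apparently, the services and endpoints of Heapster and Grafana are deleted because the nodePort is already in use. I did not specify a nodePort in grafana-service.yaml, which means Kubernetes should be free to pick one that is not already in use, so how can this be an error? Also, is there a way to fix it?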

OS: Ubuntu 14.04.4 LTS (Trusty) | Kubernetes: v1.3.0 | Docker: v1.11.2


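In the grafana-service.yaml file (and probably also the heapster-service.yaml file) you have the line kubernetes.io/cluster-service: 'true'. This label means that the service will be managed by the addon manager. When the addon manager runs its periodic check, it sees that no grafana/heapster service is defined in /etc/kubernetes/addons and removes those services.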
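To see which manifests carry that label before creating anything, a small check along these lines may help. This is only a sketch: find_cluster_service_labels is a made-up helper name, and it assumes the manifests sit in a local directory such as deploy/kube-config/influxdb/ from the question.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of kubectl): list every manifest line
# that sets the cluster-service label, with file name and line number.
find_cluster_service_labels() {
    local dir="$1"
    # -r: search the directory recursively, -n: print line numbers.
    # Prints nothing and returns non-zero when no file carries the label.
    grep -rn "kubernetes.io/cluster-service" "$dir"
}
```

For example, running find_cluster_service_labels deploy/kube-config/influxdb/ from the Heapster checkout should list grafana-service.yaml as long as it still contains the label.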


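There are two ways to work around this:

  1. Change the label to kubernetes.io/cluster-service: 'false'
  2. Move the controller and service yaml files into /etc/kubernetes/addons on the master node (or wherever the addon manager is configured to look for yaml files).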
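The first option could be scripted roughly as follows. Treat it as a sketch under assumptions: disable_cluster_service_label is a hypothetical helper name, it only rewrites the single-quoted 'true' form shown in the question, and it edits the files in place, so run it on a copy if you want to keep the originals.

```shell
#!/usr/bin/env bash
# Hypothetical helper: flip the cluster-service label from 'true' to 'false'
# in every .yaml file directly inside the given directory.
disable_cluster_service_label() {
    local dir="$1" file
    for file in "$dir"/*.yaml; do
        [ -e "$file" ] || continue   # glob matched nothing, skip
        # Rewrite only the label line; every other line is copied unchanged.
        sed "s|\(kubernetes.io/cluster-service:\) *'true'|\1 'false'|" "$file" \
            > "$file.tmp" && mv "$file.tmp" "$file"
    done
}
```

After running disable_cluster_service_label deploy/kube-config/influxdb, the resources can be created again with the same kubectl create -f command used in the question.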


answered 2016-10-04T15:40:15.320

Same issue in our environment. K8S version = 1.3.4, Docker 1.12, Heapster is the latest from the master branch.

answered 2016-08-09T12:26:00.330