I inherited a broken Kubernetes cluster and now need to bring it back up somehow. I don't know how it was originally created, but I assume it was set up with kubeadm. By renewing the certificates with kubeadm alpha certs renew all I got one node partially up, but I cannot get any further: my etcd and kube-apiserver are not working (they start and then stop). netstat output right after startup:
kube-api and etcd are listening. A few minutes later, nothing is listening on any port for either etcd or the api-server:
no kube-api and no etcd
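The netstat screenshots are not attached here; roughly, these are the steps and the check I ran (the ports are assumptions taken from the logs further down: 2379/2380 for etcd, 6443 for kube-apiserver):

# renew the control-plane certificates (already done)
kubeadm alpha certs renew all
# right after restarting the kubelet, etcd and kube-apiserver show up
sudo netstat -tlpn | grep -E ':(2379|2380|6443)'
# a few minutes later the same command prints nothing for those ports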
docker ps -a shows that both containers start and then exit within about a minute:
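The screenshot is not included; what I ran to collect the container state and the logs below was roughly this (the name filters are assumptions based on the usual kubeadm container naming):

# list the exited control-plane containers
docker ps -a | grep -E 'etcd|kube-apiserver'
# dump the last run of each dead container
docker logs $(docker ps -a -q --filter name=k8s_etcd | head -n 1)
docker logs $(docker ps -a -q --filter name=k8s_kube-apiserver | head -n 1)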
docker logs (etcd):
2020-11-19 15:10:01.351661 I | etcdmain: etcd Version: 3.3.10
2020-11-19 15:10:01.351731 I | etcdmain: Git SHA: 27fc7e2
2020-11-19 15:10:01.351735 I | etcdmain: Go Version: go1.10.4
2020-11-19 15:10:01.351738 I | etcdmain: Go OS/Arch: linux/amd64
2020-11-19 15:10:01.351742 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2020-11-19 15:10:01.351795 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2020-11-19 15:10:01.351815 I | embed: peerTLS: cert = /etc/kubernetes/pki/etcd/peer.crt, key = /etc/kubernetes/pki/etcd/peer.key, ca = , trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file =
2020-11-19 15:10:01.352371 I | embed: listening for peers on https://192.168.1.3:2380
2020-11-19 15:10:01.352404 I | embed: listening for client requests on 127.0.0.1:2379
2020-11-19 15:10:01.352426 I | embed: listening for client requests on 192.168.1.3:2379
2020-11-19 15:10:01.360515 W | snap: skipped unexpected non snapshot file tmp527778953
2020-11-19 15:10:01.362871 I | etcdserver: recovered store from snapshot at index 104718355
2020-11-19 15:10:01.364214 I | mvcc: restore compact to 87110378
2020-11-19 15:10:01.376240 I | etcdserver: name = k8s-master-01
2020-11-19 15:10:01.376263 I | etcdserver: data dir = /var/lib/etcd
2020-11-19 15:10:01.376269 I | etcdserver: member dir = /var/lib/etcd/member
2020-11-19 15:10:01.376272 I | etcdserver: heartbeat = 100ms
2020-11-19 15:10:01.376274 I | etcdserver: election = 1000ms
2020-11-19 15:10:01.376277 I | etcdserver: snapshot count = 10000
2020-11-19 15:10:01.376286 I | etcdserver: advertise client URLs = https://192.168.1.3:2379
2020-11-19 15:10:01.448684 I | etcdserver: restarting member 361c924cbd55a81 in cluster 7e3c896b15fbe02d at commit index 104724993
2020-11-19 15:10:01.448967 I | raft: 361c924cbd55a81 became follower at term 74097
2020-11-19 15:10:01.449001 I | raft: newRaft 361c924cbd55a81 [peers: [361c924cbd55a81,dad85d000dfebf92,e6cf4fe3e32b8396], term: 74097, commit: 104724993, applied: 104718355, lastindex: 104724995, lastterm: 45466]
2020-11-19 15:10:01.449122 I | etcdserver/api: enabled capabilities for version 3.3
2020-11-19 15:10:01.449139 I | etcdserver/membership: added member 361c924cbd55a81 [https://192.168.1.3:2380] to cluster 7e3c896b15fbe02d from store
2020-11-19 15:10:01.449143 I | etcdserver/membership: added member dad85d000dfebf92 [https://192.168.1.4:2380] to cluster 7e3c896b15fbe02d from store
2020-11-19 15:10:01.449146 I | etcdserver/membership: added member e6cf4fe3e32b8396 [https://192.168.1.5:2380] to cluster 7e3c896b15fbe02d from store
2020-11-19 15:10:01.449150 I | etcdserver/membership: set the cluster version to 3.3 from store
2020-11-19 15:10:01.450999 I | mvcc: restore compact to 87110378
2020-11-19 15:10:01.462560 W | auth: simple token is not cryptographically signed
2020-11-19 15:10:01.465418 I | rafthttp: starting peer dad85d000dfebf92...
2020-11-19 15:10:01.465478 I | rafthttp: started HTTP pipelining with peer dad85d000dfebf92
2020-11-19 15:10:01.465710 I | rafthttp: started streaming with peer dad85d000dfebf92 (writer)
2020-11-19 15:10:01.465804 I | rafthttp: started streaming with peer dad85d000dfebf92 (writer)
2020-11-19 15:10:01.465993 I | rafthttp: started peer dad85d000dfebf92
2020-11-19 15:10:01.466018 I | rafthttp: started streaming with peer dad85d000dfebf92 (stream Message reader)
2020-11-19 15:10:01.466035 I | rafthttp: added peer dad85d000dfebf92
2020-11-19 15:10:01.466050 I | rafthttp: starting peer e6cf4fe3e32b8396...
2020-11-19 15:10:01.466058 I | rafthttp: started HTTP pipelining with peer e6cf4fe3e32b8396
2020-11-19 15:10:01.466067 I | rafthttp: started streaming with peer dad85d000dfebf92 (stream MsgApp v2 reader)
2020-11-19 15:10:01.466308 I | rafthttp: started streaming with peer e6cf4fe3e32b8396 (writer)
2020-11-19 15:10:01.466483 I | rafthttp: started streaming with peer e6cf4fe3e32b8396 (writer)
2020-11-19 15:10:01.466637 I | rafthttp: started peer e6cf4fe3e32b8396
2020-11-19 15:10:01.466650 I | rafthttp: started streaming with peer e6cf4fe3e32b8396 (stream Message reader)
2020-11-19 15:10:01.466656 I | rafthttp: added peer e6cf4fe3e32b8396
2020-11-19 15:10:01.466669 I | etcdserver: starting server... [version: 3.3.10, cluster version: 3.3]
2020-11-19 15:10:01.466892 I | rafthttp: started streaming with peer e6cf4fe3e32b8396 (stream MsgApp v2 reader)
2020-11-19 15:10:01.469431 I | embed: ClientTLS: cert = /etc/kubernetes/pki/etcd/server.crt, key = /etc/kubernetes/pki/etcd/server.key, ca = , trusted-ca = /etc/kubernetes/pki/etcd/ca.crt, client-cert-auth = true, crl-file =
2020-11-19 15:10:02.949569 I | raft: 361c924cbd55a81 is starting a new election at term 74097
2020-11-19 15:10:02.949607 I | raft: 361c924cbd55a81 became candidate at term 74098
2020-11-19 15:10:02.949631 I | raft: 361c924cbd55a81 received MsgVoteResp from 361c924cbd55a81 at term 74098
2020-11-19 15:10:02.949640 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to dad85d000dfebf92 at term 74098
2020-11-19 15:10:02.949647 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to e6cf4fe3e32b8396 at term 74098
2020-11-19 15:10:03.949545 I | raft: 361c924cbd55a81 is starting a new election at term 74098
2020-11-19 15:10:03.949590 I | raft: 361c924cbd55a81 became candidate at term 74099
2020-11-19 15:10:03.949622 I | raft: 361c924cbd55a81 received MsgVoteResp from 361c924cbd55a81 at term 74099
2020-11-19 15:10:03.949631 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to dad85d000dfebf92 at term 74099
2020-11-19 15:10:03.949641 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to e6cf4fe3e32b8396 at term 74099
2020-11-19 15:10:05.749518 I | raft: 361c924cbd55a81 is starting a new election at term 74099
2020-11-19 15:10:05.749560 I | raft: 361c924cbd55a81 became candidate at term 74100
2020-11-19 15:10:05.749570 I | raft: 361c924cbd55a81 received MsgVoteResp from 361c924cbd55a81 at term 74100
2020-11-19 15:10:05.749579 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to dad85d000dfebf92 at term 74100
2020-11-19 15:10:05.749585 I | raft: 361c924cbd55a81 [logterm: 45466, index: 104724995] sent MsgVote request to e6cf4fe3e32b8396 at term 74100
2020-11-19 15:10:06.466249 W | rafthttp: health check for peer dad85d000dfebf92 could not connect: dial tcp 192.168.1.4:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
2020-11-19 15:10:06.466315 W | rafthttp: health check for peer dad85d000dfebf92 could not connect: dial tcp 192.168.1.4:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2020-11-19 15:10:06.467084 W | rafthttp: health check for peer e6cf4fe3e32b8396 could not connect: dial tcp 192.168.1.5:2380: connect: connection refused (prober "ROUND_TRIPPER_RAFT_MESSAGE")
2020-11-19 15:10:06.467113 W | rafthttp: health check for peer e6cf4fe3e32b8396 could not connect: dial tcp 192.168.1.5:2380: connect: connection refused (prober "ROUND_TRIPPER_SNAPSHOT")
api-server:
Flag --insecure-port has been deprecated, This flag will be removed in a future version.
I1119 15:50:49.995997 1 server.go:560] external host was not specified, using 192.168.1.3
I1119 15:50:49.996210 1 server.go:147] Version: v1.15.2
I1119 15:50:50.410114 1 plugins.go:158] Loaded 10 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,MutatingAdmissionWebhook.
I1119 15:50:50.410138 1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
E1119 15:50:50.410654 1 prometheus.go:55] failed to register depth metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410689 1 prometheus.go:68] failed to register adds metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410709 1 prometheus.go:82] failed to register latency metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410727 1 prometheus.go:96] failed to register workDuration metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410753 1 prometheus.go:112] failed to register unfinished metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410778 1 prometheus.go:126] failed to register unfinished metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410804 1 prometheus.go:152] failed to register depth metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410825 1 prometheus.go:164] failed to register adds metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410866 1 prometheus.go:176] failed to register latency metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410900 1 prometheus.go:188] failed to register work_duration metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410924 1 prometheus.go:203] failed to register unfinished_work_seconds metric admission_quota_controller: duplicate metrics collector registration attempted
E1119 15:50:50.410934 1 prometheus.go:216] failed to register longest_running_processor_microseconds metric admission_quota_controller: duplicate metrics collector registration attempted
I1119 15:50:50.410952 1 plugins.go:158] Loaded 10 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,MutatingAdmissionWebhook.
I1119 15:50:50.410962 1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
I1119 15:50:50.412410 1 client.go:354] parsed scheme: ""
I1119 15:50:50.412424 1 client.go:354] scheme "" not registered, fallback to default scheme
I1119 15:50:50.412471 1 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{127.0.0.1:2379 0 <nil>}]
I1119 15:50:50.412508 1 asm_amd64.s:1337] balancerWrapper: got update addr from Notify: [{127.0.0.1:2379 <nil>}]
W1119 15:50:50.412748 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
I1119 15:50:51.407920 1 client.go:354] parsed scheme: ""
I1119 15:50:51.407942 1 client.go:354] scheme "" not registered, fallback to default scheme
I1119 15:50:51.407974 1 asm_amd64.s:1337] ccResolverWrapper: sending new addresses to cc: [{127.0.0.1:2379 0 <nil>}]
I1119 15:50:51.408003 1 asm_amd64.s:1337] balancerWrapper: got update addr from Notify: [{127.0.0.1:2379 <nil>}]
W1119 15:50:51.408241 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:51.412833 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:52.408397 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:52.987076 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:54.057294 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:55.505474 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:50:56.454869 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:51:00.038449 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:51:00.372682 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:51:06.682062 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
W1119 15:51:07.463691 1 clientconn.go:1251] grpc: addrConn.createTransport failed to connect to {127.0.0.1:2379 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:2379: connect: connection refused". Reconnecting...
I1119 15:51:10.412593 1 asm_amd64.s:1337] balancerWrapper: got update addr from Notify: []
F1119 15:51:10.412600 1 storage_decorator.go:57] Unable to create storage backend: config (&{ /registry {[https://127.0.0.1:2379] /etc/kubernetes/pki/apiserver-etcd-client.key /etc/kubernetes/pki/apiserver-etcd-client.crt /etc/kubernetes/pki/etcd/ca.crt} true 0xc0005b90e0 apiextensions.k8s.io/v1beta1 <nil> 5m0s 1m0s}), err (dial tcp 127.0.0.1:2379: connect: connection refused)
W1119 15:51:10.412741 1 asm_amd64.s:1337] Failed to dial 127.0.0.1:2379: context canceled; please retry.
systemctl status kubelet (it is running):
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.461018 1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.561183 1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.653030 1074 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/config/apiserver.go:47: Failed to list *v1.Pod: Get https://192.168.1.3:6443/api/v1/pods?fieldSelector=spec.nodeName%3Dk8s-master-01&limit=500&resourceVersion=0: dial tcp 192.168.1.3:6443: connect: connection refused
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.661422 1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.761566 1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.853243 1074 reflector.go:125] k8s.io/client-go/informers/factory.go:133: Failed to list *v1beta1.CSIDriver: Get https://192.168.1.3:6443/apis/storage.k8s.io/v1beta1/csidrivers?limit=500&resourceVersion=0: dial tcp 192.168.1.3:6443: connect: connection refused
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.861820 1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:39 k8s-master-01 kubelet[1074]: E1119 18:49:39.962034 1074 kubelet.go:2248] node "k8s-master-01" not found
Nov 19 18:49:40 k8s-master-01 kubelet[1074]: E1119 18:49:40.053584 1074 reflector.go:125] k8s.io/kubernetes/pkg/kubelet/kubelet.go:444: Failed to list *v1.Service: Get https://192.168.1.3:6443/api/v1/services?limit=500&resourceVersion=0: dial tcp 192.168.1.3:6443: connect: connection refused
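For reference, the kubelet output above comes from the systemd journal; it was captured with something like:

# kubelet itself stays active, it just can't reach the api-server
systemctl status kubelet
journalctl -u kubelet --no-pager -n 50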
Also, the option of creating a fresh cluster and moving the applications over to it is looking more and more attractive. I still have everything as Docker images. Would it be painful to create a new cluster and transfer them there?
And where can I find the pod configurations? I haven't noticed any *.yaml files describing how all of these images are supposed to be run.
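If I do go the migration route, my rough plan (assuming I can get the old api-server up at least once, and assuming the workloads are ordinary Deployments/Services; the file name is just an example) would be something like:

# on the old cluster: dump the workload definitions to YAML
kubectl get deployments,daemonsets,statefulsets,services,configmaps,secrets \
  --all-namespaces -o yaml > old-cluster-dump.yaml
# on the new cluster: re-apply them after stripping cluster-specific fields
# (status, resourceVersion, uid, clusterIP, ...)
kubectl apply -f old-cluster-dump.yaml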
Thanks in advance.