Why do I get the following error when installing Linkerd 2.x on a private cluster in GKE?
Error: could not get apiVersions from Kubernetes: unable to retrieve the complete list of server APIs: tap.linkerd.io/v1alpha1: the server is currently unable to handle the request
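The default firewall rules for private clusters on GKE only allow ports 443 and 10250. These permit communication with kube-apiserver and the kubelet, respectively.
Linkerd uses ports 8443 and 8089 for communication between the control plane and the proxies deployed to the data plane.
The tap component uses port 8089 to handle requests to its apiserver.
The proxy injector and service profile validator components, both of which are types of admission controllers, use port 8443 to handle requests.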
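To make the gap concrete, here is a minimal sketch that compares an allow list against the ports Linkerd needs. The function name `missing_ports` and the comma-separated `tcp:PORT` input format are assumptions chosen for illustration (the format mirrors the `--allow` flag of `gcloud compute firewall-rules create`); it does not query GKE itself.

```shell
# Ports that the GKE master must be able to reach for Linkerd (per the text above).
required_ports="8443 8089"

# missing_ports ALLOWED
# ALLOWED is a comma-separated list such as "tcp:443,tcp:10250".
# Prints the required ports that are not in the list, separated by spaces.
missing_ports() {
  allowed=",$1,"
  missing=""
  for port in $required_ports; do
    case "$allowed" in
      *",tcp:$port,"*) ;;                  # port is already allowed
      *) missing="$missing $port" ;;       # port is not allowed yet
    esac
  done
  echo "${missing# }"
}

# The default rules of a private cluster allow only 443 and 10250:
missing_ports "tcp:443,tcp:10250"   # prints: 8443 8089
```

With the default rules both Linkerd ports are reported as missing, which is why the tap apiserver and the admission webhooks cannot be reached.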
The Linkerd 2 docs include instructions for configuring the firewall on a GKE private cluster: https://linkerd.io/2/reference/cluster-configuration/
They are included below:
Get the cluster name:
CLUSTER_NAME=your-cluster-name
gcloud config set compute/zone your-zone-or-region
Get the cluster MASTER_IPV4_CIDR:
MASTER_IPV4_CIDR=$(gcloud container clusters describe $CLUSTER_NAME \
| grep "masterIpv4CidrBlock: " \
| awk '{print $2}')
Get the cluster NETWORK:
NETWORK=$(gcloud container clusters describe $CLUSTER_NAME \
| grep "^network: " \
| awk '{print $2}')
Get the cluster's auto-generated NETWORK_TARGET_TAG:
NETWORK_TARGET_TAG=$(gcloud compute firewall-rules list \
--filter network=$NETWORK --format json \
| jq ".[] | select(.name | contains(\"$CLUSTER_NAME\"))" \
| jq -r '.targetTags[0]' | head -1)
Verify the values:
echo $MASTER_IPV4_CIDR $NETWORK $NETWORK_TARGET_TAG
# example output
10.0.0.0/28 foo-network gke-foo-cluster-c1ecba83-node
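If any of the three values comes back empty (for example because the grep pattern did not match), the firewall rule created in the next step will not be what you intended. A small guard along these lines can catch that early; `require_vars` is a hypothetical helper, not part of gcloud or Linkerd.

```shell
# require_vars NAME...
# Returns non-zero and names the first variable that is unset or empty.
require_vars() {
  for name in "$@"; do
    eval "value=\${$name:-}"   # look up the variable whose name is in $name
    if [ -z "$value" ]; then
      echo "missing value for $name" >&2
      return 1
    fi
  done
}

# Usage, before creating the firewall rule:
#   require_vars MASTER_IPV4_CIDR NETWORK NETWORK_TARGET_TAG || exit 1
```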
Create the firewall rule for the proxy injector and tap:
gcloud compute firewall-rules create gke-to-linkerd-control-plane \
--network "$NETWORK" \
--allow "tcp:8443,tcp:8089" \
--source-ranges "$MASTER_IPV4_CIDR" \
--target-tags "$NETWORK_TARGET_TAG" \
--priority 1000 \
--description "Allow traffic on ports 8443, 8089 for linkerd control-plane components"
Finally, verify that the firewall rule was created:
gcloud compute firewall-rules describe gke-to-linkerd-control-plane
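Once the rule exists, one way to confirm that the original error is gone is to check that the tap APIService reports as available. This is a minimal sketch, assuming the APIService is named v1alpha1.tap.linkerd.io (the name in the error message) and that kubectl points at the affected cluster; the function name is hypothetical.

```shell
# Print the status of the "Available" condition of the tap APIService.
# "True" means the API server can reach the tap component again.
tap_apiservice_status() {
  kubectl get apiservice v1alpha1.tap.linkerd.io \
    -o jsonpath='{.status.conditions[?(@.type=="Available")].status}'
}

# Usage (needs access to the cluster):
#   tap_apiservice_status
#   linkerd check
```

If the status is still not "True" a minute or so after the rule was created, the firewall is probably not the only cause.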
Solution:
The steps I followed are:
1. kubectl get apiservices
If the Linkerd apiservice is down with the error CrashLoopBackOff, go to step 2. Otherwise, just try recreating the Linkerd apiservice with kubectl delete apiservice/"service_name". For me it was v1alpha1.tap.linkerd.io.
2. kubectl get pods -n kube-system
This showed that pods such as metrics-server, linkerd, and kubernetes-dashboard were down because the main CoreDNS pod was down.
For me it was:
NAME READY STATUS RESTARTS AGE
pod/coredns-85577b65b-zj2x2 0/1 CrashLoopBackOff 7 13m
If CoreDNS reports:
/etc/coredns/Corefile:10 - Error during parsing: Unknown directive proxy
then use forward instead of proxy in the YAML file that holds the CoreDNS config, because the CoreDNS 1.5.x version used by the image no longer supports the proxy keyword.

In my case, it was related to linkerd/linkerd2#3497: the Linkerd service had some internal problems and could not respond to API service requests. Restarting its pods fixed it.
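For the pod restart mentioned above, a rough sketch follows. It assumes Linkerd is installed in the default linkerd namespace, that the tap component runs as a deployment named linkerd-tap, and that kubectl is version 1.15 or newer (for rollout restart). Check the names with kubectl -n linkerd get deploy first; the function name is hypothetical.

```shell
# Restart the tap deployment so that its pods are recreated, then wait
# for the rollout to finish.
# Assumption: namespace "linkerd" and deployment "linkerd-tap".
restart_linkerd_tap() {
  namespace="${1:-linkerd}"
  kubectl -n "$namespace" rollout restart deployment/linkerd-tap
  kubectl -n "$namespace" rollout status deployment/linkerd-tap
}

# Usage (needs access to the cluster):
#   restart_linkerd_tap
#   restart_linkerd_tap my-linkerd-namespace
```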