Waiting for http-01 challenge propagation: failed to perform self check GET request. This is similar to the bug https://github.com/jetstack/cert-manager/issues/656 but none of the solutions from the GitHub ticket comments helped.

I am trying to set up CertManager on DigitalOcean as described in this tutorial: https://www.digitalocean.com/community/tutorials/how-to-set-up-an-nginx-ingress-with-cert-manager-on-digitalocean-kubernetes I do not get any errors, but the request from CertManager has been in a pending state for more than 40 hours.

I have successfully configured Ingress with Nginx, then I created a namespace and the CertManager CRDs:

$ kubectl create namespace cert-manager
$ kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v0.12.0/cert-manager.yaml

I can see all the CertManager pods as expected:

$ kubectl get pods --namespace cert-manager
NAME                                       READY   STATUS    RESTARTS   AGE
cert-manager-5c47f46f57-gxhwv              1/1     Running   0          42h
cert-manager-cainjector-6659d6844d-xp75s   1/1     Running   0          42h
cert-manager-webhook-547567b88f-k4dv2      1/1     Running   0          42h

Then I created the staging issuer:

---
apiVersion: cert-manager.io/v1alpha2
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
  namespace: cert-manager
spec:
  acme:
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    email: some@email.here
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - http01:
          ingress:
            class: nginx

And updated the Ingress configuration:

---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: echo-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    # cert-manager.io/cluster-issuer: "letsencrypt-prod"
    cert-manager.io/cluster-issuer: "letsencrypt-staging"
spec:
  tls:
    - hosts:
        - echo.some.domain
      secretName: ingress-tls
  rules:
    - host: echo.some.domain
      http:
        paths:
          - backend:
              serviceName: echo1
              servicePort: 80

But after that, CertManager does not update the certificate and stays in the InProgress state:

$ date
Wed 18 Dec 2019 01:58:08 PM MSK

$ kubectl describe cert

...
Status:
  Conditions:
    Last Transition Time:  2019-12-16T17:23:56Z
    Message:               Waiting for CertificateRequest "ingress-tls-1089568541" to complete
    Reason:                InProgress
    Status:                False
    Type:                  Ready
Events:                    <none>

Instead of using Fake LE Intermediate X1 as a CN, it returns CN=Kubernetes Ingress Controller Fake Certificate,O=Acme Co

$ kubectl describe CertificateRequest 
Status:
  Conditions:
    Last Transition Time:  2019-12-16T17:50:05Z
    Message:               Waiting on certificate issuance from order default/ingress-tls-1089568541-1576201144: "pending"
    Reason:                Pending
    Status:                False
    Type:                  Ready
Events:                    <none>

What could be the problem with CertManager, and how can it be fixed?


Update:

The ingress logs contain the following errors:

$ kubectl -n ingress-nginx logs  nginx-ingress-controller-7754db565c-g557h 

I1218 17:24:30.331127       6 status.go:295] updating Ingress default/cm-acme-http-solver-4dkdn status from [] to [{xxx.xxx.xxx.xxx }]
I1218 17:24:30.333250       6 status.go:295] updating Ingress default/cm-acme-http-solver-9dpqc status from [] to [{xxx.xxx.xxx.xxx }]
I1218 17:24:30.341292       6 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"cm-acme-http-solver-4dkdn", UID:"2e523b74-8bbb-41c7-be8a-44d8db8abd6e", APIVersion:"extensions/v1beta1", ResourceVersion:"722472", FieldPath:""}): type: 'Normal' reason: 'UPDATE' Ingress default/cm-acme-http-solver-4dkdn
I1218 17:24:30.344340       6 event.go:209] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"cm-acme-http-solver-9dpqc", UID:"b574a3b6-6c5b-4266-a4e2-6ff2de2d78e0", APIVersion:"extensions/v1beta1", ResourceVersion:"722473", FieldPath:""}): type: 'Normal' reason: 'UPDATE' Ingress default/cm-acme-http-solver-9dpqc
W1218 17:24:30.442276       6 controller.go:1042] Error getting SSL certificate "default/ingress-tls": local SSL certificate default/ingress-tls was not found. Using default certificate
W1218 17:24:30.442950       6 controller.go:1042] Error getting SSL certificate "default/ingress-tls": local SSL certificate default/ingress-tls was not found. Using default certificate
W1218 17:24:33.775476       6 controller.go:1042] Error getting SSL certificate "default/ingress-tls": local SSL certificate default/ingress-tls was not found. Using default certificate
W1218 17:24:33.775956       6 controller.go:1042] Error getting SSL certificate "default/ingress-tls": local SSL certificate default/ingress-tls was not found. Using default certificate

Update 2:

The secret ingress-tls is available as expected:

$ kubectl get secret ingress-tls -o yaml

apiVersion: v1
data:
  ca.crt: ""
  tls.crt: ""
  tls.key: <secret-key-data-base64-encoded>
kind: Secret
metadata:
  annotations:
    cert-manager.io/certificate-name: ingress-tls
    cert-manager.io/issuer-kind: ClusterIssuer
    cert-manager.io/issuer-name: letsencrypt-staging
  creationTimestamp: "2019-12-16T17:23:56Z"
  name: ingress-tls
  namespace: default
  resourceVersion: "328801"
  selfLink: /api/v1/namespaces/default/secrets/ingress-tls
  uid: 5d640b66-1572-44a1-94e4-6d85a73bf21c
type: kubernetes.io/tls

Update 3:

I found that the cert-manager pod is failing with this log:

E1219 11:06:08.294011       1 sync.go:184] cert-manager/controller/challenges "msg"="propagation check failed" "error"="failed to perform self check GET request 'http://<some.domain>/.well-known/acme-challenge/<some-path>': Get http://<some.domain>/.well-known/acme-challenge/<some-path>: dial tcp xxx.xxx.xxx.xxx:80: connect: connection timed out" "dnsName"="<some.domain>" "resource_kind"="Challenge" "resource_name"="ingress-tls-1089568541-1576201144-1086699008" "resource_namespace"="default" "type"="http-01" 

Challenge status:

$ kubectl describe challenge ingress-tls-1089568541-1576201144-471532423

Name:         ingress-tls-1089568541-1576201144-471532423
Namespace:    default
Labels:       <none>
Annotations:  <none>
API Version:  acme.cert-manager.io/v1alpha2
Kind:         Challenge
Metadata:
  Creation Timestamp:  2019-12-19T11:32:19Z
  Finalizers:
    finalizer.acme.cert-manager.io
  Generation:  1
  Owner References:
    API Version:           acme.cert-manager.io/v1alpha2
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  Order
    Name:                  ingress-tls-1089568541-1576201144
    UID:                   7d19d86f-0b56-4756-aa20-bb85caf80b9e
  Resource Version:        872062
  Self Link:               /apis/acme.cert-manager.io/v1alpha2/namespaces/default/challenges/ingress-tls-1089568541-1576201144-471532423
  UID:                     503a8b4e-dc60-4080-91d9-2847815af1cc
Spec:
  Authz URL:  https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/123456
  Dns Name:   <domain>
  Issuer Ref:
    Group:  cert-manager.io
    Kind:   ClusterIssuer
    Name:   letsencrypt-staging
  Key:      <key>
  Solver:
    http01:
      Ingress:
        Class:  nginx
  Token:        <token>
  Type:         http-01
  URL:          https://acme-staging-v02.api.letsencrypt.org/acme/chall-v3/12345/abc
  Wildcard:     false
Status:
  Presented:   true
  Processing:  true
  Reason:      Waiting for http-01 challenge propagation: failed to perform self check GET request 'http://<domain>/.well-known/acme-challenge/<token>': Get http://<domain>/.well-known/acme-challenge/<token>: dial tcp xxx.xxx.xxx.xxx:80: connect: connection timed out
  State:       pending
Events:
  Type    Reason     Age    From          Message
  ----    ------     ----   ----          -------
  Normal  Started    4m28s  cert-manager  Challenge scheduled for processing
  Normal  Presented  4m28s  cert-manager  Presented challenge using http-01 challenge mechanism

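I tried deleting the challenge to re-trigger it, but after a minute or two it fails with the same error. I checked that I can access the challenge URL from the cluster nodes (using kubectl run -it ... and wget http://<domain>/.well-known/acme-challenge/<token> from inside a new pod).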
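
The manual check described above can be wrapped in a small helper. This is only a sketch under assumptions: the function names, the pod name selfcheck, and the busybox image are illustrative and not from the original post. It requests the challenge URL from a temporary pod inside the cluster, which is roughly the path cert-manager's self check takes.

```shell
# Build the http-01 challenge URL for a domain and token.
challenge_url() {
  printf 'http://%s/.well-known/acme-challenge/%s\n' "$1" "$2"
}

# Fetch the challenge URL from a temporary pod inside the cluster.
# "selfcheck" and the busybox image are illustrative choices.
self_check() {
  kubectl run selfcheck --rm -i --restart=Never --image=busybox -- \
    wget -T 10 -qO- "$(challenge_url "$1" "$2")"
}

# Usage: self_check <domain> <token>
```

The result may depend on which node the test pod is scheduled on, so a success here does not necessarily mean the cert-manager pod can reach the same URL.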

5 Answers

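This might be worth a look. I ran into a similar Connection Timeout problem.

Change the LoadBalancer in the ingress-nginx service.

Add/change externalTrafficPolicy: Cluster.

The reason is that the pod with the certificate issuer ended up on a different node than the load balancer, so it could not talk to itself through the ingress.

The complete block below is taken from https://raw.githubusercontent.com/kubernetes/ingress-nginx/nginx-0.26.1/deploy/static/provider/cloud-generic.yaml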

kind: Service
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  #CHANGE/ADD THIS
  externalTrafficPolicy: Cluster
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  ports:
    - name: http
      port: 80
      targetPort: http
    - name: https
      port: 443
      targetPort: https

---
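
If the Service already exists, the same field can probably be changed in place instead of re-applying the whole manifest. A minimal sketch, assuming the namespace and Service name from the manifest above; the function name set_traffic_policy is made up for illustration.

```shell
# Set externalTrafficPolicy on an existing Service without editing its manifest.
# Arguments: namespace, service name, policy (Cluster or Local).
set_traffic_policy() {
  kubectl -n "$1" patch service "$2" \
    -p "{\"spec\":{\"externalTrafficPolicy\":\"$3\"}}"
}

# Usage, with the names from the manifest above:
#   set_traffic_policy ingress-nginx ingress-nginx Cluster
# Check the result:
#   kubectl -n ingress-nginx get service ingress-nginx -o jsonpath='{.spec.externalTrafficPolicy}'
```

Keep in mind that with Cluster the ingress generally no longer sees the original client source IP, which is the usual reason for choosing Local in the first place.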
answered 2020-05-21T04:42:08.793

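In my case, cert-manager wanted to request the challenge via an internal IP address.

failed to perform self check GET request 'http:///.well-known/acme-challenge/': Get http:///.well-known/acme-challenge/: dial tcp 10.67.0.8:80: connect: connection timed out

That is, DNS resolution was broken. I solved this by changing the cert-manager deployment so that it only uses external DNS servers, like this: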

spec:
  template:
    spec:
      dnsConfig:
        nameservers:
        - 8.8.8.8
      dnsPolicy: None

That is how you do it. I also opened an issue so that this can be changed through the helm installation.
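
One way to apply that fragment to a running install is a patch on the Deployment. This is a sketch, assuming the Deployment is named cert-manager in the cert-manager namespace (as the pod names in the question suggest); dns_patch and patch_dns are illustrative names.

```shell
# JSON form of the dnsConfig/dnsPolicy fragment shown above.
dns_patch='{"spec":{"template":{"spec":{"dnsPolicy":"None","dnsConfig":{"nameservers":["8.8.8.8"]}}}}}'

# Apply the patch to a Deployment. Arguments: namespace, deployment name.
patch_dns() {
  kubectl -n "$1" patch deployment "$2" -p "$dns_patch"
}

# Usage: patch_dns cert-manager cert-manager
```

With dnsPolicy: None and only an external nameserver, in-cluster service names will no longer resolve from that pod, so this is a workaround rather than a general setting.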

answered 2020-05-30T22:21:33.143

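I had exactly the same problem, and it seems to be related to a bug in how Digital Ocean load balancers work. This thread on Let's Encrypt certificate issuance suggests adding the annotation service.beta.kubernetes.io/do-loadbalancer-hostname: "kube.mydomain.com" to the load balancer. In my case I did not have a yaml config file for the load balancer, so I just copied the load balancer declaration from the nginx-ingress install script and applied the new configuration to the kubernetes cluster. Below is the final configuration of the load balancer.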

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/do-loadbalancer-enable-proxy-protocol: 'true'
    # See https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/master/docs/controllers/services/examples/README.md#accessing-pods-over-a-managed-load-balancer-from-inside-the-cluster
    service.beta.kubernetes.io/do-loadbalancer-hostname: "kube.mydomain.com"
  labels:
    helm.sh/chart: ingress-nginx-3.19.0
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/version: 0.43.0
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: controller
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
    - name: http
      port: 80
      protocol: TCP
      targetPort: http
    - name: https
      port: 443
      protocol: TCP
      targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/component: controller
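
If you would rather not re-apply the whole Service, the annotation alone can likely be added in place. A sketch, assuming the Service name and namespace from the config above; set_lb_hostname is an illustrative name, and the hostname must already resolve to the load balancer.

```shell
# Add or update the DigitalOcean hostname annotation on an existing Service.
# Arguments: namespace, service name, hostname pointing at the load balancer.
set_lb_hostname() {
  kubectl -n "$1" annotate service "$2" --overwrite \
    "service.beta.kubernetes.io/do-loadbalancer-hostname=$3"
}

# Usage: set_lb_hostname ingress-nginx ingress-nginx-controller kube.mydomain.com
```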

answered 2021-01-25T06:14:34.610

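One of my CertManager pods was frozen, so I deleted all of them and they restarted. The certificate was renewed immediately.

kubectl get pods -n cert-manager (or whatever namespace your pods are in)

Then delete them all.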

kubectl delete pod -n cert-manager cert-manager-xxxx cert-manager-cainjector-xxxx cert-manager-webhook-xxxx
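
To avoid copying the generated pod names, the same thing can be done for the whole namespace at once. A small sketch; restart_pods is an illustrative name, and it assumes nothing else runs in that namespace.

```shell
# Delete every pod in a namespace; the owning Deployments recreate them.
# Argument: namespace (defaults to cert-manager).
restart_pods() {
  kubectl -n "${1:-cert-manager}" delete pods --all
}

# Usage: restart_pods            (uses cert-manager)
#        restart_pods my-namespace
```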

answered 2021-05-04T21:50:41.307

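I did not find the cause of this problem, so I am posting how I resolved it as an answer. It looks like the same issue as in this bug. I fixed it by completely uninstalling cert-manager and reinstalling it without changing any configuration or settings.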
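
For an install done from the release manifest, as in the question, the uninstall and reinstall could look roughly like this. It is a sketch, not the exact commands used; manifest_url and reinstall_cert_manager are illustrative names, and the URL pattern is taken from the install command in the question.

```shell
# Release manifest URL for a given cert-manager version tag, e.g. v0.12.0.
manifest_url() {
  printf 'https://github.com/jetstack/cert-manager/releases/download/%s/cert-manager.yaml\n' "$1"
}

# Remove everything the manifest created, then apply it again.
# Argument: version tag.
reinstall_cert_manager() {
  kubectl delete -f "$(manifest_url "$1")" &&
    kubectl apply --validate=false -f "$(manifest_url "$1")"
}

# Usage: reinstall_cert_manager v0.12.0
```

Deleting the manifest also deletes the cert-manager CRDs, and with them any existing Issuer, ClusterIssuer, and Certificate resources, so those have to be re-created afterwards.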

answered 2020-05-14T08:54:34.037