kubernetes - Why does Kubernetes report "Readiness probe failed" along with "Liveness probe failed"

Question

I have a working Kubernetes Deployment for my application.

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  ...
  template:
    ...
    spec:
      containers:
      - name: my-app
        image: my-image
        ...
        readinessProbe:
          httpGet:
            port: 3000
            path: /
        livenessProbe:
          httpGet:
            port: 3000
            path: /

When I apply the Deployment, I can see that it runs correctly and the application responds to my requests.

$ kubectl describe pod -l app=my-app

...
Events:
  Type    Reason     Age   From                                  Message
  ----    ------     ----  ----                                  -------
  Normal  Scheduled  4m7s  default-scheduler                     Successfully assigned XXX
  Normal  Pulled     4m5s  kubelet, pool-standard-4gb-2cpu-b9vc  Container image "my-app" already present on machine
  Normal  Created    4m5s  kubelet, pool-standard-4gb-2cpu-b9vc  Created container my-app
  Normal  Started    4m5s  kubelet, pool-standard-4gb-2cpu-b9vc  Started container my-app

The application has a bug and crashes under certain conditions. I "invoked" such a condition, and then I see the following in the pod events:

$ kubectl describe pod -l app=my-app

...
Events:
  Type     Reason     Age               From                                  Message
  ----     ------     ----              ----                                  -------
  Normal   Scheduled  6m45s             default-scheduler                     Successfully assigned XXX
  Normal   Pulled     6m43s             kubelet, pool-standard-4gb-2cpu-b9vc  Container image "my-app" already present on machine
  Normal   Created    6m43s             kubelet, pool-standard-4gb-2cpu-b9vc  Created container my-app
  Normal   Started    6m43s             kubelet, pool-standard-4gb-2cpu-b9vc  Started container my-app
  Warning  Unhealthy  9s                kubelet, pool-standard-4gb-2cpu-b9vc  Readiness probe failed: Get http://10.244.2.14:3000/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy  4s (x3 over 14s)  kubelet, pool-standard-4gb-2cpu-b9vc  Liveness probe failed: Get http://10.244.2.14:3000/: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
  Normal   Killing    4s                kubelet, pool-standard-4gb-2cpu-b9vc  Container crawler failed liveness probe, will be restarted

I expected the liveness probe to fail and the container to be restarted. But why do I also see a Readiness probe failed event?

score 3 · Accepted Answer

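As @suren wrote in the comments, the readiness probe keeps being executed after the container has started. So if both a liveness and a readiness probe are defined (and they are identical as well), both the readiness and the liveness probe can fail.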

Here is a similar question with a clear answer.

score 2 · Accepted Answer

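The readiness probe is used to determine whether the container is ready to serve requests. Your container can be running and still fail the probe. While it fails the check, no Service will route traffic to this container.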

By default, the readiness probe runs with a period of 10 seconds.
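For illustration, this is roughly what the readiness probe from the question looks like with the Kubernetes default timing fields written out. It is only a sketch: the question elides part of the container spec, so the values actually in effect there may differ.

```yaml
# Fragment of the container spec with the probe defaults made explicit.
readinessProbe:
  httpGet:
    port: 3000
    path: /
  initialDelaySeconds: 0   # default: start probing right after the container starts
  periodSeconds: 10        # default: probe every 10 seconds for the container's whole lifetime
  timeoutSeconds: 1        # default: a response slower than 1 second counts as a failure
  successThreshold: 1      # default
  failureThreshold: 3      # default: consecutive failures before the pod is marked NotReady
```

The 1-second timeout matches the "Client.Timeout exceeded while awaiting headers" message in the events: the application did not answer in time, so the check was counted as failed.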

You can read more here: https://docs.openshift.com/container-platform/3.9/dev_guide/application_health.html

score 0 · Accepted Answer

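You configured the same check for the readiness and the liveness probe, so if the liveness check fails, it can be assumed that the readiness check fails as well.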
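If the two probes are meant to behave differently, one option is to keep the same HTTP check but give each probe its own timing. The fragment below is a sketch with illustrative threshold values that do not come from the question.

```yaml
# Fragment of the container spec; period and threshold values are assumptions.
readinessProbe:
  httpGet:
    port: 3000
    path: /
  periodSeconds: 5
  failureThreshold: 1    # mark the pod NotReady after a single failed check
livenessProbe:
  httpGet:
    port: 3000
    path: /
  periodSeconds: 10
  failureThreshold: 6    # restart only after about a minute of consecutive failures
```

With settings like these, a Readiness probe failed event would still be expected before any restart. It indicates that traffic has been withdrawn from the pod, not that the probes are misconfigured.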

score 0 · Accepted Answer

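Please implement a function/method in your backend and expose it under a URI such as /health. You can put your liveness logic there, and you may use it for readiness as well if you choose.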

The /health URI should be bound to a function that returns a 200 status code if everything is fine; otherwise it should fail.
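On the Kubernetes side, the probes would then point at that endpoint instead of /. A minimal sketch, assuming the application serves the suggested /health path on port 3000 (the question's application does not necessarily implement it yet):

```yaml
# Fragment of the container spec; /health is the endpoint name suggested above.
livenessProbe:
  httpGet:
    port: 3000
    path: /health
readinessProbe:
  httpGet:
    port: 3000
    path: /health
```

For an httpGet probe, the kubelet treats any response status from 200 up to but not including 400 as success, so returning 200 when healthy and an error status such as 500 otherwise is sufficient.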
