deployment - Kubernetes deployment causes downtime

Question

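When I run a deployment, I get downtime. Requests start failing after a variable amount of time (20-40 seconds).

The ingress container's readiness check fails once its preStop hook sends SIGUSR1; the hook then waits 31 seconds and sends SIGTERM. Within that window the pod should be removed from the service, because the readiness check is set to fail after 2 failed attempts at 5-second intervals (about 10 seconds).

How can I see the events for pods being added to and removed from the service, so I can find out what is causing this?

And the events around the readiness checks themselves?

I am using Google Container Engine version 1.2.2 with GCE's network load balancer.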

Service:

apiVersion: v1
kind: Service
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  type: LoadBalancer
  ports:
  - name: http
    port: 80
    targetPort: http
    protocol: TCP
  - name: https
    port: 443
    targetPort: https
    protocol: TCP  
  selector:
    app: myapp

Deployment:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: 1.0.0-61--66-6
    spec:
      containers:
      - name: myapp
        image: ****  
        resources:
          limits:
            cpu: 100m
            memory: 250Mi
          requests:
            cpu: 10m
            memory: 125Mi
        ports:
        - name: http-direct
          containerPort: 5000
        livenessProbe:
          httpGet:
            path: /status
            port: 5000
          initialDelaySeconds: 30
          timeoutSeconds: 1
        lifecycle:
          preStop:
            exec:
              # SIGTERM triggers a quick exit; gracefully terminate instead
              command: ["sleep 31;"]
      - name: haproxy
        image: travix/haproxy:1.6.2-r0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 10m
            memory: 25Mi
        ports:
        - name: http
          containerPort: 80
        - name: https
          containerPort: 443
        env:
        - name: "SSL_CERTIFICATE_NAME"
          value: "ssl.pem"         
        - name: "OFFLOAD_TO_PORT"
          value: "5000"
        - name: "HEALT_CHECK_PATH"
          value: "/status"
        volumeMounts:
        - name: ssl-certificate
          mountPath: /etc/ssl/private
        livenessProbe:
          httpGet:
            path: /status
            port: 443
            scheme: HTTPS
          initialDelaySeconds: 30
          timeoutSeconds: 1
        readinessProbe:
          httpGet:
            path: /readiness
            port: 81
          initialDelaySeconds: 0
          timeoutSeconds: 1
          periodSeconds: 5
          successThreshold: 1
          failureThreshold: 2
        lifecycle:
          preStop:
            exec:
              # SIGTERM triggers a quick exit; gracefully terminate instead
              command: ["kill -USR1 1; sleep 31; kill 1"]
      volumes:
      - name: ssl-certificate
        secret:
          secretName: ssl-c324c2a587ee-20160331

score 1 · Accepted Answer

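When a probe fails, the prober emits a Warning event with the reason Unhealthy and a message of the form xx probe errored: xxx.

You should be able to find these events with either kubectl get events or kubectl describe pods -l app=myapp,version=1.0.0-61--66-6 (which filters pods by label).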
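
The commands above can be wrapped in a few small shell helpers, shown here as a sketch. The service name and label selector are copied from the manifests in the question. The function names are made up for illustration, and the third helper (watching the Endpoints object of the Service) is an addition that is not part of the original answer.

```shell
#!/bin/sh
# Names below are taken from the manifests in the question; adjust as needed.
SERVICE="myapp"
SELECTOR="app=myapp,version=1.0.0-61--66-6"

# Failed probes show up as Warning events with the reason "Unhealthy".
list_events() {
  kubectl get events
}

# The Events section at the end of each pod description lists probe failures.
describe_pods() {
  kubectl describe pods -l "$SELECTOR"
}

# Assumption (not from the answer): watching the Endpoints object of the
# Service prints an updated row each time its set of ready pod addresses changes.
watch_endpoints() {
  kubectl get endpoints "$SERVICE" --watch
}
```

Running watch_endpoints in one terminal while the rollout proceeds in another should show when a terminating pod's address leaves the service, which can then be compared against the moment requests start failing.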
