
I have a custom operator that watches for changes to a CRD I've defined in a Kubernetes cluster.

Whenever the custom resource changes, the operator reconciles and idempotently creates a Secret (owned by the custom resource).


What I expect is for the operator to reconcile only when the custom resource, or the Secret it owns, changes.

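What I observe is that, for some reason, the Reconcile function is triggered for every CR on the cluster at strange intervals, with no observable changes to the related entities. I tried focusing on one specific CR instance and tracking the times at which Reconcile was called for it. The intervals between these calls are very odd. The calls seem to alternate between two series: one starts at 10 hours and shrinks by 7 minutes each time, the other starts at 7 minutes and grows by 7 minutes each time.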

To demonstrate, Reconcile was triggered at these times (give or take a few seconds):

00:00
09:53 (10 hours - 1*7 minute interval)
10:00 (0 hours  + 1*7 minute interval)
19:46 (10 hours - 2*7 minute interval)
20:00 (0 hours  + 2*7 minute interval)
29:39 (10 hours - 3*7 minute interval)
30:00 (0 hours  + 3*7 minute interval)

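Whenever the shrinking interval drops below 7 hours, it resets to 10 hours. The same goes for the growing series: as soon as the interval exceeds 3 hours, it resets to 7 minutes.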


My main question is: how can I investigate why Reconcile is being triggered?

Below are the manifests for the CRD, an example CR, and the operator Deployment:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.4.1
  creationTimestamp: "2021-10-13T11:04:42Z"
  generation: 1
  name: databaseservices.operators.talon.one
  resourceVersion: "245688703"
  uid: 477f8d3e-c19b-43d7-ab59-65198b3c0108
spec:
  conversion:
    strategy: None
  group: operators.talon.one
  names:
    kind: DatabaseService
    listKind: DatabaseServiceList
    plural: databaseservices
    singular: databaseservice
  scope: Namespaced
  versions:
  - name: v1alpha1
    schema:
      openAPIV3Schema:
        description: DatabaseService is the Schema for the databaseservices API
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: DatabaseServiceSpec defines the desired state of DatabaseService
            properties:
              cloud:
                type: string
              databaseName:
                description: Foo is an example field of DatabaseService. Edit databaseservice_types.go
                  to remove/update
                type: string
              serviceName:
                type: string
              servicePlan:
                type: string
            required:
            - cloud
            - databaseName
            - serviceName
            - servicePlan
            type: object
          status:
            description: DatabaseServiceStatus defines the observed state of DatabaseService
            type: object
        type: object
    served: true
    storage: true
    subresources:
      status: {}
status:
  acceptedNames:
    kind: DatabaseService
    listKind: DatabaseServiceList
    plural: databaseservices
    singular: databaseservice
  conditions:
  - lastTransitionTime: "2021-10-13T11:04:42Z"
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: "2021-10-13T11:04:42Z"
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  storedVersions:
  - v1alpha1


----

apiVersion: operators.talon.one/v1alpha1
kind: DatabaseService
metadata:
  creationTimestamp: "2021-10-13T11:14:08Z"
  generation: 1
  labels:
    app: talon
    company: amber
    repo: talon-service
  name: db-service-secret
  namespace: amber
  resourceVersion: "245692590"
  uid: cc369297-6825-4fbf-aa0b-58c24be427b0
spec:
  cloud: google-australia-southeast1
  databaseName: amber
  serviceName: pg-amber
  servicePlan: business-4

----

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "75"
    secret.reloader.stakater.com/reload: db-credentials
    simpledeployer.talon.one/image: <path_to_image>/production:latest
  creationTimestamp: "2020-06-22T09:20:06Z"
  generation: 77
  labels:
    simpledeployer.talon.one/enabled: "true"
  name: db-operator
  namespace: db-operator
  resourceVersion: "245688814"
  uid: 900424cd-b469-11ea-b661-4201ac100014
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: db-operator
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        name: db-operator
    spec:
      containers:
      - command:
        - app/db-operator
        env:
        - name: POD_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.name
        - name: OPERATOR_NAME
          value: db-operator
        - name: AIVEN_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: db-credentials
        - name: AIVEN_PROJECT
          valueFrom:
            secretKeyRef:
              key: projectname
              name: db-credentials
        - name: AIVEN_USERNAME
          valueFrom:
            secretKeyRef:
              key: username
              name: db-credentials
        - name: SENTRY_URL
          valueFrom:
            secretKeyRef:
              key: sentry_url
              name: db-credentials
        - name: ROTATION_INTERVAL
          value: monthly
        image: <path_to_image>/production@sha256:<some_sha>
        imagePullPolicy: Always
        name: db-operator
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      serviceAccount: db-operator
      serviceAccountName: db-operator
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2020-06-22T09:20:06Z"
    lastUpdateTime: "2021-09-07T11:56:07Z"
    message: ReplicaSet "db-operator-cb6556b76" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  - lastTransitionTime: "2021-09-12T03:56:19Z"
    lastUpdateTime: "2021-09-12T03:56:19Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  observedGeneration: 77
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1

Notes:

  • When Reconcile finishes, I return:
return ctrl.Result{Requeue: false, RequeueAfter: 0}

so this shouldn't be the cause of the repeated triggers.

  • I should add that I recently upgraded the Kubernetes cluster to v1.20.8-gke.2101.