
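I'm looking for a way to monitor GCP Dataflow pipelines with Datadog, pulling in both the built-in metrics and custom Beam metrics. Datadog currently offers integrations for other GCP services, but not for Dataflow. Has anyone done something similar and can share pointers on how to build this as a custom solution?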


1 Answer


At the moment I see only two possibilities:

  1. Use the client from google-cloud-clients/google-cloud-monitoring to write GCP custom metrics, and pull them into Datadog through its Stackdriver integration.
  2. Deploy a Datadog Agent in the cloud and connect to it with a Datadog StatsD client (Java, Python, Go).

  1. Use GCP custom metrics (https://cloud.google.com/monitoring/custom-metrics/creating-metrics) together with Datadog's GCP integration (https://www.datadoghq.com/product/integrations/#cat-google-cloud):

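    import com.google.api.MetricDescriptor;
    import com.google.cloud.monitoring.v3.MetricServiceClient;
    import com.google.monitoring.v3.CreateMetricDescriptorRequest;
    import com.google.monitoring.v3.ProjectName;

    // projectId is your GCP project ID; metricType is the custom metric's
    // type name, e.g. "custom.googleapis.com/my_metric".
    final MetricServiceClient client = MetricServiceClient.create();
    ProjectName name = ProjectName.of(projectId);

    // Describe the metric: a gauge that carries double values.
    MetricDescriptor descriptor = MetricDescriptor.newBuilder()
        .setType(metricType)
        .setDescription("This is a simple example of a custom metric.")
        .setMetricKind(MetricDescriptor.MetricKind.GAUGE)
        .setValueType(MetricDescriptor.ValueType.DOUBLE)
        .build();

    // Register the descriptor under the project.
    CreateMetricDescriptorRequest request = CreateMetricDescriptorRequest.newBuilder()
        .setName(name.toString())
        .setMetricDescriptor(descriptor)
        .build();

    client.createMetricDescriptor(request);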
    
  2. Use a Datadog StatsD client, for example the Java one (https://github.com/DataDog/java-dogstatsd-client). That way you can deploy a Datadog Agent on GCP and connect through it. Example using Kubernetes: https://docs.datadoghq.com/tracing/setup/kubernetes/#deploy-agent-daemonset

    import com.timgroup.statsd.ServiceCheck;
    import com.timgroup.statsd.StatsDClient;
    import com.timgroup.statsd.NonBlockingStatsDClient;
    
    public class Foo {
    
      private static final StatsDClient statsd = new NonBlockingStatsDClient(
        "my.prefix",                          /* prefix to any stats; may be null or empty string */
        "statsd-host",                        /* common case: localhost */
        8125,                                 /* port */
        new String[] {"tag:value"}            /* Datadog extension: Constant tags, always applied */
      );
    
      public static final void main(String[] args) {
        statsd.incrementCounter("foo");
        statsd.recordGaugeValue("bar", 100);
        statsd.recordGaugeValue("baz", 0.01); /* DataDog extension: support for floating-point gauges */
        statsd.recordHistogramValue("qux", 15);     /* DataDog extension: histograms */
        statsd.recordHistogramValue("qux", 15.5);   /* ...also floating-point */
        statsd.recordDistributionValue("qux", 15);     /* DataDog extension: global distributions */
        statsd.recordDistributionValue("qux", 15.5);   /* ...also floating-point */
    
        ServiceCheck sc = ServiceCheck
              .builder()
              .withName("my.check.name")
              .withStatus(ServiceCheck.Status.OK)
              .build();
        statsd.serviceCheck(sc); /* Datadog extension: send service check status */
    
        /* Compatibility note: Unlike upstream statsd, DataDog expects execution times to be a
         * floating-point value in seconds, not a millisecond value. This library
         * does the conversion from ms to fractional seconds.
         */
        statsd.recordExecutionTime("bag", 25, "cluster:foo"); /* DataDog extension: cluster tag */
      }
    }
    

    Datadog deployment.yaml for Kubernetes:

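    apiVersion: apps/v1  # extensions/v1beta1 DaemonSets were removed in Kubernetes 1.16
    kind: DaemonSet
    metadata:
      name: datadog-agent
    spec:
      selector:  # required by apps/v1; must match the pod template labels
        matchLabels:
          app: datadog-agent
      template:
        metadata:
          labels:
            app: datadog-agent
          name: datadog-agent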
        spec:
          serviceAccountName: datadog-agent
          containers:
          - image: datadog/agent:latest
            imagePullPolicy: Always
            name: datadog-agent
            ports:
              - containerPort: 8125
                # hostPort: 8125
                name: dogstatsdport
                protocol: UDP
              - containerPort: 8126
                # hostPort: 8126
                name: traceport
                protocol: TCP
            env:
              - name: DD_APM_ENABLED
                value: "true"
              - name: DD_API_KEY
                value: "<YOUR_API_KEY>"
              - name: DD_COLLECT_KUBERNETES_EVENTS
                value: "true"
              - name: DD_LEADER_ELECTION
                value: "true"
              - name: KUBERNETES
                value: "yes"
              - name: DD_KUBERNETES_KUBELET_HOST
                valueFrom:
                  fieldRef:
                    fieldPath: status.hostIP
            resources:
              requests:
                memory: "256Mi"
                cpu: "200m"
              limits:
                memory: "256Mi"
                cpu: "200m"
            volumeMounts:
              - name: dockersocket
                mountPath: /var/run/docker.sock
              - name: procdir
                mountPath: /host/proc
                readOnly: true
              - name: cgroups
                mountPath: /host/sys/fs/cgroup
                readOnly: true
            livenessProbe:
              exec:
                command:
                - ./probe.sh
              initialDelaySeconds: 15
              periodSeconds: 5
          volumes:
            - hostPath:
                path: /var/run/docker.sock
              name: dockersocket
            - hostPath:
                path: /proc
              name: procdir
            - hostPath:
                path: /sys/fs/cgroup
              name: cgroups
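
To make option 2 more concrete: a StatsD client like the one above ends up sending small plain-text datagrams to the agent's DogStatsD port (8125/UDP in the manifest), in the form `name:value|type|#tag1,tag2`. Below is a minimal sketch of that wire format using only the JDK. The class `DogStatsdLine` and its methods are hypothetical names made up for illustration and are not part of java-dogstatsd-client; in a real pipeline you would most likely keep using the library.

```java
import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of any Datadog library): hand-built DogStatsD datagrams. */
public final class DogStatsdLine {

    private DogStatsdLine() {}

    /**
     * Builds one datagram of the form name:value|type|#tag1,tag2.
     * type is "c" (counter), "g" (gauge), "h" (histogram) or "d" (distribution).
     */
    public static String format(String name, double value, String type, String... tags) {
        StringBuilder sb = new StringBuilder();
        sb.append(name).append(':');
        // Print whole numbers without a trailing ".0" so counters read as integers.
        if (value == Math.rint(value) && Math.abs(value) < 1e15) {
            sb.append((long) value);
        } else {
            sb.append(value);
        }
        sb.append('|').append(type);
        if (tags.length > 0) {
            sb.append("|#").append(String.join(",", tags));
        }
        return sb.toString();
    }

    /** Sends one datagram over UDP; fire-and-forget, no delivery guarantee. */
    public static void send(String host, int port, String line) throws IOException {
        byte[] payload = line.getBytes(StandardCharsets.UTF_8);
        try (DatagramSocket socket = new DatagramSocket()) {
            InetAddress address = InetAddress.getByName(host);
            socket.send(new DatagramPacket(payload, payload.length, address, port));
        }
    }

    public static void main(String[] args) {
        // Only prints the datagrams; call send(...) to actually reach an agent.
        System.out.println(format("my.prefix.foo", 1, "c", "tag:value"));
        System.out.println(format("my.prefix.baz", 0.01, "g"));
    }
}
```

Running `main` prints `my.prefix.foo:1|c|#tag:value` and `my.prefix.baz:0.01|g`. Note that the manifest above leaves `hostPort: 8125` commented out, so as written the agent's port would presumably not be reachable on the node's IP from other pods; how you expose it depends on your cluster setup.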
    

I am still investigating this myself, so I don't yet know exactly how to do it.

Answered 2018-08-17T13:18:03.593