I am building a CloudWatch alarm that sends an email when a Lambda function has not been invoked within 5 minutes:

    CloudWatchAlarm:
      Type: AWS::CloudWatch::Alarm
      Properties:
        AlarmActions:
          - !Ref SNSTopic
        AlarmDescription: Send email if lambda function was not called within 5 minutes
        Dimensions:
          - Name: "FunctionName"
            Value: "my-lambda"
        ComparisonOperator: LessThanThreshold
        EvaluationPeriods: 1
        MetricName: Invocations
        Namespace: AWS/Lambda
        Period: 300
        Statistic: Sum
        Threshold: 1
        TreatMissingData: breaching
        DatapointsToAlarm: 1

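When the function is invoked, the Invocations metric becomes 1 and the alarm goes to the OK state. However, when the function is then not invoked for more than 5 minutes, the alarm does not return to the ALARM state right away. It actually takes about 15 minutes to enter the ALARM state.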

If I set a shorter period, it does take less time to return to the ALARM state. I don't understand how the period really works.

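Does anyone know whether this kind of configuration is actually possible with a CloudWatch alarm? How should I choose the period and the evaluation periods so that the email arrives within 5 minutes?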

1 Answer

This probably happens because the alarm state is not evaluated over the Period alone, but over the so-called evaluation range, which can be much longer than the period. What's more, you do not control the evaluation range.

Similar CloudWatch alarm delays have been discussed elsewhere. From one such discussion:

In this case, for the time when alarm did not transition to OK state, it was using the previous data points in the evaluation range to evaluate its state, as expected.

So it seems that in your case the evaluation range reaches 15 minutes back.
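
As a rough sketch of the shorter-period variant mentioned in the question (I have not verified it end to end), the same condition can be written as five consecutive one-minute periods instead of a single five-minute period. The resource name `CloudWatchAlarmShortPeriod` is hypothetical and the values 60/5/5 are my assumptions. `SNSTopic` and `my-lambda` come from the question. CloudWatch still looks back over an evaluation range that it chooses itself, so this may shorten the delay but is not guaranteed to bring it down to exactly 5 minutes.

```yaml
# Hypothetical variant of the alarm from the question; the numeric values are assumptions.
CloudWatchAlarmShortPeriod:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmActions:
      - !Ref SNSTopic
    AlarmDescription: Send email if lambda function was not called within 5 minutes
    Dimensions:
      - Name: FunctionName
        Value: my-lambda
    ComparisonOperator: LessThanThreshold
    MetricName: Invocations
    Namespace: AWS/Lambda
    Statistic: Sum
    Threshold: 1
    Period: 60                   # one-minute buckets instead of one 300-second bucket
    EvaluationPeriods: 5         # consider the five most recent buckets
    DatapointsToAlarm: 5         # all five must be breaching before the alarm fires
    TreatMissingData: breaching  # a minute with no data point counts as breaching
```

The idea is only that shorter buckets give the alarm a finer-grained window to evaluate. The extra look-back described above still applies.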

Answered 2021-02-23T23:30:17.057