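I need to send AWS Backup events (specifically failed backups, and backups where the Windows VSS portion failed) to a centralized Opsgenie alerting system. AWS directed us to use EventBridge to parse the JSON objects that AWS Backup generates, in order to determine whether the VSS portion failed.
SNS is not a viable option because we cannot "OR" two rules together in a single filter policy, and we have only one endpoint, so two subscriptions to the same topic would overwrite one another. That said, I did succeed in sending messages to Opsgenie via SNS. With EventBridge, I have had no luck so far.
I have started implementing most of this in Terraform. I realize Terraform has some limitations when working with EventBridge: my two rules cannot be tied to the custom bus I created, so I had to do that step manually. I also needed to create the Opsgenie API integration manually, because Opsgenie does not appear to support an "EventBridge" integration type; only the older CloudWatch Events integration, which is tied to SNS, seems to exist. Here is my Terraform for reference: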
# This module creates an opsgenie team and will tie in existing emails to the team to use with the integration.
module "opsgenie_team" {
  source  = "app.terraform.io/etc.../opsgenie"
  version = "1.1.0"

  team_name         = "test team"
  team_description  = "test environment."
  team_admin_emails = var.opsgenie_team_admins
  team_user_emails  = var.opsgenie_team_users

  suppress_cloudwatch_events_notifications = var.opsgenie_suppress_cloudwatch_events_notifications
  suppress_cloudwatch_notifications        = var.opsgenie_suppress_cloudwatch_notifications
  suppress_generic_sns_notifications       = var.opsgenie_suppress_generic_sns_notifications
}
# Step commented out since 'Webhook' doesn't work.
#
# resource "opsgenie_api_integration" "opsgenie" {
#   name = "api-based-int-2"
#   type = "Webhook"
#
#   responders {
#     type = "user"
#     id   = data.opsgenie_user.test.id
#   }
#
#   enabled                = true
#   allow_write_access     = true
#   suppress_notifications = false
#   webhook_url            = module.opsgenie_team.cloudwatch_events_integration_sns_endpoint
# }
resource "aws_cloudwatch_event_api_destination" "opsgenie" {
  name                             = "Test"
  description                      = "Connection to OpsGenie"
  invocation_endpoint              = module.opsgenie_team.cloudwatch_events_integration_sns_endpoint
  http_method                      = "POST"
  invocation_rate_limit_per_second = 20
  connection_arn                   = aws_cloudwatch_event_connection.opsgenie.arn
}
resource "aws_cloudwatch_event_connection" "opsgenie" {
  name               = "opsgenie-event-connection"
  description        = "Connection to OpsGenie"
  authorization_type = "API_KEY"

  # Verified key seems to be valid on integration API
  # https://api.opsgenie.com/v2/integrations
  auth_parameters {
    api_key {
      key   = module.opsgenie_team.cloudwatch_events_integration_id
      value = module.opsgenie_team.cloudwatch_events_integration_api_key
    }
  }
}
# Opsgenie ID created with the manual integration step.
data "aws_cloudwatch_event_source" "opsgenie" {
  name_prefix = "aws.partner/opsgenie.com/MY-OPSGENIE-ID"
}
resource "aws_cloudwatch_event_bus" "opsgenie" {
  name              = data.aws_cloudwatch_event_source.opsgenie.name
  event_source_name = data.aws_cloudwatch_event_source.opsgenie.name
}
# Two rules I need to filter on, commented out as they cannot be tied to a custom bus with
# terraform.
# resource "aws_cloudwatch_event_rule" "opsgenie_backup_failures" {
#   name        = "capture-generic-backup-failures"
#   description = "Capture all other backup failures"
#
#   event_pattern = <<EOF
# {
#   "State": [
#     {
#       "anything-but": "COMPLETED"
#     }
#   ]
# }
# EOF
# }
#
# resource "aws_cloudwatch_event_rule" "opsgenie_vss_failures" {
#   name        = "capture-vss-failures"
#   description = "Capture VSS Backup failures"
#
#   event_pattern = <<EOF
# {
#   "detail-type" : [
#     "Windows VSS Backup attempt failed because either Instance or SSM Agent has invalid state or insufficient privileges."
#   ]
# }
# EOF
# }
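
For comparison, here is a minimal sketch of how a rule and its target might be wired up entirely in Terraform. It assumes an AWS provider version in which `aws_cloudwatch_event_rule` and `aws_cloudwatch_event_target` accept an `event_bus_name` argument, and it assumes that AWS Backup job events carry `source` `aws.backup`, a `detail-type` of `Backup Job State Change`, and a `state` field under `detail`; that shape should be confirmed against a real captured event. Every resource name ending in `_sketch` is hypothetical, and the IAM role is an illustration of the permission an API destination target needs, not a vetted policy.

```hcl
# Sketch only: names ending in "_sketch" are hypothetical.
# Assumption: Backup job events have source "aws.backup" and a "state"
# field under "detail"; verify this against a captured event first.
resource "aws_cloudwatch_event_rule" "backup_failures_sketch" {
  name        = "capture-generic-backup-failures"
  description = "Capture backup jobs that did not complete"

  # Events emitted by AWS services arrive on the account's default bus.
  # Point this at another bus only if the events are forwarded there.
  event_bus_name = "default"

  event_pattern = jsonencode({
    source        = ["aws.backup"]
    "detail-type" = ["Backup Job State Change"]
    detail = {
      state = [{ "anything-but" = "COMPLETED" }]
    }
  })
}

# Role that EventBridge assumes when it calls the API destination.
data "aws_iam_policy_document" "events_assume_sketch" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "invoke_opsgenie_sketch" {
  name               = "eventbridge-invoke-opsgenie"
  assume_role_policy = data.aws_iam_policy_document.events_assume_sketch.json
}

data "aws_iam_policy_document" "invoke_opsgenie_sketch" {
  statement {
    actions   = ["events:InvokeApiDestination"]
    resources = [aws_cloudwatch_event_api_destination.opsgenie.arn]
  }
}

resource "aws_iam_role_policy" "invoke_opsgenie_sketch" {
  name   = "invoke-opsgenie-api-destination"
  role   = aws_iam_role.invoke_opsgenie_sketch.id
  policy = data.aws_iam_policy_document.invoke_opsgenie_sketch.json
}

# The target must live on the same bus as the rule it belongs to.
resource "aws_cloudwatch_event_target" "opsgenie_sketch" {
  rule           = aws_cloudwatch_event_rule.backup_failures_sketch.name
  event_bus_name = aws_cloudwatch_event_rule.backup_failures_sketch.event_bus_name
  arn            = aws_cloudwatch_event_api_destination.opsgenie.arn
  role_arn       = aws_iam_role.invoke_opsgenie_sketch.arn
}
```

The rule is placed on the default bus here because a partner event source bus receives events sent by the partner, whereas events produced by AWS services such as AWS Backup are delivered to the default bus.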
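The event bus and API destination appear to be created correctly, and I can find the API key used to communicate with Opsgenie and use it in Postman to reach the Opsgenie endpoint. I created the rules manually and tied them to the custom bus. I even left them wide open, hoping to catch any AWS Backup event at all; nothing so far.
I feel like I'm close but missing a key detail (or two). Any help is greatly appreciated.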
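
One way to check whether any AWS Backup events are arriving at all, independent of Opsgenie, is a catch-all rule that writes matching events to a CloudWatch Logs group. The following is a rough sketch under stated assumptions: the resource names and the log group name are hypothetical, the rule sits on the default bus, and the only filter is `source` equal to `aws.backup`.

```hcl
# Debugging sketch only: all names below are hypothetical.
resource "aws_cloudwatch_log_group" "backup_events_debug" {
  name              = "/aws/events/backup-debug"
  retention_in_days = 7
}

# Allow EventBridge to create log streams and write events to the group.
data "aws_iam_policy_document" "events_to_logs_debug" {
  statement {
    actions   = ["logs:CreateLogStream", "logs:PutLogEvents"]
    resources = ["${aws_cloudwatch_log_group.backup_events_debug.arn}:*"]

    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com", "delivery.logs.amazonaws.com"]
    }
  }
}

resource "aws_cloudwatch_log_resource_policy" "events_to_logs_debug" {
  policy_name     = "eventbridge-to-backup-debug-logs"
  policy_document = data.aws_iam_policy_document.events_to_logs_debug.json
}

# No event_bus_name is set, so the rule is created on the default bus.
resource "aws_cloudwatch_event_rule" "backup_debug" {
  name          = "capture-all-backup-events-debug"
  description   = "Log every event whose source is aws.backup"
  event_pattern = jsonencode({ source = ["aws.backup"] })
}

resource "aws_cloudwatch_event_target" "backup_debug" {
  rule = aws_cloudwatch_event_rule.backup_debug.name
  arn  = aws_cloudwatch_log_group.backup_events_debug.arn
}
```

If events show up in the log group, their exact JSON can then be used to write the narrower failure and VSS patterns; if nothing shows up, the problem lies before the Opsgenie target rather than in it.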