azure - Azure Log Analytics metric measurement alert

Question

I have a log query, for example:

example_cl
| top 1 by TimeGenerated desc
| project in_use, unused, total = (in_use + unused)

This gives me a simple output:

in_use  unused  total
  75     45      120

I want to set up a metric alert on this query so that it sends an email alert when in_use exceeds 90% of total.

When I try to create the alert, I always get the following error:

Search Query should contain 'AggregatedValue' and 'bin(TimeGenerated, [roundTo])' for Metric alert type

I need help working out the correct query for this particular metric alert.

score 0 · Accepted Answer

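To add to @KrishnaG-MSFT's answer: if you don't want to use an average as the aggregated value, you can use an aggregation function such as count(), which treats the single result as a unique value and returns it.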

example_cl
| top 1 by TimeGenerated desc
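// TimeGenerated has to survive the project, otherwise bin(TimeGenerated, ...) below cannot resolve it
| project TimeGenerated, in_use, unused, total = (in_use + unused)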
| summarize AggregatedValue= count() by xxxxxxx, bin(TimeGenerated, 30s)

Some more examples of how I rewrote queries

Log alert

Event
| where EventID == 1235
| project Computer, TimeGenerated,
    AlertType_s = "Test Connectrix",
    Severity = 4, SeverityName_s = "Information",
    AffectedCI_s = Computer,
    AlertTitle_s = strcat(Computer, ":Test Connectrix"),
    AlertDetails_s = RenderedDescription

The log alert above, rewritten as a metric measurement alert

Note that the aggregation is done on the number of rows returned.

Event
| where EventID == 1235
| project Computer, TimeGenerated,
    AlertType_s = "Test Connectrix",
    Severity = 4, SeverityName_s = "Information",
    AffectedCI_s = Computer,
    AlertTitle_s = strcat(Computer, ":Test Connectrix"),
    AlertDetails_s = RenderedDescription
| summarize AggregatedValue = count() by bin(TimeGenerated, 30m), Computer

Another metric measurement example, using the Perf (CPU) table

let _maxValue = 80;
let _timeWindow = 4h;
let _AvgCpu = Perf
| where TimeGenerated >= ago(_timeWindow)
| where CounterName == "% Processor Time" and InstanceName =~ "_Total"
| summarize mtgPerf = max(TimeGenerated), CounterValue = round(avg(CounterValue)),
    SampleCount = count(CounterValue) by Computer, InstanceName, CounterName, ObjectName;
_AvgCpu
| where CounterValue > _maxValue
| project Computer, ObjectName, CounterName, InstanceName,
    TimeGenerated = mtgPerf, CounterValue,
    AlertType_s = "Sustained High CPU Utilization",
    Severity = 4, SeverityName_s = "WARNING",
    AffectedCI_s = strcat(Computer, "/CPUPercent/", InstanceName),
    AlertTitle_s = strcat(Computer, ": Sustained High CPU Utilization"),
    AlertDetails_s = strcat("Computer: ", Computer, "Average CPU Utilization: ", CounterValue,
        "%Sample Period: Last ", _timeWindow, "Sample Count: ", SampleCount,
        "Alert Threshold: > ", _maxValue, "%")
| summarize AggregatedValue = count() by bin(TimeGenerated, 30m), Computer,
    ObjectName, CounterName, InstanceName, CounterValue, AlertType_s, Severity,
    SeverityName_s, AffectedCI_s, AlertTitle_s, AlertDetails_s
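
If you don't need the alert detail columns, a shorter variant may be enough. This is only a sketch, under the assumption that the 80% comparison is configured as the threshold in the alert rule itself instead of inside the query; the 30m bin is just carried over from the query above.

```kusto
// Sketch: average CPU per computer per 30-minute bin.
// Assumption: the alert rule's threshold (e.g. greater than 80) does the comparison.
Perf
| where CounterName == "% Processor Time" and InstanceName =~ "_Total"
| summarize AggregatedValue = avg(CounterValue) by bin(TimeGenerated, 30m), Computer
```

Here AggregatedValue is the CPU average itself instead of a row count, so the value the rule compares against the threshold is the actual utilization.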

Hope this helps.

score 0 · Accepted Answer

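You typically get this kind of AggregatedValue error when the alert logic's "Based on" parameter is set to "Metric measurement".

For all metric measurement alert rules, see this Microsoft documentation link -> https://docs.microsoft.com/en-us/azure/azure-monitor/platform/alerts-unified-log#metric-measurement-alert-rules

You have to update your query as shown below. Note that xxxxxxx in the sample query below is the field to group records by. To find out what you may need to use for that field, see the Microsoft documentation link above.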

example_cl
| top 1 by TimeGenerated desc
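// TimeGenerated has to survive the project, otherwise bin(TimeGenerated, ...) below cannot resolve it
| project TimeGenerated, in_use, unused, total = (in_use + unused)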
| summarize AggregatedValue= avg(in_use) by xxxxxxx, bin(TimeGenerated, 30s)
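
For the 90% condition in the question, one option is to make the percentage itself the aggregated value. The following is a minimal sketch, not a confirmed solution: the sample rows, the pct_in_use column name and the 5m bin are assumptions for illustration, and the "greater than 90" comparison is assumed to be set as the threshold in the alert rule. With the values from the question, 75 out of 120 works out to 62.5%.

```kusto
// Hypothetical sample rows standing in for example_cl; in the alert rule, query example_cl directly.
let sample_rows = datatable(TimeGenerated: datetime, in_use: long, unused: long)
[
    datetime(2020-01-01 00:01:00), 75, 45,
    datetime(2020-01-01 00:06:00), 110, 10
];
sample_rows
// 100.0 forces floating-point division, so the percentage is not truncated
| extend pct_in_use = 100.0 * in_use / (in_use + unused)
// Highest in-use percentage seen in each time bin
| summarize AggregatedValue = max(pct_in_use) by bin(TimeGenerated, 5m)
```

Using max() means a single reading above the threshold within a bin is enough to trip the alert; avg() would smooth out short spikes if that is preferred.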

Hope this helps!! Cheers!!
