
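I am trying to build an alert for the following case: if the number of failed requests exceeds 99% of the requests received within a 15-minute window, I want an alert to be raised. I wrote a KQL query, but unfortunately it fires even when nothing is actually wrong, i.e. when the greater-than-99% condition is not really met. The query is below; I am sure I have made some silly mistake in it. Any help?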

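Any help fixing the query below would be appreciated, so that it returns results only when the situation is critical, i.e. when all of the requests received are failing.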

requests 
| where cloud_RoleName == 'ABCDEF_cloudRName' and resultCode != '404' 
| summarize FailedPercent=((countif(success == false))/count() by timestamp, cloud_RoleName, appName)*100 
| where FailedPercent > 99 
| project RelatedCI='XYZZZ',AlarmTime=timestamp,Category="Cloud-Azure-Monitor",SubCategory="Application",Object=appName ,"Value of Metric","Percentage Failed Requests"," is ", FailedPercent

1 Answer


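There is a similar question about sending an alert when the failure percentage is greater than xx%.

I have written a query; feel free to modify it if it does not fit your needs: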

requests
| where resultCode != "404" and success == "False" // failed requests only, ignoring 404s
| summarize exceptionsCount = count()
| extend a = "a" // constant key so the two single-row results can be joined
| join
(
    requests
    | where resultCode != "404" // all requests, ignoring 404s
    | summarize requestsCount = count()
    | extend a = "a"
)
on a
| project isFail = 1.0 * exceptionsCount / requestsCount > 0.99 // true when more than 99% of requests failed
| project rr = iff(isFail, "Fail", "Pass")
| where rr == "Fail" // return a row only when the threshold is exceeded
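
A likely reason the query in the question fires too often is that it groups by timestamp. Request timestamps are close to unique, so each group holds a single request, and any one failed request yields a ratio of 1. Dividing two integer counts also truncates the result to 0 or 1 before the multiplication by 100. The following is a minimal single-pass sketch of the same check without the join. It assumes the role name and the 404 exclusion from the question, a 15-minute window expressed with ago(15m), and that success compares against "False" as in the query above. The column names failedCount and totalCount are made up for illustration.

```kusto
requests
| where timestamp > ago(15m)                                  // assumed 15-minute evaluation window
| where cloud_RoleName == 'ABCDEF_cloudRName' and resultCode != '404'
| summarize failedCount = countif(success == "False"),        // failed requests in the window
            totalCount = count()                              // all requests in the window
    by cloud_RoleName, appName                                // no timestamp here: one group per app, not per request
| extend FailedPercent = 100.0 * failedCount / totalCount     // 100.0 forces real (non-integer) division
| where FailedPercent > 99
```

If the alert rule already restricts the query to its own evaluation period, the ago(15m) filter may be redundant and can be dropped.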

Once the query is ready, you can follow the steps in the question mentioned above to create a query-based alert.

Answered 2019-06-06T07:40:55.133