Can anyone help me add Nagios logic to my Python script below so that it raises alerts?

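I tried adding sys.exit(0) and sys.exit(1) for the OK and CRITICAL cases. Please let me know what I should change so that, when Nagios runs the script, it returns 0, 1, or 2 and displays the corresponding information.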

#!/usr/bin/python
import subprocess
import os, sys


#Check python present or not
#  dnf install python3.6-stack
# export PATH=/opt/python-3.6/bin:$PATH

def check_MegaRaid():
   # Next script
   failed=subprocess.run(["sudo /opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo \ -aALL | grep -i 'Failed Disks' | awk -F':' '{print $2}'"], shell=True, stdout=subprocess.PIPE, universal_newlines=True)
   failed_status = failed.stdout
   print("failed_status is",failed_status)
   critical=subprocess.run(["sudo /opt/MegaRAID/MegaCli/MegaCli64 -AdpAllInfo \ -aALL | grep -i 'Critical Disks' | awk -F':' '{print $2}'"], shell=True, stdout=subprocess.PIPE, universal_newlines=True)
   critical_status = critical.stdout
   print("critical_status is",critical_status)

   if failed_status.strip() == "0" and critical_status.strip() == "0":
       print("Raid check all OK" )
       sys.exit(0)
       #return 0

   else:
       print("CRITICAL")
       sys.exit(1)
       #return 1




def check_raid():
   process=subprocess.run(["sudo /sbin/mdadm --detail /dev/md127 | grep -i state | grep -w clean, | awk -F',' '{print $2}' |sed -e 's/^[ \t]*//' "], shell=True, stdout=subprocess.PIPE, universal_newlines=True)
   output = process.stdout
   check_process=subprocess.run(["sudo /sbin/mdadm --detail /dev/md127 | grep -i state | awk -F':' '{print $2}' |sed -e 's/^[ \t]*//' "], shell=True, stdout=subprocess.PIPE, universal_newlines=True)
   check = check_process.stdout
   if output.strip() == 'degraded':
       print("Raid disk state is CRITICAL ",output)
       #return 1
       sys.exit(1)

       
   elif check.strip() == 'clean':
       print("Raid check all OK")
       #return 0
       sys.exit(0)
   else:
       print("sudo /sbin/mdadm --detail /dev/md127 cmd not found : This  is an dataraid machine")
       check_MegaRaid()

#Check whether system configure raid
process=subprocess.run(["sudo cat /GEO_VERSION | grep -i raid | awk -F'Layout:' '{print $2}' | sed 's/[0-9]*//g' | sed -e 's/^[ \t]*//'"], shell=True, stdout=subprocess.PIPE, universal_newlines=True)
raid_value = process.stdout

if raid_value.strip() == 'raid':
   print("System configure Raid functions")
   check_raid()
else:
   print("There is no raid configured in this system")
   exit()

1 Answer

If you are interested, see the Nagios plugin guidelines: https://nagios-plugins.org/doc/guidelines.html

The plugin return codes are: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.

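So the first thing you need to do is replace each sys.exit(1) with sys.exit(2), because 1 means WARNING and the state you are reporting is CRITICAL.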

I would also replace that final exit() with sys.exit(3) to indicate an UNKNOWN exit, which will help you identify misconfigured services in the UI.
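As a rough illustration of the return codes, here is a minimal sketch. The constant names and the raid_exit_code helper are hypothetical and not part of your script; only the numeric values come from the guidelines. It assumes the disk counts have already been parsed into integers.

```python
# Return codes defined by the Nagios plugin guidelines.
OK = 0
WARNING = 1
CRITICAL = 2
UNKNOWN = 3


def raid_exit_code(raid_configured, failed_disks, critical_disks):
    """Map check results to a Nagios return code (hypothetical helper)."""
    if not raid_configured:
        # Nothing to check on this host, so report UNKNOWN rather than OK.
        return UNKNOWN
    if failed_disks == 0 and critical_disks == 0:
        return OK
    return CRITICAL
```

The script would then end with something like sys.exit(raid_exit_code(True, 0, 0)), so that every code path exits with one of the four values.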

You also need to lead with the status. A typical single-line plugin output looks like this:

STATUS: message | perfdata

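But it doesn't look like you are using performance data, so just change your critical exits to print a message prefixed with CRITICAL: and your OK states to print a message prefixed with OK: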
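A minimal sketch of building that output line, assuming a hypothetical format_status helper (the name and the sample perfdata label are mine, for illustration only):

```python
def format_status(status, message, perfdata=None):
    """Build a single-line plugin output: 'STATUS: message | perfdata'."""
    line = "{}: {}".format(status, message)
    if perfdata:
        # Performance data is optional and follows a pipe character.
        line += " | " + perfdata
    return line


print(format_status("OK", "Raid check all OK"))
print(format_status("CRITICAL", "Raid disk state is degraded", "failed_disks=1"))
```

Printing this line immediately before the matching sys.exit call gives Nagios both the status text and the return code.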

answered 2021-05-30T19:47:56.197