The below is driving me crazy.
I am posting the whole function for clarity. The aim is to log an error whenever a Linux mirror (MD) sub-device is faulty OR has been removed, with exactly one message each time. When a sub-device is in a faulted state but has not yet been completely removed from the mirror, the mdadm output always shows both a line with removed and a line with faulty. In that case I want to log only the faulty sub-device and not the removed one.
Once the device has been removed from the mirror, there is no faulty line any more, only removed. In that case I need to log the removed error.
For that I use a variable, $myfaulty, and parse the output of mdadm --detail $md_dev.
The problem is that the part of the code that comes first is evaluated second.
The below code comes first:
elsif ($subdevice_status =~ /faulty/) {
    _msg("ERROR: MD device $device status $device_status, sub-device status $subdevice_status " );
    _mylog('err', "ERROR: faulty MD $device $device_status: sub-device $subdevice ($subdevice_role): status $subdevice_status " );
    $myfaulty = 1;
}
And this comes second:
elsif ($subdevice_status =~ /removed/ && $myfaulty != 1 ) {
    _msg("ERROR: MD device $device status $device_status, sub-device status $subdevice_status " );
    printf("The removed part: $myfaulty, $device, $raid_type \n");
    _mylog('err', "ERROR: MD device $device status $device_status, sub-device $subdevice_status " );
}
What I observe is that the $myfaulty test in the second block is evaluated before the first block has set the variable.
Any ideas?
Thanks, George
Sub code
# checks all MD volumes it finds on the system
sub check_md() {
    my $MDADMCMD = "";
    my $device = "";
    my $subdevice = "";
    my $device_status = "";
    my $subdevice_status = "";
    my $raid_type = "";
    my $subdevice_role = "";
    my $raid_count = 0;
    my $lines = 0;
    my $myfaulty = 0;
    foreach(@mdadm_paths) {
        if( -e $_ ) { $MDADMCMD = $_; _debug("using mdadm in $MDADMCMD"); last; }
    }
    if($MDADMCMD eq "") { _msg("MD not configured on this system (mdadm not found) - "); return; }
    # "or" instead of "||": with "||" the die binds to the command string and never fires
    open MDSCAN, "-|", "$MDADMCMD --detail --scan --verbose 2>&1" or die "can't run: $!";
    while(<MDSCAN>) {
        $lines++;
        if($_ =~ /^ARRAY/) {
            $_ =~ /^ARRAY\s+(\S+)\s+level=(\S+)\s+/;
            $device = $1;
            $raid_type = $2;
            $raid_count++;
            _debug("found MD device $device, raid-level $raid_type");
            open MDDETAIL, "-|", "$MDADMCMD --detail $device" or die "can't run: $!";
            while(<MDDETAIL>) {
                print $_;   # print, not printf: the line is data, not a format string
                if($_ =~ /\s+State\s+:\s+(.+)$/) { # md device status
                    $device_status = $1;
                    chomp($device_status);
                    _debug("device status: $device_status");
                    if($device_status !~ /clean$/) {
                        change_status("CRITICAL");
                        $md_status = "CRITICAL";
                    }
                }
                # sub-device row with a device path, e.g. "faulty spare /dev/sdb1"
                if($_ =~ /^\s+(\d+|-)\s+(\d+|-)\s+(\d+|-)\s+(\d+|-)\s+(\w+)\s+(\w+)\s+(\S+)\s*$/) {
                    $subdevice_status = $5; $subdevice_role = $6; $subdevice = $7;
                    _debug("device:$device - subdevice:$subdevice - role:$subdevice_role - status:$subdevice_status");
                    if($subdevice_status !~ /active/) {
                        change_status("CRITICAL");
                        $md_status = "CRITICAL";
                        _msg("MD $device, sub-device $subdevice ($subdevice_role): status $subdevice_status - ");
                        if ($subdevice_role =~ /rebuilding/) {
                            _mylog('warning', "WARNING: MD $device $device_status: sub-device $subdevice ($subdevice_role): status $subdevice_status " );
                        }
                        elsif ($subdevice_status =~ /faulty/) {
                            _msg("ERROR: MD device $device status $device_status, sub-device status $subdevice_status " );
                            _mylog('err', "ERROR: faulty MD $device $device_status: sub-device $subdevice ($subdevice_role): status $subdevice_status " );
                            $myfaulty = 1;
                            printf("Faulty is here: $myfaulty, $subdevice_status, $raid_type \n");
                        }
                        elsif ($subdevice_status =~ /removed/ && $myfaulty != 1 ) {
                            _msg("ERROR: MD device $device status $device_status, sub-device status $subdevice_status " );
                            printf("The removed part: $myfaulty, $device, $raid_type \n");
                            _mylog('err', "ERROR: MD device $device status $device_status, sub-device $subdevice_status " );
                        }
                        else {
                            _msg("MD $device, sub-device $subdevice ($subdevice_role): status $subdevice_status - ");
                            _mylog('err', "ERROR: after faulty MD $device $device_status: sub-device $subdevice ($subdevice_role): status $subdevice_status " );
                        }
                    }
                    else { _verbosemsg("MD $device, sub-device $subdevice ($subdevice_role): status OK - "); }
                    # Need to also catch subdevices that have been removed and don't show up anymore.
                }
                # sub-device row with a state only and no device path, e.g. "removed"
                if ($_ =~ /^\s+(\d+|-)\s+(\d+|-)\s+(\d+|-)\s+(\d+|-)\s+(\w+)\s*$/) {
                    $subdevice_status = $5;
                    _debug("device:$device - subdevice:$subdevice - role:$subdevice_role - status:$subdevice_status");
                    printf("The removed part: $myfaulty, $device, $raid_type \n");
                    if ($subdevice_status =~ m/removed/ && $myfaulty != 1 ) {
                        _msg("ERROR: MD device $device status $device_status, sub-device status $subdevice_status " );
                        printf("The removed part: $myfaulty, $device, $raid_type \n");
                        _mylog('err', "ERROR: MD device $device status $device_status, sub-device $subdevice_status " );
                    }
                }
            } #while(<MDDETAIL>)
            close(MDDETAIL);
        }
    }
    close(MDSCAN);
    # no md devices found, but command output wasn't empty
    if($raid_count == 0 && $lines > 0) { _msg("MD status is UNKNOWN (can't get configuration info) - "); }
    elsif($raid_count == 0) { _msg("MD not configured on this system - "); }
    elsif($md_status eq "OK" && not defined $verbOutput) { _msg("MD Status is OK - "); }
}
mdadm --detail output
/dev/md0:
        Version : 0.90
  Creation Time : Mon Mar 4 12:53:19 2013
     Raid Level : raid1
     Array Size : 521984 (509.84 MiB 534.51 MB)
  Used Dev Size : 521984 (509.84 MiB 534.51 MB)
   Raid Devices : 2
  Total Devices : 2
Preferred Minor : 0
    Persistence : Superblock is persistent
    Update Time : Thu Jul 4 16:27:46 2013
          State : clean, degraded
 Active Devices : 1
Working Devices : 1
 Failed Devices : 1
  Spare Devices : 0
           UUID : 3a3bd078:31678889:9485a7cf:e1283d32
         Events : 0.438
    Number   Major   Minor   RaidDevice State
       0       0        0        0      removed
       1       8       33        1      active sync   /dev/sdc1
       2       8       17        -      faulty spare   /dev/sdb1
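
In the output above the removed row is printed before the faulty row, so a flag that is set while reading the faulty row cannot affect a removed row that has already been handled. One possible way to get a single message per array is to read all sub-device rows first and only decide what to log afterwards. The following is a minimal sketch of that idea, not a drop-in change to check_md(): the helper name classify_subdevices and the message strings are hypothetical, and it assumes that the state is the first word after the four numeric columns and the device path, if any, is the last word.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): takes the lines of
# one `mdadm --detail` run and returns the messages that should be logged.
sub classify_subdevices {
    my (@lines) = @_;
    my @faulty;              # device paths of faulty sub-devices
    my $removed_count = 0;   # number of rows whose state is "removed"

    # First pass: only record what each sub-device row says, log nothing yet.
    for my $line (@lines) {
        # Sub-device rows start with four columns that are a number or "-".
        next unless $line =~ /^\s+(?:\d+|-)\s+(?:\d+|-)\s+(?:\d+|-)\s+(?:\d+|-)\s+(\S.*?)\s*$/;
        my @fields = split ' ', $1;   # e.g. ("faulty", "spare", "/dev/sdb1")
        if ($fields[0] eq 'faulty') {
            push @faulty, $fields[-1];
        }
        elsif ($fields[0] eq 'removed') {
            $removed_count++;
        }
    }

    # Decide once, after every row has been seen: faulty wins over removed.
    return map("faulty sub-device $_", @faulty) if @faulty;
    return ("sub-device removed") if $removed_count;
    return ();
}

# Usage with the sub-device table from the sample output.
my @sample = (
    "    Number   Major   Minor   RaidDevice State\n",
    "       0       0        0        0      removed\n",
    "       1       8       33        1      active sync   /dev/sdc1\n",
    "       2       8       17        -      faulty spare   /dev/sdb1\n",
);
print "$_\n" for classify_subdevices(@sample);
```

With this shape the order of the rows no longer matters, and the same decision could be made inside check_md() after the while(<MDDETAIL>) loop, using a $myfaulty that is reset for each array.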