Monitoring software RAID in Debian GNU/Linux

Hari Sekhon hpsekhon at googlemail.com
Thu May 1 14:55:54 CEST 2008


Daniel,

   Could you please try the latest version, 0.7, from nagiosexchange and
let us know whether you still see the problem?

Thanks

-h


Daniel Guillermo Bareiro wrote:
> Hi Hari!
>
>   
>>> [ About plugin check_md_raid.pl ]
>>>       
>
>   
>>> I have noticed that when I remove a member from the RAID and run the
>>> plugin from the Nagios server, it shows the RAID in a degraded state for
>>> a moment, but soon afterwards the plugin shows the RAID in an OK state.
>>> I observe the same behaviour when adding the member back. Initially the
>>> script shows the RAID in a rebuilding state, but a new run of the plugin
>>> shows OK status before the rebuild has finished. What could the problem be?
>>>       
>
>   
>> Show us the output of the plugin with -vvv as well as the mdadm detail
>> output. This plugin basically calls mdadm to find the status of the
>> drives.
>>     
>
> Initial status:
>
> xenhost7:/usr/local/nagios/libexec/non-std# cat /proc/mdstat
> Personalities : [raid1]
> md2 : active raid1 sda3[0] sdb3[1]
>       292053568 blocks [2/2] [UU]
>
> md1 : active raid1 sdb2[1] sda2[0]
>       19534976 blocks [2/2] [UU]
>
> md0 : active raid1 sda1[0] sdb1[1]
>       979840 blocks [2/2] [UU]
>
> unused devices: <none>
>
> xenhost7:/usr/local/nagios/libexec/non-std# ./check_md_raid.pl -vvv
> finding all MD arrays via mdadm --detail --scan
> found array /dev/md0
> found array /dev/md1
> found array /dev/md2
> Now testing raid device "/dev/md0"
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:04 2008
>      Raid Level : raid1
>      Array Size : 979840 (957.04 MiB 1003.36 MB)
>     Device Size : 979840 (957.04 MiB 1003.36 MB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 11:42:28 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 956ff562:7ba15903:6068b800:4a673998
>          Events : 0.2
>
>     Number   Major   Minor   RaidDevice State
>        0       8        1        0      active sync   /dev/sda1
>        1       8       17        1      active sync   /dev/sdb1
> Now testing raid device "/dev/md1"
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:15 2008
>      Raid Level : raid1
>      Array Size : 19534976 (18.63 GiB 20.00 GB)
>     Device Size : 19534976 (18.63 GiB 20.00 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Apr 30 16:14:26 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 119bcdde:0ac351b2:d05e4382:e6def213
>          Events : 0.9440
>
>     Number   Major   Minor   RaidDevice State
>        0       8        2        0      active sync   /dev/sda2
>        1       8       18        1      active sync   /dev/sdb2
> Now testing raid device "/dev/md2"
> /dev/md2:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:25 2008
>      Raid Level : raid1
>      Array Size : 292053568 (278.52 GiB 299.06 GB)
>     Device Size : 292053568 (278.52 GiB 299.06 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 2
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 18:31:36 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 8d954d13:82dabd5b:096fe691:7819c2d2
>          Events : 0.8
>
>     Number   Major   Minor   RaidDevice State
>        0       8        3        0      active sync   /dev/sda3
>        1       8       19        1      active sync   /dev/sdb3
> RAID OK: All arrays OK
>
> xenhost7:/usr/local/nagios/libexec/non-std# mdadm --detail /dev/md{0..2}
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:04 2008
>      Raid Level : raid1
>      Array Size : 979840 (957.04 MiB 1003.36 MB)
>     Device Size : 979840 (957.04 MiB 1003.36 MB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 11:42:28 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 956ff562:7ba15903:6068b800:4a673998
>          Events : 0.2
>
>     Number   Major   Minor   RaidDevice State
>        0       8        1        0      active sync   /dev/sda1
>        1       8       17        1      active sync   /dev/sdb1
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:15 2008
>      Raid Level : raid1
>      Array Size : 19534976 (18.63 GiB 20.00 GB)
>     Device Size : 19534976 (18.63 GiB 20.00 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Apr 30 16:18:50 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 119bcdde:0ac351b2:d05e4382:e6def213
>          Events : 0.9440
>
>     Number   Major   Minor   RaidDevice State
>        0       8        2        0      active sync   /dev/sda2
>        1       8       18        1      active sync   /dev/sdb2
> /dev/md2:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:25 2008
>      Raid Level : raid1
>      Array Size : 292053568 (278.52 GiB 299.06 GB)
>     Device Size : 292053568 (278.52 GiB 299.06 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 2
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 18:31:36 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 8d954d13:82dabd5b:096fe691:7819c2d2
>          Events : 0.8
>
>     Number   Major   Minor   RaidDevice State
>        0       8        3        0      active sync   /dev/sda3
>        1       8       19        1      active sync   /dev/sdb3
>
> Removing /dev/sdb2 from /dev/md1:
>
> xenhost7:/usr/local/nagios/libexec/non-std# mdadm --manage /dev/md1 --fail /dev/sdb2; \
> > mdadm --manage /dev/md1 --remove /dev/sdb2; \
> > date; \
> > ./check_md_raid.pl -vvv
> mdadm: set /dev/sdb2 faulty in /dev/md1
> mdadm: hot removed /dev/sdb2
> Wed Apr 30 16:27:55 ART 2008
> finding all MD arrays via mdadm --detail --scan
> found array /dev/md0
> found array /dev/md1
> found array /dev/md2
> Now testing raid device "/dev/md0"
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:04 2008
>      Raid Level : raid1
>      Array Size : 979840 (957.04 MiB 1003.36 MB)
>     Device Size : 979840 (957.04 MiB 1003.36 MB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 11:42:28 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 956ff562:7ba15903:6068b800:4a673998
>          Events : 0.2
>
>     Number   Major   Minor   RaidDevice State
>        0       8        1        0      active sync   /dev/sda1
>        1       8       17        1      active sync   /dev/sdb1
> Now testing raid device "/dev/md1"
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:15 2008
>      Raid Level : raid1
>      Array Size : 19534976 (18.63 GiB 20.00 GB)
>     Device Size : 19534976 (18.63 GiB 20.00 GB)
>    Raid Devices : 2
>   Total Devices : 1
> Preferred Minor : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Apr 30 16:27:55 2008
>           State : active, degraded
>  Active Devices : 1
> Working Devices : 1
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 119bcdde:0ac351b2:d05e4382:e6def213
>          Events : 0.9445
>
>     Number   Major   Minor   RaidDevice State
>        0       8        2        0      active sync   /dev/sda2
>        1       0        0        1      removed
> RAID CRITICAL: Array MD1 is in state "active, degraded" (raid1)
> Now testing raid device "/dev/md2"
> /dev/md2:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:25 2008
>      Raid Level : raid1
>      Array Size : 292053568 (278.52 GiB 299.06 GB)
>     Device Size : 292053568 (278.52 GiB 299.06 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 2
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 18:31:36 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 8d954d13:82dabd5b:096fe691:7819c2d2
>          Events : 0.8
>
>     Number   Major   Minor   RaidDevice State
>        0       8        3        0      active sync   /dev/sda3
>        1       8       19        1      active sync   /dev/sdb3
>
> xenhost7:/usr/local/nagios/libexec/non-std# mdadm --detail /dev/md{0..2}
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:04 2008
>      Raid Level : raid1
>      Array Size : 979840 (957.04 MiB 1003.36 MB)
>     Device Size : 979840 (957.04 MiB 1003.36 MB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 11:42:28 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 956ff562:7ba15903:6068b800:4a673998
>          Events : 0.2
>
>     Number   Major   Minor   RaidDevice State
>        0       8        1        0      active sync   /dev/sda1
>        1       8       17        1      active sync   /dev/sdb1
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:15 2008
>      Raid Level : raid1
>      Array Size : 19534976 (18.63 GiB 20.00 GB)
>     Device Size : 19534976 (18.63 GiB 20.00 GB)
>    Raid Devices : 2
>   Total Devices : 1
> Preferred Minor : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Apr 30 16:29:01 2008
>           State : clean, degraded
>  Active Devices : 1
> Working Devices : 1
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 119bcdde:0ac351b2:d05e4382:e6def213
>          Events : 0.9458
>
>     Number   Major   Minor   RaidDevice State
>        0       8        2        0      active sync   /dev/sda2
>        1       0        0        1      removed
> /dev/md2:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:25 2008
>      Raid Level : raid1
>      Array Size : 292053568 (278.52 GiB 299.06 GB)
>     Device Size : 292053568 (278.52 GiB 299.06 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 2
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 18:31:36 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 8d954d13:82dabd5b:096fe691:7819c2d2
>          Events : 0.8
>
>     Number   Major   Minor   RaidDevice State
>        0       8        3        0      active sync   /dev/sda3
>        1       8       19        1      active sync   /dev/sdb3
>
> xenhost7:/usr/local/nagios/libexec/non-std# date
> Wed Apr 30 16:31:28 ART 2008
> xenhost7:/usr/local/nagios/libexec/non-std# ./check_md_raid.pl -vvv
> finding all MD arrays via mdadm --detail --scan
> found array /dev/md0
> found array /dev/md1
> found array /dev/md2
> Now testing raid device "/dev/md0"
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:04 2008
>      Raid Level : raid1
>      Array Size : 979840 (957.04 MiB 1003.36 MB)
>     Device Size : 979840 (957.04 MiB 1003.36 MB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 11:42:28 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 956ff562:7ba15903:6068b800:4a673998
>          Events : 0.2
>
>     Number   Major   Minor   RaidDevice State
>        0       8        1        0      active sync   /dev/sda1
>        1       8       17        1      active sync   /dev/sdb1
> Now testing raid device "/dev/md1"
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:15 2008
>      Raid Level : raid1
>      Array Size : 19534976 (18.63 GiB 20.00 GB)
>     Device Size : 19534976 (18.63 GiB 20.00 GB)
>    Raid Devices : 2
>   Total Devices : 1
> Preferred Minor : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Apr 30 16:31:27 2008
>           State : clean, degraded
>  Active Devices : 1
> Working Devices : 1
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 119bcdde:0ac351b2:d05e4382:e6def213
>          Events : 0.9490
>
>     Number   Major   Minor   RaidDevice State
>        0       8        2        0      active sync   /dev/sda2
>        1       0        0        1      removed
> Now testing raid device "/dev/md2"
> /dev/md2:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:25 2008
>      Raid Level : raid1
>      Array Size : 292053568 (278.52 GiB 299.06 GB)
>     Device Size : 292053568 (278.52 GiB 299.06 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 2
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 18:31:36 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 8d954d13:82dabd5b:096fe691:7819c2d2
>          Events : 0.8
>
>     Number   Major   Minor   RaidDevice State
>        0       8        3        0      active sync   /dev/sda3
>        1       8       19        1      active sync   /dev/sdb3
> RAID OK: All arrays OK
>
> It shows the CRITICAL state only momentarily.
>
> Rebuilding:
>
> xenhost7:/usr/local/nagios/libexec/non-std# date
> Wed Apr 30 16:44:15 ART 2008
> xenhost7:/usr/local/nagios/libexec/non-std# mdadm /dev/md1 --add /dev/sdb2; ./check_md_raid.pl -vvv
> mdadm: re-added /dev/sdb2
> finding all MD arrays via mdadm --detail --scan
> found array /dev/md0
> found array /dev/md1
> found array /dev/md2
> Now testing raid device "/dev/md0"
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:04 2008
>      Raid Level : raid1
>      Array Size : 979840 (957.04 MiB 1003.36 MB)
>     Device Size : 979840 (957.04 MiB 1003.36 MB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 11:42:28 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 956ff562:7ba15903:6068b800:4a673998
>          Events : 0.2
>
>     Number   Major   Minor   RaidDevice State
>        0       8        1        0      active sync   /dev/sda1
>        1       8       17        1      active sync   /dev/sdb1
> Now testing raid device "/dev/md1"
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:15 2008
>      Raid Level : raid1
>      Array Size : 19534976 (18.63 GiB 20.00 GB)
>     Device Size : 19534976 (18.63 GiB 20.00 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Apr 30 16:50:27 2008
>           State : clean, degraded, recovering
>  Active Devices : 1
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 1
>
>  Rebuild Status : 0% complete
>
>            UUID : 119bcdde:0ac351b2:d05e4382:e6def213
>          Events : 0.9626
>
>     Number   Major   Minor   RaidDevice State
>        0       8        2        0      active sync   /dev/sda2
>        1       8       18        1      spare rebuilding   /dev/sdb2
> Now testing raid device "/dev/md2"
> /dev/md2:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:25 2008
>      Raid Level : raid1
>      Array Size : 292053568 (278.52 GiB 299.06 GB)
>     Device Size : 292053568 (278.52 GiB 299.06 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 2
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 18:31:36 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 8d954d13:82dabd5b:096fe691:7819c2d2
>          Events : 0.8
>
>     Number   Major   Minor   RaidDevice State
>        0       8        3        0      active sync   /dev/sda3
>        1       8       19        1      active sync   /dev/sdb3
> RAID OK: All arrays OK
>
> xenhost7:/usr/local/nagios/libexec/non-std# mdadm --detail /dev/md{0..2}
> /dev/md0:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:04 2008
>      Raid Level : raid1
>      Array Size : 979840 (957.04 MiB 1003.36 MB)
>     Device Size : 979840 (957.04 MiB 1003.36 MB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 0
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 11:42:28 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 956ff562:7ba15903:6068b800:4a673998
>          Events : 0.2
>
>     Number   Major   Minor   RaidDevice State
>        0       8        1        0      active sync   /dev/sda1
>        1       8       17        1      active sync   /dev/sdb1
> /dev/md1:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:15 2008
>      Raid Level : raid1
>      Array Size : 19534976 (18.63 GiB 20.00 GB)
>     Device Size : 19534976 (18.63 GiB 20.00 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 1
>     Persistence : Superblock is persistent
>
>     Update Time : Wed Apr 30 16:53:51 2008
>           State : clean, degraded, recovering
>  Active Devices : 1
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 1
>
>  Rebuild Status : 74% complete
>
>            UUID : 119bcdde:0ac351b2:d05e4382:e6def213
>          Events : 0.9650
>
>     Number   Major   Minor   RaidDevice State
>        0       8        2        0      active sync   /dev/sda2
>        2       8       18        1      spare rebuilding   /dev/sdb2
> /dev/md2:
>         Version : 00.90.03
>   Creation Time : Thu Apr 10 17:46:25 2008
>      Raid Level : raid1
>      Array Size : 292053568 (278.52 GiB 299.06 GB)
>     Device Size : 292053568 (278.52 GiB 299.06 GB)
>    Raid Devices : 2
>   Total Devices : 2
> Preferred Minor : 2
>     Persistence : Superblock is persistent
>
>     Update Time : Mon Apr 28 18:31:36 2008
>           State : clean
>  Active Devices : 2
> Working Devices : 2
>  Failed Devices : 0
>   Spare Devices : 0
>
>            UUID : 8d954d13:82dabd5b:096fe691:7819c2d2
>          Events : 0.8
>
>     Number   Major   Minor   RaidDevice State
>        0       8        3        0      active sync   /dev/sda3
>        1       8       19        1      active sync   /dev/sdb3
>
> It shows the OK state while the rebuild is still in progress.
>
> Thanks in advance for your response.
>
> Regards,
> Daniel
>   
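
In the transcripts above, md1 is reported CRITICAL while its State line reads
"active, degraded", but the runs where it reads "clean, degraded" or
"clean, degraded, recovering" end with "RAID OK". One way to avoid that is to
split the State line on commas and test each flag, rather than comparing the
whole string. The following is only a minimal sketch of that idea in Python;
it is not the code of check_md_raid.pl. The names classify_state and worst are
hypothetical, and reporting a rebuilding array as WARNING is an assumed policy.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Matches the "State : ..." line of `mdadm --detail` output.
STATE_RE = re.compile(r"^[ \t]*State[ \t]*:[ \t]*(.+?)[ \t]*$", re.MULTILINE)

# Ranking used to pick the worst result across arrays (assumed ordering).
SEVERITY = [OK, UNKNOWN, WARNING, CRITICAL]


def classify_state(detail_text):
    """Map the State line of one array's `mdadm --detail` output to an exit code."""
    match = STATE_RE.search(detail_text)
    if match is None:
        return UNKNOWN
    # "clean, degraded, recovering" -> {"clean", "degraded", "recovering"}
    flags = {flag.strip().lower() for flag in match.group(1).split(",")}
    if "recovering" in flags:
        return WARNING   # assumed policy: a rebuild in progress is a warning
    if "degraded" in flags:
        return CRITICAL  # degraded, whether the array is "clean" or "active"
    if flags <= {"clean", "active"}:
        return OK
    return UNKNOWN       # any flag this sketch does not know about


def worst(codes):
    """Return the most severe exit code, or UNKNOWN if no arrays were checked."""
    return max(codes, key=SEVERITY.index, default=UNKNOWN)
```

With the three State values seen for md1 in this thread, classify_state gives
CRITICAL, CRITICAL and WARNING respectively, and worst() keeps a single bad
array from being hidden behind the healthy ones in the final summary line.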


-- 
Hari Sekhon

