Check_Openmanage not ignoring non-certified drives

Trond Hasle Amundsen t.h.amundsen at usit.uio.no
Mon Jan 14 11:02:27 CET 2013


"Bob The Junkie" <bob_the_junkie at hotmail.com> writes:

> I'm using Nagios and Check_Openmanage to keep an eye on some Dell R710 servers
> we've recently acquired, and I'm having problems trying to stop warnings about
> non-Dell-certified drives appearing in the alert log.
>
> I've separated out the different components on the servers to check into their
> own Nagios checks, so my config files appear as such:
>
> In nagios:
>
> SERVICES.CFG
>
> <host>
>
> Check_command check_dell_components!memory
>
> <host>
>
> Check_command check_dell_components!alertlog
>
> COMMANDS.CFG
>
> Command_name Check_dell_components
>
> Command_line check_nrpe -H $HOSTADDRESS$ -p 5666 -t 30 -c Check_OpenManage -a --only $ARG1$
>
> On each Server in nsclient.ini:
>
> Check_OpenManage = scripts\\check_openmanage.exe $ARG1$ --perfdata
>
> The problem I'm having is that in one of my checks that checks the health of
> the alert log, I'm getting a consistent warning message ("Alert log content: 0
> critical, 6 non-critical, 36 ok"). I've traced this down to the 6 non-Dell
> certified drives in the server, and I can indeed see within OMSA that the only
> 6 warnings all state "Controller event log: PD 04(e0x20/s4) is not a certified
> drive: Controller 0 (PERC 6/i Integrated)".
>
> So far, so good. Reading through the documentation I can see that
> Check_Openmanage includes a blacklisting option specifically for this event
> ("pdisk_cert - Suppress warning message about non-certified physical disk"),
> but no matter what I try, I can't seem to get Check_Openmanage to ignore these
> problems. An example of the command I'm running on the command line is:
>
> check_openmanage.exe -s -a -b pdisk_cert=all
>
> Which returns:
>
> WARNING: Alert log content: 0 critical, 6 non-critical, 36 ok
>
> Now I'm assuming the problem here is being caused by the Alert Log generating
> the errors, and not the physical disk directly causing the errors, which is why
> blacklisting the certificate problem on the physical disk isn't doing me any
> good.
>
> Which leads me onto my question: is there anything I can do to ignore these
> errors (and thus stop Nagios from complaining) apart from excluding the alert
> log when I do my checks?

Hi,

Your analysis is correct. The check_openmanage plugin's check of the log
content is limited to counting the number of critical, warning and ok
messages. It doesn't do any log parsing. The intended usage of the log
checking is as a precaution, if you're concerned about missing some
temporary problem. After all, the plugin does active checking and will
only report the state of the hardware right now.

In your case I think that the easiest solution would be to stop using
the log checking with check_openmanage, and either use a fully fledged
log parsing plugin (such as check_logfiles) or write your own simple
plugin where you just filter out the certificate stuff.
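
To illustrate the "write your own simple plugin" option, here is a minimal
Python sketch. It assumes the alert log is available as text made of
blank-line-separated records with "Key : Value" lines, including "Severity"
and "Description" fields, and that severities are spelled "Ok",
"Non-Critical" and "Critical". That layout, and the function names
parse_records and check, are assumptions for illustration, not something
check_openmanage provides.

```python
"""Sketch of an alert-log filter plugin; the record layout is an assumption."""

# Substring marking the messages to ignore (taken from the warning text
# quoted in this thread).
IGNORE = "is not a certified drive"

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_records(text):
    """Split log text into dicts, assuming blank-line-separated records
    made of 'Key : Value' lines (assumed layout, not verified)."""
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record, if there is one.
            if current:
                records.append(current)
                current = {}
            continue
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip().lower()] = value.strip()
    if current:
        records.append(current)
    return records


def check(text, ignore=IGNORE):
    """Count entries by severity, skipping those whose description
    contains `ignore`. Returns (exit_code, message)."""
    counts = {"critical": 0, "non-critical": 0, "ok": 0}
    for rec in parse_records(text):
        if ignore in rec.get("description", ""):
            continue
        severity = rec.get("severity", "").lower()
        if severity in counts:
            counts[severity] += 1
    if counts["critical"]:
        code, label = CRITICAL, "CRITICAL"
    elif counts["non-critical"]:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    message = "%s: Alert log content: %d critical, %d non-critical, %d ok" % (
        label, counts["critical"], counts["non-critical"], counts["ok"])
    return code, message
```

A small wrapper could pass check() the alert log text (for example the
output of OMSA's "omreport system alertlog", if its layout matches the
assumption above), print the returned message and exit with the returned
code, so that Nagios only sees the entries that remain after filtering.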

Regards,
-- 
Trond H. Amundsen <t.h.amundsen at usit.uio.no>
Center for Information Technology Services, University of Oslo

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users