check_snmp_load.pl best linux practices

Robert Eden rmeden at gmail.com
Wed Mar 9 21:33:13 CET 2011


I'm currently experimenting with using check_snmp_load.pl to alarm on system overload.

Monitoring CPU usage is giving me a lot of false alarms because the readings are instantaneous.

I'm getting good results by using the NETSL option to report load averages. I'm setting '-c 99,4,10' to effectively ignore the 1-minute value and alarm on the 5- and 15-minute values.
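
To illustrate why a threshold like 99 on the 1-minute slot suppresses spike alarms, here is a minimal Python sketch. It is not the plugin's code: the function name `exceeded` is hypothetical, and the strict "greater than" comparison is an assumption about how the thresholds are applied.

```python
def exceeded(loads, thresholds):
    """Return the positions (0 = 1 min, 1 = 5 min, 2 = 15 min) whose
    load average is above its threshold.

    Assumption: a strict '>' comparison; the real plugin may differ.
    """
    return [i for i, (load, limit) in enumerate(zip(loads, thresholds))
            if load > limit]


# Critical thresholds from the '-c 99,4,10' setting above.
critical = (99, 4, 10)

# A short 1-minute spike does not trip anything, because 12.0 < 99.
print(exceeded((12.0, 2.5, 1.0), critical))

# Sustained load shows up in the 5-minute value and does trip.
print(exceeded((6.0, 4.5, 3.0), critical))
```

The first call reports nothing and the second reports only the 5-minute slot, which is the behaviour I'm after.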

Unfortunately, unlike CPU percentages, load thresholds should be scaled by the number of processors. The NETSL option doesn't do that.

One option is to have a series of service commands based on the number of processors, but I'm considering writing a new mode that will use the "STAND" option to get the number of CPUs and then use that as a multiplication factor for alarms.
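
The multiplication-factor idea could look roughly like the Python sketch below. It is only an illustration under stated assumptions: the names `scale_thresholds` and `format_arg` are hypothetical, the per-CPU values are made-up examples, and the CPU count is passed in by hand rather than read over SNMP.

```python
def scale_thresholds(per_cpu, cpu_count):
    """Multiply per-CPU load thresholds by the number of processors."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(t * cpu_count for t in per_cpu)


def format_arg(thresholds):
    """Render thresholds as a comma-separated string, e.g. for a -c argument."""
    return ",".join(f"{t:g}" for t in thresholds)


# Hypothetical per-CPU critical thresholds for the 1, 5 and 15 minute averages.
# The 1-minute value stays deliberately huge so it is effectively ignored.
per_cpu_critical = (99, 1.0, 2.5)

# On a 4-CPU host this prints 396,4,10
print(format_arg(scale_thresholds(per_cpu_critical, 4)))
```

With this approach a single service definition could cover hosts with different CPU counts, since the per-CPU values stay the same everywhere.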

Does that make sense? Surely others have run into this problem. How do you alarm on excessive load without causing lots of false alarms?

Robert





_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




