check_load gone crazy

Marc Powell lists at xodus.org
Wed Sep 8 13:53:29 CEST 2010


On Sep 7, 2010, at 7:39 PM, Mike Chesnut wrote:

> I'm wondering if this is a known bug, and/or if anybody else has seen 
> similar behavior...
> 
> We're using Nagios 3.2.1 on Linux, monitoring several Linux systems.  We 
> run the check_load probe against every system.  Occasionally (at 
> non-regular intervals), Nagios will freak out and alert on the load 
> average of many (sometimes *all*) systems.  When this occurs, it reports 
> the *same* load averages for each system, and the weirdest part is that 
> these load averages are completely bogus.

> Any ideas for what I can do to get to the bottom of why this happens?

What transport mechanism are you using to run check_load on the remote systems? check_load is not 'network aware': it reports the load of whatever machine it runs on, so the binary must be installed on each remote machine and run there via some transport (check_nrpe, check_by_ssh, etc.). It seems to me that you are not running it on the remote machines but are instead monitoring the load of your Nagios machine multiple times. If you are unsure, can you show a couple of example service{} and corresponding command{} definitions?
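
For comparison, here is a rough sketch of what a remote load check through check_nrpe commonly looks like. It assumes NRPE is installed on the monitored host; the host name "web01", the "generic-service" template, the plugin path, and the thresholds are placeholders rather than anything taken from your setup:

```
# On the Nagios server: this command runs locally, but only to ask the
# remote NRPE daemon to execute the command named in $ARG1$.
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
        }

# On the Nagios server: "web01" is a hypothetical host.
define service{
        use                     generic-service
        host_name               web01
        service_description     Load
        check_command           check_nrpe!check_load
        }

# On each monitored host, in nrpe.cfg (path and thresholds are assumptions):
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
```

If your command_line instead calls $USER1$/check_load directly, with no check_nrpe or check_by_ssh in front of it, every service using that command measures the Nagios server itself, whichever host the service is attached to.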

--
Marc
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




