3.0b5: External commands are not turned into passive checks after a while

Andreas Ericsson ae at op5.se
Sun Oct 14 00:42:14 CEST 2007


When starting a new thread, please don't do that as a reply to a
previous message. It fscks up threaded reading of this mailing
list enormously.

Some other day I'll take the time to actually read the message.

Steffen Poulsen wrote:
> Hi,
> 
> We are experiencing a problem with Nagios commands not being processed
> correctly after ~30 hours of uptime on our master server. This server
> _only_ receives check results through NSCA; it does no active checking.
> 
> The server receives 5276 external commands / 5 mins, according to its
> performance data. For the first 30 hours of uptime, the Passive Service
> Checks stats reflect this as well.
> 
> But at some point command processing stops: all the Nagios server
> logs is the external command, and there is no log line for the
> passive check being handled.
> 
> A few notes on this condition: Using the "top" utility we notice memory
> usage bumps up and down between using practically none and all available
> memory (one cycle every second or so):
> 
>   3169 nagios     1   0    0  631M  559M cpu/15   0:06  4.19% nagios
> 
> (In the normal scenario, i.e. the first 30 hours, Nagios uses ~25 MB of memory):
> 
>   4481 nagios     2   0    0   23M   20M cpu/0    1:34  1.60% nagios
> 
> When this condition appears, it is not enough to stop and start Nagios -
> we also have to clean out the checkresults directory.
> 
> It contains files like this:
> 
> -rw-------   1 nagios   nagios    439458 Oct 13 21:34 cywyyBl
> 
> (Some are quite large, up to 1 MB).
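
To put numbers on the backlog in that directory before cleaning it out,
something along the following lines may help. It is only a minimal sketch:
the function name summarize_spool is made up for illustration, and the
directory to pass in is an assumption (presumably whatever check_result_path
points to in your nagios.cfg). It only reads file sizes and deletes nothing.

```python
import os


def summarize_spool(path):
    """Summarize the regular files directly inside `path`.

    Returns (file_count, total_bytes, (largest_name, largest_bytes)).
    Subdirectories and symlinks are skipped, nothing is modified.
    """
    count = 0
    total = 0
    largest = ("", 0)
    for entry in os.scandir(path):
        # Only count plain files; do not follow symlinks.
        if not entry.is_file(follow_symlinks=False):
            continue
        size = entry.stat(follow_symlinks=False).st_size
        count += 1
        total += size
        if size > largest[1]:
            largest = (entry.name, size)
    return count, total, largest
```

Calling summarize_spool() on the checkresults directory every minute or so
and logging the result would show whether the file count and sizes grow
steadily over the 30 hours or jump suddenly when the condition appears.
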
> 
> Other side effects of this condition:
> 
>    * Nagios doesn't notice freshness checks that go stale (it
> recognizes stale checks after it starts again)
>    * Nagios doesn't update status.dat (CGIs show stale information)
>    * As checks are not recognized, no performance data or other
> check-related stuff is processed
> 
> This is a Sun T1000 with Solaris 10, Nagios 3.0b5 compiled with gcc.
> 
> Any ideas appreciated.
> 
> Best regards,
> Steffen Poulsen
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?  Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> Nagios-devel mailing list
> Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel


-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
