Nagios processes hang

Andreas Ericsson ae at op5.se
Mon Sep 17 00:45:47 CEST 2007


Marantz, Roy wrote:
> I'm running Nagios 2.8 with around 1400 hosts and around 14000 services
> defined.  I have about 700 active services and the rest come in via nsca.
> 
> My problem has a few symptoms:
> 1) I collect defunct Nagios processes, around 300 per day
> 2) the command pipe stops getting read so nsca is dumping data to its
> dump file
> 3) active service checks have very long (hours) latency
> 
> These all sound like the same problem to me, but I don't know how to
> diagnose it.  Any help would be appreciated.  I have run nagios -s and
> it doesn't suggest anything.  I'm using check_fping for host checks and
> my remaining active service checks.  Attached is the output from nagios
> -v and my nagios.cfg.  Thanks in advance for any help.


The trouble is the FIFO, whose buffer is small and fixed (4096 bytes on
Linux kernels before 2.6.11, 64 KiB by default on later ones), meaning it
quickly becomes a bottleneck. Nagios tries to empty it as soon
as there's data available on it, but fails to keep up with the data-spam
from nsca.
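
To see how much the pipe on a given box actually holds, something like the
following rough sketch should do. It fills an anonymous pipe with
non-blocking writes until the kernel refuses more. It assumes that anonymous
pipes and FIFOs share the same buffer size on your system (true on Linux as
far as I know); measure_pipe_capacity is just a name made up for this
example.

```python
import os


def measure_pipe_capacity(chunk_size: int = 512) -> int:
    """Return how many bytes fit in an unread pipe before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode makes a full pipe raise instead of hanging.
        os.set_blocking(write_fd, False)
        chunk = b"x" * chunk_size
        total = 0
        while True:
            try:
                # Writes of at most PIPE_BUF bytes are all-or-nothing.
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # Nobody is reading, so the buffer is now full.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(f"pipe capacity: {measure_pipe_capacity()} bytes")
```

Once the buffer is full, every writer either blocks or fails until the
reader drains it, which is when nsca starts falling back to its dump file.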

You could try re-nicing the nagios process, which might make it capable
of staying ahead of nsca.
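
Re-nicing is normally done with the renice command, but for illustration
here is a small sketch of the same thing through the setpriority() call.
The renice helper is hypothetical, and you would have to supply the real
PID of the nagios process yourself. Lowering the nice value (raising the
priority) normally requires root.

```python
import os


def renice(pid: int, niceness: int) -> int:
    """Set the nice value of process `pid` and return the value now in effect."""
    os.setpriority(os.PRIO_PROCESS, pid, niceness)
    return os.getpriority(os.PRIO_PROCESS, pid)


if __name__ == "__main__":
    # PID 0 means "the calling process". Re-applying the current value is a
    # harmless no-op that works without privileges; for nagios you would
    # pass its real PID and a lower (possibly negative) nice value as root.
    current = os.getpriority(os.PRIO_PROCESS, 0)
    print("niceness:", renice(0, current))
```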

Otherwise you could try modifying the FIFO size and recompiling the kernel.

Alternatively, patch nagios and nsca to use a unix socket and use
setsockopt() to up the read/write buffer on that socket to 256 KiB.
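
The setsockopt() part is the easy bit. A minimal sketch, in Python rather
than the C that nagios and nsca are written in, and using a socketpair in
place of a real listening socket: make_buffered_pair and BUFFER_BYTES are
names invented for this example, not anything that exists in nagios or
nsca. The kernel may cap or round the size you ask for, so read it back
rather than assuming you got 256 KiB.

```python
import socket

BUFFER_BYTES = 256 * 1024  # 256 KiB, the size suggested in the text


def make_buffered_pair(size: int = BUFFER_BYTES):
    """Create a connected pair of unix stream sockets with larger buffers."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    for sock in (a, b):
        # Request bigger send and receive buffers on both ends.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    return a, b


if __name__ == "__main__":
    writer, reader = make_buffered_pair()
    # Report what the kernel actually granted, which may differ from the request.
    print("send buffer:", writer.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    print("recv buffer:", reader.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    writer.close()
    reader.close()
```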

The fourth, and possibly tricksiest alternative, is to rewrite nsca as a
neb-module, have it run in a separate thread and update nagios' status
data directly. This last method will scale best but is by far the most
difficult.

Good luck

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231
