NSCA in standalone single-process daemon mode

Andreas Ericsson ae at op5.se
Wed May 3 10:19:25 CEST 2006


Thomas Guyot-Sionnest wrote:
> Hi list,
> 
> I'm running a big Nagios monitoring system with about a hundred remote
> passive checks reporting through NSCA. Lately, when I added more passive
> checks, I noticed that the number of "Failed" checks (no results received)
> increased (for most of the checks it's impossible to say whether they ran
> or not).
> 
> I'm currently running NSCA in inetd mode using D. J. Bernstein's tcpserver
> program. Since most checks are run by Vixie Cron, and therefore run at
> the exact same time, my two guesses were that either:
> 
> 1. I'm jamming up the monitoring server for more than 10 seconds with all
> the checks.
> 
> Or
> 
> 2. All NSCA processes writing to the same command file trigger some obscure
> OS or Nagios bug.
> 
> I have reasons to think it's not #1, so to test #2 I wanted to run NSCA in
> single-process daemon mode. When I do this it gets the first passive check
> correctly and send_nsca fails on all the other checks. Running strace, I see
> that it blocks on the poll syscall after processing the first check, and
> send_nsca times out after 10 seconds.
> 
> I'm running Nagios 2.0b3 on Slackware 10.1.0, dual Athlon MP with 4 GB of
> RAM, NSCA version 2.6, official and unpatched.
> 
> Compiled with Gcc:
> Configured with: ../gcc-3.3.4/configure --prefix=/usr --enable-shared
> --enable-threads=posix --enable-__cxa_atexit --disable-checking
> --with-gnu-ld --verbose --target=i486-slackware-linux
> --host=i486-slackware-linux
> Thread model: posix
> gcc version 3.3.4
> 
> Any thoughts on what's going wrong here?
> 

Nagios' command file is filling up. It is a named pipe, and the pipe can
only hold 4096 bytes (a hard OS limit on most unix-like systems), so with
100+ checks going off at the same time you're lucky to get half of them
written to the pipe before the writers time out.
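
If you want to see how much a pipe on your own kernel holds before a
writer stalls, something along these lines should do. It is only a rough
sketch: it fills an anonymous pipe rather than the real Nagios command
file, the function name pipe_capacity and the 512-byte chunk size are
made up for the example, and the number you get depends on the kernel
you run it on.

```python
import os


def pipe_capacity(chunk_size=512):
    """Count how many bytes fit into an empty pipe before a write would block.

    Hypothetical helper for illustration; it uses an anonymous pipe,
    not the Nagios command file.
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode makes a full pipe raise an error instead of hanging.
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    total = 0
    try:
        while True:
            try:
                # Nobody reads from read_fd, so the pipe eventually fills up.
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # The pipe is full: a blocking writer would stall here.
                break
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return total


if __name__ == "__main__":
    print("pipe capacity:", pipe_capacity(), "bytes")
```

Once that many bytes are queued and nothing has read them yet, every
further writer either blocks or fails, which is what the senders run
into when they all fire at the same moment.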

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

