How to ignore "Socket timeout" notifications, anyone?

MT Morales nagios at pensol.net
Wed Jan 14 17:38:07 CET 2004


Here is some info about my setup:
nagios 1.1
kernel 2.4.20-20.8smp, rh8.0 (recompiled from source)
~2000 active checks
500 hosts

Every time I make configuration changes and reload nagios (i.e. using 
the reload option), active checks to hosts with 10-20 services 
collide (i.e. the nagios scheduler stops doing smart checks) and multiple 
checks are executed at the exact same second against the same host. 
(Yep, many of our hosts can handle only a very limited number of 
simultaneous connections on the same tcp port, and it would be too much 
work to fix this issue on every one of them at this point.)

And yes, we have tweaked all the parameters we could possibly tweak in 
nagios.cfg. We have even recompiled the kernel in rh8.0 to solve the 
pipe-wait issue by making the pipe size larger, which has helped a lot.

This problem clears on the next round of checks but, in the meantime, 
our on-duty operator gets a bunch of CRITICAL alarm notifications 
about "Socket timeout error.." on services.

I was thinking of a couple of workarounds:
-Modify the check plugin to return a different status code for socket 
timeouts
-Modify the notify command(s) to filter out alerts with "Socket 
timeout" messages.
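
To make the two workarounds concrete, here is a rough sketch in Python. 
It is only an illustration under assumptions: the function names are 
made up for this example, the "socket timeout" marker text is a guess at 
what the plugin prints, and the wrapper is not part of nagios or of the 
official plugins. It relies only on the standard plugin exit codes 
(0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import subprocess
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumption: the plugin output contains this text on a socket timeout.
TIMEOUT_MARKER = "socket timeout"


def is_socket_timeout(output):
    """Return True if the plugin output looks like a socket timeout."""
    return TIMEOUT_MARKER in output.lower()


def remap_status(exit_code, output, timeout_status=UNKNOWN):
    """Workaround 1: report socket timeouts with a different status code.

    Only CRITICAL results that mention a socket timeout are remapped;
    every other result is passed through unchanged.
    """
    if exit_code == CRITICAL and is_socket_timeout(output):
        return timeout_status
    return exit_code


def should_notify(output):
    """Workaround 2: decide in the notify command whether to send the alert."""
    return not is_socket_timeout(output)


def run_wrapped_plugin(argv):
    """Run the real plugin (argv) and return (remapped exit code, output)."""
    proc = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    output = proc.stdout.strip()
    return remap_status(proc.returncode, output), output


def main(argv):
    """Hypothetical entry point: wrapper.py /path/to/plugin [plugin args...]"""
    if len(argv) < 2:
        print("usage: wrapper.py /path/to/plugin [plugin args...]")
        return UNKNOWN
    code, output = run_wrapped_plugin(argv[1:])
    print(output)
    return code
```

To use it as a check command, one would call sys.exit(main(sys.argv)) at 
the bottom of the script and point the command definition at the wrapper 
instead of the plugin. Remapping to UNKNOWN only silences the operator if 
the "u" option is left out of the service notification_options; the 
should_notify() variant would instead sit in front of the notify command 
and skip sending when it returns False.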

So before I try to reinvent the wheel, could someone suggest a 
simpler/cleaner approach to the problem?
TIA
-tomas











