(Service Check Timed Out) returns critical

Michael Markstaller mm at elabnet.de
Mon Nov 18 12:48:50 CET 2002


Hi,

I'm using nagios to check approx. 100 hosts and 350 services, and it has
been working fine so far.
Is it possible to tell nagios to report "unknown" instead of "critical"
if a service check times out? I set "service_check_timeout" in
nagios.cfg to 30 so that nagios kills non-responsive service checks
sooner when the load is high due to many unreachable hosts (see below),
but this resulted in dozens of critical alerts due to
(Service Check Timed Out) with check_snmp. I'd prefer to get "unknown"
for any plugin timeout that does not result in a retrieved value. Or is
this problem located within check_snmp?
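
To make the idea concrete, here is a minimal sketch of a wrapper that
runs a plugin with its own, shorter timeout and exits with the plugin
return code for UNKNOWN (3) when that timeout is hit. The names
run_check and main, the wrapper itself and its argument order are all
hypothetical; this is not part of nagios or of check_snmp, and it
assumes the wrapper's timeout is set lower than "service_check_timeout"
so the wrapper gets to answer before nagios kills the check.

```python
#!/usr/bin/env python3
"""Hypothetical wrapper: run a plugin and report UNKNOWN if it times out."""
import subprocess
import sys

# Standard plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def run_check(command, timeout_seconds):
    """Run `command` (a list of arguments); return (exit_code, output).

    If the plugin does not finish within `timeout_seconds` it is killed
    and UNKNOWN is returned, so no value is reported as CRITICAL merely
    because nothing could be retrieved.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return UNKNOWN, "UNKNOWN - plugin timed out after %ss" % timeout_seconds
    except OSError as exc:
        # The plugin binary is missing or not executable.
        return UNKNOWN, "UNKNOWN - could not run plugin: %s" % exc

    code = result.returncode
    if code not in (OK, WARNING, CRITICAL, UNKNOWN):
        # Anything outside the four known codes is treated as UNKNOWN.
        code = UNKNOWN
    return code, result.stdout.strip()


def main(argv):
    """Assumed usage: wrapper.py <timeout_seconds> <plugin> [plugin args...]"""
    if len(argv) < 3:
        print("UNKNOWN - usage: wrapper.py <timeout_seconds> <plugin> [args...]")
        return UNKNOWN
    try:
        timeout_seconds = float(argv[1])
    except ValueError:
        print("UNKNOWN - timeout must be a number, got %r" % argv[1])
        return UNKNOWN
    code, output = run_check(argv[2:], timeout_seconds)
    print(output)
    return code

# To use this as a plugin, end the file with: sys.exit(main(sys.argv))
```

The check command would then call the wrapper with, say, a timeout of
20 followed by the usual check_snmp command line, leaving
"service_check_timeout" at 30 as a last resort.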

The hosts are mostly routers and quite distributed, so I have set up
dependencies for all hosts in order to get a notification only for the
host that actually failed, but this doesn't work as well as I think it
should. If, for instance, the first router on which all others depend
fails, nagios gets bogged down with a few hundred processes for pending
checks and gives me many false alerts instead of the one for the host
causing the problem.
Does anybody have some general guidelines for getting useful alerts when
something "core" fails (like the switch the nagios server is attached
to, or DNS, etc.)?
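
What I would like to end up with can be sketched roughly as follows:
given which host sits behind which, only the topmost failing host is
treated as the real problem and everything behind it is merely
unreachable. This is a simplified illustration, not how nagios works
internally; the function name classify, the host names and the topology
are made up for the example.

```python
def classify(parents, failing):
    """Split failing hosts into root causes and their consequences.

    parents: dict mapping a host name to the list of hosts it sits behind
             (an empty or missing list means directly attached).
    failing: set of host names whose checks currently fail.

    A failing host is "DOWN" if it has no parents or at least one parent
    still responds; it is "UNREACHABLE" if every parent is failing too.
    """
    status = {}
    for host in failing:
        host_parents = parents.get(host, [])
        if not host_parents or any(p not in failing for p in host_parents):
            status[host] = "DOWN"
        else:
            status[host] = "UNREACHABLE"
    return status


# Hypothetical topology: everything hangs off one core router.
parents = {
    "core": [],
    "router1": ["core"],
    "router2": ["core"],
    "router3": ["router1"],
}

# The core router fails, so every host behind it fails its check as well.
status = classify(parents, {"core", "router1", "router2", "router3"})
to_notify = sorted(h for h, s in status.items() if s == "DOWN")
print(to_notify)
```

In this example only "core" should produce a notification; the three
routers behind it are consequences of that one failure.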

Thanks,

Michael Markstaller

Elaborated Networks GmbH 
www.elabnet.de 
Lise-Meitner-Str. 1, D-85662 Hohenbrunn, Germany 
fon: +49-8102-8951-60, fax: +49-8102-8951-80  

