timeouts when using secondary dns

Az az at whoever.org
Tue Nov 7 11:24:55 CET 2006


stucky wrote:
> I use the check_by_ssh plugin for most of my stuff and I noticed that 
> if the primary nameserver is unavailable nagios starts freaking out.
> All of a sudden all plugins time out. I tested it using the 'host' 
> command and it only takes about 1 second longer to lookup hosts using 
> the secondary nameserver.
> The default timeout for check_by_ssh is 10 seconds. I cranked it up to 
> 30 and still I get timeouts. I'm not sure I understand that one.
> Has anyone else seen this?

We had a similar issue: our primary DNS was doing strange things, and 
it quite often took 5 or even 10 seconds to perform a DNS lookup. 
What we were seeing was 70% of service checks (and subsequently host 
checks) failing by timing out. The key was that the delays were 
multiples of 5 seconds. The resolver timeout on, say, RHEL3 is based on 
RES_TIMEOUT in resolv.h, which was 5 seconds.
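
If you want to check whether you are hitting the same thing, a rough way is to time lookups through the system resolver and see whether the slow ones land near multiples of 5 seconds. The sketch below is only an illustration: the function name time_lookup is made up for this example, and it assumes Python is available on the Nagios host. It goes through getaddrinfo(), so it should follow resolv.conf the way most plugins do, whereas 'host' uses its own lookup code and may not show the same delays.

```python
import socket
import sys
import time


def time_lookup(hostname):
    """Return (elapsed_seconds, resolved_ok) for one lookup via the system resolver."""
    start = time.monotonic()
    try:
        socket.getaddrinfo(hostname, None)
        ok = True
    except socket.gaierror:
        # Lookup failed (or timed out inside the resolver); still report the time.
        ok = False
    return time.monotonic() - start, ok


if __name__ == "__main__":
    # Hostnames come from the command line; "localhost" is just a fallback.
    for name in sys.argv[1:] or ["localhost"]:
        elapsed, ok = time_lookup(name)
        print(f"{name}: {elapsed:.2f}s {'ok' if ok else 'FAILED'}")
```

Run it against a few of the hosts you monitor while the primary nameserver is down and compare the times with your plugin timeout.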

We added the following to our resolv.conf, and found the problems went away:

    options timeout:2 rotate

This sets the time the resolver waits for a reply from a nameserver to 
2 seconds before it moves on to the next one, and tells the resolver to 
rotate through your 'nameserver' entries rather than always hitting #1, 
then #2, etc.
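
To confirm that a given box actually has these options set, something like the following can help. It is a minimal sketch, not a full resolv.conf parser: the function name resolver_options is made up, the nameserver addresses in SAMPLE are placeholders from the documentation range, and only 'nameserver' and 'options' lines are looked at.

```python
def resolver_options(text):
    """Collect nameservers and options from resolv.conf-style text."""
    nameservers, options = [], {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments (# or ; at the start of the line).
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) > 1:
            nameservers.append(fields[1])
        elif fields[0] == "options":
            for opt in fields[1:]:
                # "timeout:2" becomes {"timeout": 2}; bare flags such as "rotate" become True.
                name, _, value = opt.partition(":")
                options[name] = int(value) if value.isdigit() else True
    return nameservers, options


# Placeholder content; the addresses are hypothetical.
SAMPLE = """\
nameserver 192.0.2.1
nameserver 192.0.2.2
options timeout:2 rotate
"""

if __name__ == "__main__":
    servers, opts = resolver_options(SAMPLE)
    print(servers, opts)
```

To check a real host, pass in open("/etc/resolv.conf").read() instead of SAMPLE.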

Cheers.

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users