How can I control host checks after service failure

Brian Snead BSnead at infosysnetworks.com
Tue Jul 15 22:51:46 CEST 2003


I am polling many services across a WAN connection with the check interval set to 3 minutes. When a service check fails, Nagios immediately fires off sequential host checks to see whether the host is alive. The problem is that I want to give the host several minutes before it is declared down, so that we a) isolate the fault to any parents that have failed and b) don't fire off false positives because of network load.
While reading about how Nagios handles service check scheduling, I came across this note -- "Also of note - when Nagios is check the status of a host, it holds off on doing anything else (executing new service checks, processing other service check results, etc)."

So, if I want three minutes before the host is reported as down and the check-host-alive command times out in 10 seconds, I can set max_check_attempts to 18 (18 x 10 sec = 180 sec). But then Nagios will not process any other service checks for those 3 minutes, so their results will become stale.
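
To make that arithmetic concrete, here is a rough sketch. The function names are made up for illustration and are not Nagios settings; only max_check_attempts is a real directive. It assumes the worst case, where every host check attempt runs to its full timeout and the retries run back to back.

```python
import math


def attempts_for_window(window_seconds, check_timeout_seconds):
    """Smallest max_check_attempts whose worst-case run time covers the window.

    Hypothetical helper: assumes each attempt takes the full timeout.
    """
    return math.ceil(window_seconds / check_timeout_seconds)


def worst_case_blocking_seconds(max_check_attempts, check_timeout_seconds):
    """Longest time the host check sequence could hold up other processing."""
    return max_check_attempts * check_timeout_seconds


# Values from the scenario above: a 3-minute window and a 10-second timeout.
attempts = attempts_for_window(180, 10)
blocked = worst_case_blocking_seconds(attempts, 10)
print(attempts, blocked)
```

With a 3-minute service check interval, a 180-second block spans one full interval, which is why the other service checks would go stale.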

If you can think of some alternatives, please let me know.

Thanks
Brian.

