Fix for host dependency checks

Holger Weiss holger at CIS.FU-Berlin.DE
Fri Feb 24 19:04:32 CET 2006


* Holger Weiss <holger at CIS.FU-Berlin.DE> [2006-01-30 16:54]:
> There is a timing problem in the host[*] dependency check logic: If host
> B is configured to be dependent on host A being up and host A goes down,
> the dependency will only fail if host A "incidentally" was checked
> _prior_ to host B after going down.  Hence, the host dependency logic
> will sometimes work and sometimes not.  I'd therefore suggest
> explicitly (re-)checking host A during the dependency check for host B,
> as the attached patch does.

Okay, this introduces a new problem: If host B is checked immediately
before both hosts recover, and host A is checked (during the dependency
check for B) just after the recovery, host A is seen as up and the
dependency won't fail.  Hence, notifications for host B won't be
suppressed (been there, got the t-shirt).

Next try: The attached patch lets the dependency fail if either the
current or the previous (hard) state of A matches the failure criteria.
AFAICS, this should reliably suppress notifications for host B if the
dependency fails.
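
For illustration, the rule can be sketched in isolation roughly as
follows.  The names used here (dep_criteria, state_fails,
dependency_fails) are hypothetical stand-ins and not the actual Nagios
structures; the pending case is left out.  The real change is the
attached patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the Nagios types; only what the sketch needs. */
enum host_state { HOST_UP = 0, HOST_DOWN = 1, HOST_UNREACHABLE = 2 };

struct dep_criteria {
	bool fail_on_up;
	bool fail_on_down;
	bool fail_on_unreachable;
};

/* Does a single state of host A match the failure criteria? */
static bool state_fails(enum host_state s, const struct dep_criteria *c)
{
	switch (s) {
	case HOST_UP:          return c->fail_on_up;
	case HOST_DOWN:        return c->fail_on_down;
	case HOST_UNREACHABLE: return c->fail_on_unreachable;
	}
	return false;
}

/* The dependency fails if either the fresh check result or the
 * previous hard state of host A matches the criteria. */
static bool dependency_fails(enum host_state fresh_result,
                             enum host_state last_hard_state,
                             const struct dep_criteria *c)
{
	return state_fails(fresh_result, c) || state_fails(last_hard_state, c);
}
```

With fail_on_down set, dependency_fails(HOST_UP, HOST_DOWN, &c) is true,
which is the "A just recovered" case described above.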

Holger

-- 
PGP fingerprint:  F1F0 9071 8084 A426 DD59  9839 59D3 F3A1 B8B5 D3DE
-------------- next part --------------
Index: checks.c
===================================================================
RCS file: /cvsroot/nagios/nagios/base/checks.c,v
retrieving revision 1.84
diff -u -r1.84 checks.c
--- checks.c	23 Feb 2006 20:28:52 -0000	1.84
+++ checks.c	24 Feb 2006 18:00:33 -0000
@@ -1527,6 +1527,7 @@
 int check_host_dependencies(host *hst,int dependency_type){
 	hostdependency *temp_dependency;
 	host *temp_host;
+	int route_result;
 
 #ifdef DEBUG0
 	printf("check_host_dependencies() start\n");
@@ -1544,14 +1545,17 @@
 		if(temp_host==NULL)
 			continue;
 
+		/* check the host we depend on */
+		route_result=verify_route_to_host(temp_host,CHECK_OPTION_FORCE_EXECUTION);
+
 		/* is the host we depend on in state that fails the dependency tests? */
-		if(temp_host->current_state==HOST_UP && temp_dependency->fail_on_up==TRUE)
+		if((route_result==HOST_UP || temp_host->last_hard_state==HOST_UP) && temp_dependency->fail_on_up==TRUE)
 			return DEPENDENCIES_FAILED;
-		if(temp_host->current_state==HOST_DOWN && temp_dependency->fail_on_down==TRUE)
+		if((route_result==HOST_DOWN || temp_host->last_hard_state==HOST_DOWN) && temp_dependency->fail_on_down==TRUE)
 			return DEPENDENCIES_FAILED;
-		if(temp_host->current_state==HOST_UNREACHABLE && temp_dependency->fail_on_unreachable==TRUE)
+		if((route_result==HOST_UNREACHABLE || temp_host->last_hard_state==HOST_UNREACHABLE) && temp_dependency->fail_on_unreachable==TRUE)
 			return DEPENDENCIES_FAILED;
-		if((temp_host->current_state==HOST_UP && temp_host->has_been_checked==FALSE) && temp_dependency->fail_on_pending==TRUE)
+		if((route_result==HOST_UP && temp_host->has_been_checked==FALSE) && temp_dependency->fail_on_pending==TRUE)
 			return DEPENDENCIES_FAILED;
 
 		/* immediate dependencies ok at this point - check parent dependencies if necessary */

