Fix for host dependency checks
Holger Weiss
holger at CIS.FU-Berlin.DE
Fri Feb 24 19:04:32 CET 2006
* Holger Weiss <holger at CIS.FU-Berlin.DE> [2006-01-30 16:54]:
> There is a timing problem in the host[*] dependency check logic: If host
> B is configured to be dependent on host A being up and host A goes down,
> the dependency will only fail if host A "incidentally" was checked
> _prior_ to host B after going down. Hence, the host dependency logic
> will sometimes work and sometimes not. I'd therefore suggest
> explicitly (re-)checking host A during the dependency checking for host B,
> as the attached patch does.
Okay, this introduces a new problem: If host B is checked immediately
before a recovery of both hosts and host A is checked (during the
dependency check) after it, the dependency won't fail. Hence,
notifications for host B won't be suppressed (been there, got the
t-shirt).
Next try: The attached patch lets the dependency fail if either the
current or the previous (hard) state of A matches the failure criteria.
AFAICS, this should reliably suppress notifications for host B if the
dependency fails.
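
For illustration, the rule can be sketched in isolation as below. This is
a standalone sketch, not the Nagios code: the enum, the struct and both
function names are made up for the example and only mirror the fields the
patch touches. It shows that the dependency fails when either the freshly
checked state or the previous hard state of A matches the criteria.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Nagios host states (not the real macros). */
enum host_state { STATE_UP, STATE_DOWN, STATE_UNREACHABLE };

/* Hypothetical subset of a host dependency's failure criteria. */
struct dep_criteria {
    bool fail_on_up;
    bool fail_on_down;
    bool fail_on_unreachable;
};

/* Does a single state match the failure criteria? */
bool state_matches(enum host_state s, const struct dep_criteria *c)
{
    switch (s) {
    case STATE_UP:          return c->fail_on_up;
    case STATE_DOWN:        return c->fail_on_down;
    case STATE_UNREACHABLE: return c->fail_on_unreachable;
    }
    return false;
}

/* The dependency fails if the current state OR the last hard state matches. */
bool dependency_fails(enum host_state current, enum host_state last_hard,
                      const struct dep_criteria *c)
{
    assert(c != NULL);
    return state_matches(current, c) || state_matches(last_hard, c);
}
```

With only fail_on_down set, a host A that has just recovered (current
state up, last hard state down) still fails the dependency, which is the
case the first patch missed.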
Holger
--
PGP fingerprint: F1F0 9071 8084 A426 DD59 9839 59D3 F3A1 B8B5 D3DE
-------------- next part --------------
Index: checks.c
===================================================================
RCS file: /cvsroot/nagios/nagios/base/checks.c,v
retrieving revision 1.84
diff -u -r1.84 checks.c
--- checks.c 23 Feb 2006 20:28:52 -0000 1.84
+++ checks.c 24 Feb 2006 18:00:33 -0000
@@ -1527,6 +1527,7 @@
int check_host_dependencies(host *hst,int dependency_type){
hostdependency *temp_dependency;
host *temp_host;
+ int route_result;
#ifdef DEBUG0
printf("check_host_dependencies() start\n");
@@ -1544,14 +1545,17 @@
if(temp_host==NULL)
continue;
+ /* check the host we depend on */
+ route_result=verify_route_to_host(temp_host,CHECK_OPTION_FORCE_EXECUTION);
+
/* is the host we depend on in state that fails the dependency tests? */
- if(temp_host->current_state==HOST_UP && temp_dependency->fail_on_up==TRUE)
+ if((route_result==HOST_UP || temp_host->last_hard_state==HOST_UP) && temp_dependency->fail_on_up==TRUE)
return DEPENDENCIES_FAILED;
- if(temp_host->current_state==HOST_DOWN && temp_dependency->fail_on_down==TRUE)
+ if((route_result==HOST_DOWN || temp_host->last_hard_state==HOST_DOWN) && temp_dependency->fail_on_down==TRUE)
return DEPENDENCIES_FAILED;
- if(temp_host->current_state==HOST_UNREACHABLE && temp_dependency->fail_on_unreachable==TRUE)
+ if((route_result==HOST_UNREACHABLE || temp_host->last_hard_state==HOST_UNREACHABLE) && temp_dependency->fail_on_unreachable==TRUE)
return DEPENDENCIES_FAILED;
- if((temp_host->current_state==HOST_UP && temp_host->has_been_checked==FALSE) && temp_dependency->fail_on_pending==TRUE)
+ if((route_result==HOST_UP && temp_host->has_been_checked==FALSE) && temp_dependency->fail_on_pending==TRUE)
return DEPENDENCIES_FAILED;
/* immediate dependencies ok at this point - check parent dependencies if necessary */