Service Dependency Not Working

Marty nagios-users at martycombs.com
Tue Jan 29 08:26:30 CET 2008


I have done further testing of the issue I am experiencing with service 
dependencies.  I originally thought that nagios had a problem with the 
number of dependencies we have (942).

I began re-testing last Friday by adding the service dependencies on a few 
hosts at a time and testing by shutting down NRPE on a specific target 
host and scheduling a check of all services on that host.  If I 
successfully received a single notification (i.e. NRPE Daemon CRITICAL), I 
would add a few more hosts, restart Nagios and repeat the test.

I was able to bring the number of service dependencies back up to 942 and 
receive a single notification (i.e. NRPE Daemon CRITICAL) as expected when 
NRPE was shut down.

Today (Monday), I decided to repeat the tests.  I shut down NRPE on the 
test host and was sent 3 notifications before Nagios reached the "NRPE 
Daemon" check.  Once Nagios determined that NRPE was down, it stopped 
further checks of any remaining NRPE-dependent services.

My updated service dependency configuration is:

define servicedependency{
         dependent_host_name             test2
         dependent_service_description   __*
         host_name                       test2
         service_description             NRPE Daemon
         execution_failure_criteria      w,c,u,p
         notification_failure_criteria   w,c,u,p
}


I have configured Nagios not to check dependent services if the master 
service (NRPE Daemon) is in anything other than an OK state.  However, if a 
dependent service changes state, Nagios does not re-check the master 
service when the master has already been checked during the current check 
interval.  Nagios checks, rechecks and sends notifications about each 
dependent service as it reaches them in its queue.  Further checks and 
notifications for dependent services stop only once the master check comes 
due in the scheduling queue.
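
The sequence described above can be modeled with a small Python sketch.  It 
is not Nagios code and makes simplifying assumptions: one linear check 
queue, one notification per failed check, and a master state that is only 
refreshed when the master's own check comes due.  The service names other 
than "NRPE Daemon" and "__Total Processes" are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    is_master: bool = False


def run_queue(queue, nrpe_up):
    """Walk the check queue in order and return the notifications sent."""
    # Last *known* master state. It stays stale until the master is checked.
    master_known_ok = True
    notifications = []
    for svc in queue:
        if svc.is_master:
            # The master's own check refreshes the known state.
            master_known_ok = nrpe_up
            if not master_known_ok:
                notifications.append(svc.name)
        elif master_known_ok:
            # Failure criteria are evaluated against the stale master state,
            # so the dependent check runs and fails because NRPE is down.
            if not nrpe_up:
                notifications.append(svc.name)
        # Otherwise the dependency suppresses the check and its notification.
    return notifications


# Hypothetical queue order: three dependents happen to precede the master.
queue = [
    Service("__Load"),
    Service("__Users"),
    Service("__Total Processes"),
    Service("NRPE Daemon", is_master=True),
    Service("__Disk"),
    Service("__Swap"),
]

print(run_queue(queue, nrpe_up=False))
# prints ['__Load', '__Users', '__Total Processes', 'NRPE Daemon']
```

In this model the number of unwanted notifications depends only on how many 
dependent services sit ahead of the master in the queue, which would explain 
why the same configuration produced one notification on Friday and several 
on Monday.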

I haven't looked at C code in a long while, but looking through 
base/checks.c within the Nagios source, it appears execution failure 
criteria are only examined before the service check is run.

When a service changes state, Nagios immediately verifies host up/down 
status with a host check.  After verifying host state, should Nagios not 
also immediately re-check the state of master services?
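
A minimal Python sketch of the behaviour being asked for is shown here.  It 
is a hypothetical model, not a patch against Nagios: the function name, the 
`actual_ok` mapping and the queue order are all assumptions made for 
illustration.  When a dependent check fails, the master is re-checked on 
demand before any notification for the dependent is sent.

```python
def run_queue_with_recheck(queue, master, actual_ok):
    """Return notifications when a failed dependent triggers a master re-check.

    queue     -- service names in the order their checks come due
    master    -- name of the master service
    actual_ok -- maps each service name to its real state (True means OK)
    """
    known_master_ok = True  # last known master state
    notifications = []

    def check_master():
        nonlocal known_master_ok
        known_master_ok = actual_ok[master]
        if not known_master_ok and master not in notifications:
            notifications.append(master)

    for name in queue:
        if name == master:
            check_master()
        elif known_master_ok:
            if not actual_ok[name]:
                # Proposed step: re-check the master before notifying.
                check_master()
                if known_master_ok:
                    # Master is fine, so this is a genuine failure.
                    notifications.append(name)
        # Otherwise the dependency suppresses the check.
    return notifications


queue = ["__Load", "__Users", "__Total Processes", "NRPE Daemon", "__Disk"]

# NRPE is down, so every NRPE-based check fails.
nrpe_down = dict.fromkeys(queue, False)
print(run_queue_with_recheck(queue, "NRPE Daemon", nrpe_down))
# prints ['NRPE Daemon']

# NRPE is up and a single dependent service has a real problem.
one_failure = dict.fromkeys(queue, True)
one_failure["__Users"] = False
print(run_queue_with_recheck(queue, "NRPE Daemon", one_failure))
# prints ['__Users']
```

Under these assumptions a single "NRPE Daemon" notification is produced no 
matter where the master sits in the queue, while a genuine failure of a 
dependent service is still reported.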

Thanks again for your help.


Regards,
Marty Combs


On Fri, 25 Jan 2008, Marty wrote:

> I have been using nagios for several years with wonderful success.  (Thank you 
> to all who have contributed.)
>
> I have a problem with the service dependency not working as I expect.  We are 
> using Nagios v2.9 running on linux (CentOS).
>
> I am using NRPE to check several services such as load, users, procs, etc.
> I plan to make all NRPE dependent services begin with a double underscore
> (__) and use a regular expression in the service dependency to have
> the functioning of the NRPE daemon be the master service upon which
> all other NRPE services depend.  (Such as below.)
>
> ----------------------------------------------------------------------
> define servicedependency{
>        dependent_host_name             HOSTNAME
>        dependent_service_description   __*
>        host_name                       HOSTNAME
>        service_description             NRPE Daemon
>        execution_failure_criteria      n
>        notification_failure_criteria   w,c,u,p
> }
> ----------------------------------------------------------------------
>
> To confirm running of NRPE, I am using check_dummy
>
>  command[check_dummy]=/usr/local/nagios/libexec/check_dummy 0
>
>
> I have been running tests with no success.  I received notifications of the 
> dependent services even if I shut down NRPE on the target host.
>
> I have tried multiple iterations of the configuration.  I even simplified the 
> test by reconfiguring the service dependency stanza to be:
>
> ----------------------------------------------------------------------
> define servicedependency{
>        dependent_host_name             test2
>        dependent_service_description   __Total Processes
>        host_name                       test2
>        service_description             NRPE Daemon
>        execution_failure_criteria      n
>        notification_failure_criteria   w,c,u,p
> }
> ----------------------------------------------------------------------
>
> I run the simplified test by setting the critical value for the number of 
> processes very low and configuring check_dummy to return CRITICAL.  Nagios sends 
> out two notifications when I schedule a check of services on host "test2".
>
> I have used service dependencies in the past on older versions of Nagios 
> (v1.x).  Is there something I am overlooking?
>
> Many thanks for any help I can get.
>
>
> marty :-)
>
> ----------
> To announce that there must be no criticism of the President, or that
> we are to stand by the President, right or wrong, is not only
> unpatriotic and servile, but is morally treasonable to the American
> public.
>
>  -Theodore Roosevelt, 26th US President (1858-1919)
>
>
