Dependencies in redundant networks and services: an idea for Nagios 4

Andreas Ericsson ae at op5.se
Mon May 23 09:34:54 CEST 2011


On 05/22/2011 10:12 PM, Matthew Pounsett wrote:
> 
> Searching back through the archives it seems that the issue of
> handling service and host dependencies on redundant services or hosts
> comes up from time to time (actually, far less often than I would
> have expected) and nobody seems to have a really good solution to the
> problem.
> 
> Imagine a web service (call it W) which depends on two separate
> databases (call them database A and database B), where both databases
> have redundant backups and the web service can contact either the
> primary or backup for each database and still do its job (A1, A2, B1,
> B2).  Doing this without a more flexible dependency system requires
> either some very complicated combinatorial setup where we have W1
> dependent on A1,B1, W2 dependent on A1,B2, W3 on A2,B1, etc.  or one
> very complicated custom check script which implements the
> dependencies itself.
> 
> I've been thinking about this a fair bit over the last couple of
> weeks since I manage a network and suite of services where nearly
> everything is redundant, and almost no single outage of any component
> results in an 'unreachable' state for any other component.  I'd very
> much like to avoid having to run all kinds of duplicate checks and
> train the rest of my staff to ignore alerts unless they arrive in
> pairs.
> 
> I think I've hit upon an idea, but it's a fairly significant change
> to the way service and host dependencies work today, and so I don't
> think it's reasonable to pursue it any earlier than Nagios 4.0, but
> I'd like to get some feedback to see if others think this might be
> the right way to go (and I'm hoping I don't get too many TL;DRs).
> 
> In a nutshell, my idea is to separate the definition of the master
> service/host from the association to it by the dependent
> service/host, and make the association by reference from the
> dependent service or host definition... much the same way as a
> service is associated to a host by reference.
> 
> There are two big wins from doing this: 1) If the dependency is
> created by reference from the service or host definition, that opens
> the door to using a boolean syntax in that reference, allowing both
> simple *and* complex dependencies. 2) Moving the dependency
> association into the service or host definition also allows the
> association to be applied to services or hosts by
> servicegroup/hostgroup which simplifies configuration file
> authoring.
> 
> Here's one example where using a hostgroup for the master service (or
> a list of hosts) contains the implicit assumption that all of the
> services referenced in a single servicedependency definition are
> redundancies of each other.  I don't like doing anything by
> implication, but this provides a match to the current implication
> that all master services referenced by a dependent are not
> redundancies of each other, and keeps the configuration very simple.
> 
> 
> define service {
>     host_name            web-host
>     service_description  Web Service W
>     dependencies         db-a-dependency,db-b-dependency
> }
>
> define hostgroup {
>     hostgroup_name       database-hosts
>     members              db-host-1,db-host-2
> }
>
> define service {
>     hostgroup_name       database-hosts
>     service_description  Database A
> }
>
> define service {
>     hostgroup_name       database-hosts
>     service_description  Database B
> }
>
> define servicedependency {
>     servicedependency_name         db-a-dependency
>     hostgroup_name                 database-hosts
>     service_description            Database A
>     notification_failure_criteria  w,u,c,p
>     dependency_period              24x7
> }
>
> define servicedependency {
>     servicedependency_name         db-b-dependency
>     hostgroup_name                 database-hosts
>     service_description            Database B
>     notification_failure_criteria  w,u,c,p
>     dependency_period              24x7
> }
> 
> Since the implication by using a hostgroup_name or a list of hosts in
> the servicedependency definition is that the referenced services are
> redundant, the servicedependency doesn't 'fail' until all of the
> referenced services meet *any* of the notification_failure_criteria
> (e.g. one being w, and another being u means the servicedependency
> fails).  Matched with the implication in the 'dependencies' directive
> in W's service definition that those listed dependencies are not
> redundancies of each other, and you have the following boolean
> statement about database failures that determines whether W gets
> notifications:
> 
> (db-host-1:Database A && db-host-2:Database A) ||
> (db-host-1:Database B && db-host-2:Database B)
> 
> But as I said I don't like the idea of doing anything by
> implication... I'd like the relationships to be explicit, and so I'm
> working on a way that the boolean statement about dependencies could
> be written out in the dependencies directive in any host or service
> definition.  I have a few ideas, but none are quite as clean as the
> above example so I'll exclude them from this email for now (it's
> already too long).  But if people are supportive of the general
> concept I can keep working on it until I come up with a syntax that
> is both flexible *and* manageable.
> 
> Does this seem like a direction people would like to pursue?
> 

Well... no actually. Changing how servicedependencies work is not a
good idea. It would be far better (for Nagios 4) to implement a
cluster-object and be able to set cluster-objects as parents for
services (and hosts). That way we get something similar to how the
various business process addons work today, but implemented in-core
and without breaking servicedependencies for everyone.
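
To make the cluster-object idea a bit more concrete, here is a rough
sketch in Python. Nothing like this exists in Nagios today: the Cluster
class, the min_ok threshold and should_notify() are all hypothetical
names, and the assumption that a service stays quiet whenever any of its
cluster parents is down is only one possible way to define the
semantics.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical cluster object: a named group of redundant members."""
    name: str
    members: dict      # member name -> True if that member is currently OK
    min_ok: int = 1    # assumed threshold: members needed for the cluster to be up

    def is_up(self):
        # The cluster is up while at least min_ok members are OK.
        return sum(1 for ok in self.members.values() if ok) >= self.min_ok


def should_notify(parents):
    """Assumed rule: notify for the child only if every cluster parent is up."""
    return all(cluster.is_up() for cluster in parents)


db_a = Cluster("db-a", {"db-host-1": False, "db-host-2": True})
db_b = Cluster("db-b", {"db-host-1": False, "db-host-2": False})

print(should_notify([db_a]))        # db-a still has one working member
print(should_notify([db_a, db_b]))  # db-b has lost both members
```

With the web service W as the child, losing a single database host
leaves both clusters up, so problems on W would still be reported; only
the loss of a whole cluster would mark W's problem as explained by its
parent.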

I agree that dependencies should have been specified somewhat like
you describe from the start, but right now it's too late to change
how they work and what they do, since people already make good use
of them the way they are.
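
For anyone following along, the failure rule in the quoted proposal can
be modelled in a few lines of Python. This is only a sketch of the
semantics as described in the quoted mail, not anything Nagios
implements: dependency_failed() and suppress_notifications() are
made-up names, and the state letters simply follow the
notification_failure_criteria from the example (with "o" standing for
OK).

```python
# Failure states taken from notification_failure_criteria in the quoted example.
FAILURE_CRITERIA = {"w", "u", "c", "p"}


def dependency_failed(states, criteria=FAILURE_CRITERIA):
    """A redundant dependency fails only when every referenced service
    is in one of the failure states (hypothetical helper)."""
    return bool(states) and all(state in criteria for state in states)


def suppress_notifications(dependencies):
    """The dependent service's notifications are suppressed as soon as
    any one of its listed dependencies has failed (hypothetical helper)."""
    return any(dependency_failed(states) for states in dependencies.values())


# Current state of "Database A" and "Database B" on db-host-1 and db-host-2.
dependencies = {
    "db-a-dependency": ["w", "o"],  # one copy degraded, one still OK
    "db-b-dependency": ["c", "u"],  # both copies in a failure state
}

print(suppress_notifications(dependencies))
```

The inner all() is the implied redundancy between hosts in one
servicedependency, and the outer any() is the implied non-redundancy
between the entries in the 'dependencies' directive.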

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
