Nagios as a Service Resiliency Manager

Christopher McAtackney cristoir at gmail.com
Thu Dec 10 18:08:46 CET 2009


Hi all,

I have a need to control an Active / Passive pair of components and
was wondering if anyone had tackled this problem with Nagios?

The scenario is as follows:

Host A has SERVICE_1 installed and running. Host B has SERVICE_2
installed, but not running.

The desired functionality is to detect when SERVICE_1 is not running
(or that Host A is down / unreachable), and then to start SERVICE_2 on
Host B.

I believe I can do this with Nagios by defining an event handler on
SERVICE_1 which will make the appropriate call to start SERVICE_2 on
Host B.
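
A minimal sketch of what such an event handler could look like, written
in Python. It assumes the handler is given the $SERVICESTATE$ and
$SERVICESTATETYPE$ macros for SERVICE_1, that Host B is reachable over
SSH from the Nagios server, and that SERVICE_2 can be started with a
"service <name> start" style command. The function names and the remote
command are illustrative assumptions, not part of Nagios itself.

```python
import subprocess


def should_start_passive(state, state_type):
    """Act only on a confirmed (HARD) CRITICAL state of SERVICE_1.

    SOFT states are still being retried by Nagios, so reacting to them
    could trigger a failover on a transient glitch.
    """
    return state == "CRITICAL" and state_type == "HARD"


def build_start_command(passive_host, passive_service):
    # Assumption: passwordless SSH to Host B and an init-style service
    # command; replace with whatever actually starts SERVICE_2.
    return ["ssh", passive_host, "service", passive_service, "start"]


def handle_event(state, state_type, passive_host, passive_service,
                 run=subprocess.run):
    """Start the passive partner if the active service is confirmed down.

    Returns True if a start command was issued, False otherwise.
    The `run` parameter exists only so the logic can be exercised
    without touching a real host.
    """
    if not should_start_passive(state, state_type):
        return False
    run(build_start_command(passive_host, passive_service), check=True)
    return True
```

The same decision could be attached to a host event handler (checking
$HOSTSTATE$ for DOWN) to cover the case where Host A itself is down or
unreachable.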

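Would it make sense to store the relationship between SERVICE_1 and
Host B / SERVICE_2 as custom service variables, e.g. _PASSIVE_HOSTNAME
and _PASSIVE_SERVICENAME, which would then be available as the macros
$_SERVICEPASSIVE_HOSTNAME$ and $_SERVICEPASSIVE_SERVICENAME$?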
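
One way this could be wired up, as a sketch: Nagios exposes a custom
service variable named _PASSIVE_HOSTNAME as the macro
$_SERVICEPASSIVE_HOSTNAME$, so the pairing can be passed to the event
handler on its command line. The object definitions in the comment, the
host names, the script path, and the helper name parse_handler_args are
all hypothetical.

```python
from typing import NamedTuple

# Hypothetical Nagios object definitions this sketch assumes
# (other required directives omitted):
#
#   define service {
#       host_name             hostA
#       service_description   SERVICE_1
#       event_handler         start-passive-partner
#       _PASSIVE_HOSTNAME     hostB
#       _PASSIVE_SERVICENAME  SERVICE_2
#   }
#
#   define command {
#       command_name  start-passive-partner
#       command_line  /path/to/start_passive.py "$SERVICESTATE$" "$SERVICESTATETYPE$" "$_SERVICEPASSIVE_HOSTNAME$" "$_SERVICEPASSIVE_SERVICENAME$"
#   }


class HandlerArgs(NamedTuple):
    state: str
    state_type: str
    passive_host: str
    passive_service: str


def parse_handler_args(argv):
    """Turn the handler's argument list into named fields.

    argv is expected in sys.argv form: script name followed by the four
    macro values from the command_line above.
    """
    if len(argv) != 5:
        raise ValueError("expected: STATE STATETYPE PASSIVE_HOST PASSIVE_SERVICE")
    args = HandlerArgs(*argv[1:5])
    # Defensive check (an assumption about failure modes, not documented
    # Nagios behaviour): treat an empty value or a leftover "$...$" token
    # as a missing custom variable rather than as a real host or service.
    for value in (args.passive_host, args.passive_service):
        if not value or value.startswith("$"):
            raise ValueError("passive partner is not configured")
    return args
```

Keeping the pairing in the service definition means one generic handler
script can serve every Active / Passive pair.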

I believe there are too many scenarios in which SERVICE_1 might come
back up to try to automate switching SERVICE_2 off again. For example,
someone accidentally pulls the network cable on Host A and plugs it
back in 15 minutes later. In the meantime Nagios detects that Host A is
down and starts SERVICE_2. Once the cable is plugged back in we have
two Active instances running, which is exactly what we wanted to avoid.
Even if Nagios then detects that the primary component is up, it is
already too late, because any Active / Active overlap will cause
problems for this particular application.
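
Given that failing back cannot safely be automated, one option is to
make the failover strictly one-shot: the handler records that it has
acted and refuses to act again until an operator clears the record by
hand. The sketch below uses a latch file for this; the function name
and the idea of a latch file are illustrative assumptions. Note that
this does not prevent the overlap described above (SERVICE_1 may still
be running on Host A when the cable is reconnected); it only ensures
the handler never flips things back and forth on its own.

```python
import os


def claim_failover(latch_path):
    """Atomically create the latch file.

    Returns True for the first caller only. Any later call returns
    False until someone removes the file manually, after verifying the
    state of Host A and deciding which instance should stay Active.
    """
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so two
        # handler runs cannot both believe they performed the failover.
        fd = os.open(latch_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as latch:
        latch.write("failover performed; clear manually\n")
    return True
```

An event handler would call claim_failover() before starting SERVICE_2
and do nothing if it returns False.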

I can't think of any way to automate that side of things - but does
the general concept of having Nagios start up a Passive partner make
sense?

Thanks for any insight you have,

Chris

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users