Dependencies question

Patrick M. Hausen hausen at punkt.de
Mon Jul 9 21:09:08 CEST 2007


Hi all!

First of all: I've searched the online manual and the FAQ.
I have not searched the mailing list archives for an
answer to my question. I apologise in advance, but I
found it rather difficult to come up with keywords to
search for.

OK, here's the challenge:

Imagine a couple of services or a couple of hosts that
are, well, coupled - by some device or dependency that itself
cannot be monitored by Nagios.

E.g. I want to monitor a couple of hosts connected to the
same switch, but the switch is not even accessible on Layer 3.
Or they are in some remote computing centre, and I cannot
point at a single router on which they all depend. Yet they
all depend on the hosting centre's "Internet connection".

Or imagine virtual web servers. Each of them may fail
individually, perhaps just because someone messes up an update
of the content. OTOH all of them may fail at once, because the
database server fails.
OK, I can monitor the database server. Now they fail because the
web server itself is overloaded.
Monitor the web server's CPU. Possibly.
Now they fail because ... I simply cannot think of everything in
advance.

Now, what's the problem? Our alarm system sends messages to the
pagers of the staff on duty. Which costs money. Not much, but
tiny amounts add up.

So, what I'd like to implement is this policy:

"Here's a group of <somethings>, i.e. hosts or services.
 If any number of them fails, just send a notification for
 the first one. I will look at the Nagios status page anyway and
 probably something else connecting them in some way failed."
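
To make the policy concrete, here is a rough sketch in Python. It is
not Nagios code and uses no Nagios API; the class name, the group and
host names and the send callable are all made up for illustration.
It only shows the behaviour I am after: the first failure in a group
pages someone, further failures in the same group stay silent until
the whole group has recovered.

```python
from collections import defaultdict


class GroupNotifier:
    """Send one notification per group outage, not one per failing member."""

    def __init__(self, send):
        # 'send' is a hypothetical callable that would page the staff.
        self.send = send
        # Group name -> set of members that are currently failing.
        self.failing = defaultdict(set)

    def report(self, group, member, ok):
        """Record a check result; return True if a notification was sent."""
        down = self.failing[group]
        if ok:
            down.discard(member)
            return False
        # Notify only when the group goes from "all fine" to "something failed".
        first_failure = not down
        down.add(member)
        if first_failure:
            self.send(f"{group}: {member} failed; check the status page")
        return first_failure


sent = []
notifier = GroupNotifier(sent.append)
notifier.report("webfarm", "www1", ok=False)  # notifies
notifier.report("webfarm", "www2", ok=False)  # suppressed
notifier.report("webfarm", "www1", ok=True)
notifier.report("webfarm", "www2", ok=True)   # group is healthy again
notifier.report("webfarm", "www3", ok=False)  # notifies again
```

In this run only two messages end up in 'sent', one per outage of the
group, although three members failed.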

Is this possible? From reading the docs it seems to me that dependencies
form a directed graph and cycles are probably a bad thing. And they
do not solve the problem: you would need to have every service in
the group depend on every other one.
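
A quick back-of-the-envelope sketch in Python of what "every service
depends on every other one" would mean. The service names are
hypothetical and this says nothing about how Nagios itself stores or
evaluates dependencies; it just counts the directed pairs.

```python
from itertools import permutations


def full_mesh(services):
    """Return every (dependent, master) pair for a fully meshed group."""
    return list(permutations(services, 2))


pairs = full_mesh(["www1", "www2", "www3", "www4"])
print(len(pairs))  # 12, i.e. n * (n - 1) for n = 4
```

So the number of definitions grows quadratically with the group size,
and every pair shows up in both directions, which is exactly the kind
of loop I suspect is a bad thing.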

I have not used event handlers or other advanced mechanisms yet.

So: I think that someone else must have faced the same problem
before - is there any "best practice" document, if the above
policy can be implemented at all?

Thanks a lot,
Patrick
-- 
punkt.de GmbH * Vorholzstr. 25 * 76137 Karlsruhe
Tel. 0721 9109 0 * Fax 0721 9109 100
info at punkt.de       http://www.punkt.de
Gf: Jürgen Egeling      AG Mannheim 108285
