Creative use of check_cluster

Rusch, Daniel Daniel.Rusch at GlobalCrossing.com
Tue Sep 24 17:28:30 CEST 2002


All,

Like many of you, we need to monitor Systems that are composed of many
servers, some in clusters and some providing standalone mission critical
services, so I thought I would share what we did with the check_cluster
plugin:

We found that if you have 500+ services defined it can be hard to tell if a
"system" is up when a few services are critical.  In other words, suppose
that Host/Service A is in a cluster (the cluster is mission critical, not the
individual members of the cluster) and Host/Service Z is mission critical.
If Host/Service A goes critical, a central monitoring point, without
specific knowledge of the system, won't know that it's not that important.
Whereas if Host/Service Z is down, the world just came to an end.

To accomplish this we created a service that uses the check_cluster plugin
to check the status of a cluster of services. We'll call this service
cluster1.  Then (here is the creative part) we added another service that
checked the status of the cluster1 service plus the status of several
mission critical services using check_cluster.

In other words:

1. A service (cluster1) that checks a cluster of services using
check_cluster with normal warning and critical values (e.g. 4 and 7
respectively).
2. A service (master cluster) that checks cluster1 and several mission
critical services using check_cluster. Here the warning and critical values
are both set to 1. The idea is that if any of the services in this
"cluster" isn't running, we have a mission critical failure.  This includes
cluster1's status: if cluster1 is critical then the System is critical.
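
The two tiers above can be sketched as a small simulation. This is only an
illustration of the threshold logic, not the check_cluster plugin itself: the
function names, the state strings, and the rule that a threshold trips when
the number of CRITICAL members reaches it are all assumptions made for the
example, and the real plugin may count states differently.

```python
from typing import Iterable, List

OK, WARNING, CRITICAL = "OK", "WARNING", "CRITICAL"


def cluster_state(member_states: Iterable[str], warn: int, crit: int) -> str:
    """Collapse the states of several services into one aggregate state.

    Assumption (not taken from the plugin source): only CRITICAL members
    count as failures, and a threshold trips when the count reaches it.
    """
    failures = sum(1 for state in member_states if state == CRITICAL)
    if failures >= crit:
        return CRITICAL
    if failures >= warn:
        return WARNING
    return OK


def system_state(cluster_members: List[str], standalone: List[str]) -> str:
    """Hypothetical two-tier check mirroring items 1 and 2 above."""
    # Tier 1: cluster1 tolerates a few failed members (example values 4 and 7).
    cluster1 = cluster_state(cluster_members, warn=4, crit=7)
    # Tier 2: the master cluster treats cluster1 as just another member,
    # next to the standalone mission critical services; both thresholds are 1.
    return cluster_state([cluster1, *standalone], warn=1, crit=1)
```

Under these assumptions, one critical member out of ten leaves both cluster1
and the System OK, while a single critical standalone service (Host/Service Z
in the example above) makes the System critical on its own.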

I'd like to take credit for the idea, but it was my boss's. I was going to
write a wrapper plugin based on check_cluster to do the same thing.
 

I created a "fictitious" Host and Hostgroup (with the name of the System) to
run the services under.

Sincerely,

Daniel G. Rusch


