Best way to monitor application clusters

Sébastien Barbereau barbereau at gmail.com
Mon Sep 24 18:38:28 CEST 2007


In similar cases I've always queried the NDO database to get the status of a
group of things.

There are also other options, check_multi for example
(http://my-plugin.de/wiki/doku.php/check_multi).

Of course, the Nagios documentation may also help you with this:
http://nagios.sourceforge.net/docs/2_0/clusters.html
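
To make the percentage idea from the question concrete, here is a rough
Python sketch of the threshold logic a cluster-style check could apply. It
is not the code of any existing plugin: the function name cluster_state and
its warn_pct/crit_pct parameters are made up for illustration, and treating
an empty group as critical is my own assumption.

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin return codes


def cluster_state(member_states, warn_pct, crit_pct):
    """Map the share of non-OK members to a Nagios state.

    member_states: iterable of Nagios state codes (0 = OK).
    warn_pct / crit_pct: percentage of failed members at which the
    cluster becomes WARNING / CRITICAL (hypothetical parameters).
    """
    states = list(member_states)
    if not states:
        # Assumption: a group with no members is treated as a failure.
        return CRITICAL
    failed = sum(1 for state in states if state != OK)
    failed_pct = 100.0 * failed / len(states)
    if failed_pct >= crit_pct:
        return CRITICAL
    if failed_pct >= warn_pct:
        return WARNING
    return OK
```

With the four DNS servers from the question and thresholds of 25% and 75%,
one server down gives a warning and three down a critical. A real plugin
would hand the result to sys.exit() so that Nagios sees it as the check
status.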

Concerning the "simple login web page", I would again suggest having a
look at NDO. Creating a simple PHP/Python/Perl/ASP page that queries the
database is not complicated and would allow you to create custom displays
for specific uses.
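
As an illustration of how little code such a page needs, the sketch below
counts, per service description, how many hosts report OK and formats the
result as plain text. It assumes a Python DB-API cursor connected to the NDO
database, and it uses the table and column names of the default NDOUtils
schema as I remember them (nagios_objects, nagios_servicestatus,
objecttype_id 2 for services); please verify these against your own
installation. The helper names summarize_services and render are
hypothetical.

```python
from collections import defaultdict

# Assumed NDOUtils schema with the default "nagios_" table prefix.
QUERY = """
SELECT o.name2 AS service, o.name1 AS host, s.current_state
FROM nagios_servicestatus s
JOIN nagios_objects o ON o.object_id = s.service_object_id
WHERE o.objecttype_id = 2
ORDER BY o.name2, o.name1
"""


def summarize_services(cursor):
    """Return {service description: (hosts_ok, hosts_total)}."""
    counts = defaultdict(lambda: [0, 0])
    cursor.execute(QUERY)
    for service, _host, state in cursor.fetchall():
        counts[service][1] += 1
        if state == 0:  # 0 means OK in Nagios
            counts[service][0] += 1
    return {name: tuple(pair) for name, pair in counts.items()}


def render(summary):
    """Format the summary as one plain-text line per service."""
    lines = []
    for name in sorted(summary):
        ok, total = summary[name]
        lines.append("%-20s %d/%d OK" % (name, ok, total))
    return "\n".join(lines)
```

The output of render() could be printed from a CGI script or embedded in a
page template; a per-cluster view would simply restrict the query to the
services of interest.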


On 9/24/07, Paul Weaver <paul.weaver at bbc.co.uk> wrote:
>
> I've recently started using Nagios in our development environment, and
> have knocked up a few plugins for some of our programs (e.g. monitoring a
> log on a remote server to make sure it's growing, but not too fast or too
> slow, or jumbo pings between two remote machines). I find it very impressive.
>
> One thing I would like to monitor is a group of hosts/services, flagging a
> warning if x% are not available and a critical if y% are offline. A common
> example would be checking DNS services. If you have 4 DNS servers, you don't
> want to be woken up at 3AM if one falls offline, but if 3 are offline you
> would, and if 4 are offline you want an APB. You still want to see on a
> web page that the servers are offline, though, and possibly get a
> notification during work hours.
>
> I'm aware of host/service groups as one way of doing it; however, I'm
> unsure whether notifications can be set based on the % of hosts/services
> available in a group.
>
> Another way would be a "virtual host", with a custom "check_host_alive"
> which checks all hosts in a collection, and returns an OK/critical/warning
> based on the number of failures, and likewise with "virtual services". The
> original hosts could then be monitored separately, or even not at all.
>
> For example, a service I would like to check is whether 3 mysql databases
> are in sync with each other. I currently have a web page that compares the
> log positions. It seems to me that logically the service should run on the
> mysql boxes, however I only want it running on
>
> Another example would be I have a piece of java software (call it "A")
> that must run on at least one of 4 machines, and preferably on 2 of them. I
> don't care which machine it's on, but if it's not running I want to be
> notified in red lights.
>
> I could have a simple "virtual service A", which would go critical on 0,
> warn on 1 and be OK on 2 or more.
> This would be attached to "virtual host A", which would go critical on 0,
> warn on 1 and be OK on 2 or more of the servers that the service runs on.
>
> I'd also like a "simple" login to the web page which would only display
> the "clusters" of services/hosts, rather than the total view, which would
> allow our support engineers to easily see real problems, and allow
> management to sleep happily with lots of green lights.
>
> I must admit I'm leaning towards the virtual host/service approach, but I
> was wondering if there's a standard/better way of monitoring this kind of
> thing?
>
> Thanks
>
> http://www.bbc.co.uk
> This e-mail (and any attachments) is confidential and may contain personal
> views which are not the views of the BBC unless specifically stated.
> If you have received it in error, please delete it from your system.
> Do not use, copy or disclose the information in any way nor act in
> reliance on it and notify the sender immediately.
> Please note that the BBC monitors e-mails sent or received.
> Further communication will signify your consent to this.
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2005.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
>