Howto monitor intermittent clients

Thomas Guyot-Sionnest dermoth at aei.ca
Sat Jan 19 22:25:30 CET 2008


On 19/01/08 05:42 AM, Hugo van der Kooij wrote:
> Lars Stavholm wrote:
> | Hi All,
> |
> | I know this is a bit of an odd one, but still:
> |
> | I'm looking for ways of monitoring intermittent
> | clients, i.e. client computers that are not always
> | there, like laptops that come and go or similar,
> | clients that gets switched off after work hours.
> |
> | Ideally, without installing anything on the clients,
> | so that sort of rules out passive checking. Instead,
> | I would like to use active checking (e.g. using SNMP),
> | but configured in a way so that when a host is unreachable
> | or down, that's OK, and the services needn't be checked.
> | Whereas if the client host is on line, all defined
> | services should be checked and issue alarms and so on
> | if check results in CRITICAL or WARNING, as per usual.
> 
> Make sure you can ping them. Then setup a service to do that at a
> relative high rate. And make all other services dependent on that ping
> service. Then say you do not care about unreachable services.
> 
> Just make sure that ping reaches a hard state before any service can
> reach a hard state.

You don't even need that. Presumably you don't need to know when these
systems are down, so just set "notification_options" to "n" in the
host definition and you'll never get paged when the host is down.

Nagios's check logic makes sure a host is up before sending a service
alert. If the host is down, you get a HOST DOWN notification instead of
any service alert, and obviously if you disable host notifications
you'll get nothing.

Take note of the following:

If you need to monitor many hosts, you will likely be more comfortable
with Nagios 3, as it can run scheduled host checks in parallel instead
of serially; serial checks can be a problem if many hosts are down.

Ideally use check_icmp instead of check_ping and fine-tune its
parameters so that it returns within one second on dead hosts. This
makes host checks complete much faster, which is most useful in
Nagios 2 due to the limitation mentioned above.
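
As a rough illustration of that tuning, here is a small Python sketch
that builds a check_icmp command line with a short timeout. The plugin
path, the packet count and the one-second timeout are assumptions chosen
for the example, not recommended values; -H, -n and -t are the plugin's
host, packet-count and timeout options. Test the result against a host
you know is switched off before relying on it.

```python
# Assumed install path; adjust to wherever your plugins live.
CHECK_ICMP = "/usr/local/nagios/libexec/check_icmp"


def build_check_icmp_args(host, packets=2, timeout_s=1):
    """Return an argv list for a quick host-alive check.

    -n is the number of packets to send and -t the overall plugin
    timeout in seconds. The defaults here are illustrative only.
    """
    return [CHECK_ICMP, "-H", host, "-n", str(packets), "-t", str(timeout_s)]


if __name__ == "__main__":
    # Print the command line so it can be pasted into a shell or
    # adapted into the command_line of a host check command.
    print(" ".join(build_check_icmp_args("laptop1")))
```

The same arguments could go into the command_line of whatever command
your hosts use as their check_command, with $HOSTADDRESS$ in place of
the hypothetical "laptop1".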

The tricky part is a host coming back up. On some systems the network
comes up before all services are up, so you'll get notifications for
those services. There's no easy way around that, but personally I'd look
into event handlers.

You could set up an event handler for hosts that does the following:

- When a "volatile" host goes HARD DOWN, disable notifications for all
  of its services by submitting the appropriate external command to
  Nagios (see the command list referenced below).
- When a "volatile" host goes HARD UP, sleep 1 or 2 minutes, then enable
  notifications for all of its services again. You will probably have to
  fork before any sleep in your event handler to avoid blocking Nagios.
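
A minimal sketch of such an event handler in Python follows. It is
untested against a live Nagios and rests on several assumptions: the
command file path, the 120-second settle delay, and the argument order
(host state, state type, host name) are all made up for the example.
The two command names, DISABLE_HOST_SVC_NOTIFICATIONS and
ENABLE_HOST_SVC_NOTIFICATIONS, come from the external command list.

```python
#!/usr/bin/env python3
import os
import sys
import time

# Assumed location of the external command file; check your nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"
# Assumed delay before re-enabling service notifications.
SETTLE_SECONDS = 120


def choose_command(state, state_type):
    """Pick the external command for a host state change, or None."""
    if state_type != "HARD":
        return None
    if state == "DOWN":
        return "DISABLE_HOST_SVC_NOTIFICATIONS"
    if state == "UP":
        return "ENABLE_HOST_SVC_NOTIFICATIONS"
    return None


def format_command(name, host, now=None):
    """Format one line in the external command file syntax."""
    timestamp = int(time.time() if now is None else now)
    return "[%d] %s;%s\n" % (timestamp, name, host)


def submit(line, command_file=COMMAND_FILE):
    """Write a single command line to the command file."""
    with open(command_file, "w") as handle:
        handle.write(line)


def handle_event(state, state_type, host):
    name = choose_command(state, state_type)
    if name is None:
        return
    if state == "UP":
        # Fork so that Nagios is not blocked while we wait.
        if os.fork() != 0:
            return  # parent: return to Nagios right away
        # Child: detach and drop inherited stdio so Nagios is not
        # left waiting on our output pipe.
        os.setsid()
        devnull = os.open(os.devnull, os.O_RDWR)
        for fd in (0, 1, 2):
            os.dup2(devnull, fd)
        time.sleep(SETTLE_SECONDS)
        submit(format_command(name, host))
        os._exit(0)
    submit(format_command(name, host))


if __name__ == "__main__" and len(sys.argv) == 4:
    handle_event(sys.argv[1], sys.argv[2], sys.argv[3])
```

The script would be called from the host's event handler command with
$HOSTSTATE$, $HOSTSTATETYPE$ and $HOSTNAME$ as its three arguments.
HARD UNREACHABLE is deliberately ignored here; whether it should be
treated like DOWN depends on your setup.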

References:
http://nagios.sourceforge.net/docs/2_0/notifications.html
http://nagios.sourceforge.net/docs/2_0/checkscheduling.html
http://nagios.sourceforge.net/docs/2_0/eventhandlers.html
http://www.nagios.org/developerinfo/externalcommands/commandlist.php

Thomas
