Howto monitor intermittent clients

Lars Stavholm stava at telcotec.se
Thu Jan 24 20:04:39 CET 2008


Thomas Guyot-Sionnest wrote:
> On 19/01/08 05:42 AM, Hugo van der Kooij wrote:
>> Lars Stavholm wrote:
>> | Hi All,
>> |
>> | I know this is a bit of an odd one, but still:
>> |
>> | I'm looking for ways of monitoring intermittent
>> | clients, i.e. client computers that are not always
>> | there, like laptops that come and go or similar,
>> | clients that get switched off after work hours.
>> |
>> | Ideally, without installing anything on the clients,
>> | so that sort of rules out passive checking. Instead,
>> | I would like to use active checking (e.g. using SNMP),
>> | but configured in a way so that when a host is unreachable
>> | or down, that's OK, and the services needn't be checked.
>> | Whereas if the client host is on line, all defined
>> | services should be checked and issue alarms and so on
>> | if check results in CRITICAL or WARNING, as per usual.
>>
>> Make sure you can ping them. Then set up a service to do that at a
>> relatively high rate. And make all other services dependent on that ping
>> service. Then say you do not care about unreachable services.
>>
>> Just make sure that ping reaches a hard state before any service can
>> reach a hard state.
> 
> You don't even need that. Obviously you won't need to know when these
> systems are down, so just set the "notification_options" to "n" in the
> host setup and you'll never get paged when the host is down.

Good one. Implemented.

> Nagios check logic is to make sure a host is up before sending a service
> alert. If the host is down, you get a HOST DOWN notification instead of
> any service alert, and obviously if you disable host notifications
> you'll get nothing.

Very good.

> Take note of the following:
> 
> If you need to monitor many hosts, you will likely be more comfortable
> with Nagios 3 as it can schedule host checks in parallel instead of
> running them in serial, which can be a problem if many hosts are down.

I'm on the latest 3.x with just a few hosts.

> Ideally use check_icmp instead of check_ping and fine-tune the
> parameters so that it returns within one second on dead hosts. This will
> make the host check be performed much faster. This is most useful in
> Nagios 2 due to the limitation mentioned above.

I wouldn't know how to tweak it, but I'm trying "check_icmp -n 1"
at the moment, and it seems to take about 1.08 to 1.15 seconds.
Good enough maybe? Or can you give any further advice on tweaking
check_icmp?
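
One way to compare settings is to time the plugin against a host that is
known to be switched off. The sketch below is only an illustration: the
plugin path and the helper names (time_command, icmp_args) are assumptions,
and the -n (packets) and -t (timeout in seconds) values should be checked
against "check_icmp --help" for the installed plugin version.

```python
import subprocess
import sys
import time

# Assumed plugin location; adjust to your installation.
CHECK_ICMP = "/usr/local/nagios/libexec/check_icmp"


def time_command(argv):
    """Run a command and return (exit_code, elapsed_seconds)."""
    start = time.monotonic()
    result = subprocess.run(argv, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode, time.monotonic() - start


def icmp_args(host, packets=1, timeout=1):
    """Build a check_icmp command line: -n packets, -t timeout in seconds."""
    return [CHECK_ICMP, "-H", host, "-n", str(packets), "-t", str(timeout)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the address of a host that is known to be switched off.
    for packets in (1, 2, 3):
        code, elapsed = time_command(icmp_args(sys.argv[1], packets))
        print("-n %d: exit %d after %.2fs" % (packets, code, elapsed))
```

Run it with the address of a powered-off client as the only argument and
keep the combination that stays closest to one second.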

> The tricky part is a host coming back up. On some systems the network
> comes up before all services are up so you'll get notifications for
> these services. There's no easy way around that, but personally I'd look
> into event handlers.

Seems like a good idea. I have a feeling that
the new "Adaptive Monitoring" might be useful.
I will try that in the next step of testing.

Thanks for your help, I'll report back to this
list on my progress.

Thanks
/Lars

> You could set up an event handler for hosts and:
> 
> - When a "volatile" host goes HARD DOWN, disable notifications for all
>   its services by sending the appropriate external command to Nagios.
> - When a "volatile" host goes HARD UP, sleep 1 or 2 minutes, then enable
>   notifications for all services. You will probably have to fork before
>   any sleep in your event handler to avoid blocking Nagios...
> 
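
A minimal sketch of the event handler described above, not a tested
production script. It assumes the external commands
DISABLE_HOST_SVC_NOTIFICATIONS and ENABLE_HOST_SVC_NOTIFICATIONS from the
command list referenced below; the command file path, the 120 second grace
period and all function names are assumptions made for illustration.

```python
#!/usr/bin/env python3
"""Host event handler sketch: mute service notifications while a host is down."""
import os
import sys
import time

# Assumed path; use the command_file value from your nagios.cfg.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"
# Assumed grace period, following the "1 or 2 minutes" suggestion.
SETTLE_SECONDS = 120


def choose_command(state, state_type):
    """Map a host state change to an external command name, or None."""
    if state_type != "HARD":
        return None  # ignore SOFT states, they are still being retried
    if state == "DOWN":
        return "DISABLE_HOST_SVC_NOTIFICATIONS"
    if state == "UP":
        return "ENABLE_HOST_SVC_NOTIFICATIONS"
    return None


def format_command(command, host, now):
    """Build one line in the external command file format."""
    return "[%d] %s;%s\n" % (now, command, host)


def submit(command, host, path=COMMAND_FILE):
    """Write the command to the Nagios command file (a named pipe)."""
    with open(path, "w") as pipe:
        pipe.write(format_command(command, host, time.time()))


def handle(state, state_type, host):
    """React to one host state change reported by Nagios."""
    command = choose_command(state, state_type)
    if command is None:
        return
    if state == "UP":
        # Fork so that Nagios is not blocked while we wait for the
        # client's services to finish starting.
        if os.fork() != 0:
            return  # parent returns to Nagios immediately
        time.sleep(SETTLE_SECONDS)
        submit(command, host)
        os._exit(0)
    submit(command, host)


if __name__ == "__main__" and len(sys.argv) == 4:
    # Expected arguments: $HOSTSTATE$ $HOSTSTATETYPE$ $HOSTNAME$
    handle(sys.argv[1], sys.argv[2], sys.argv[3])
```

The script would be registered as the host's event handler, with the
$HOSTSTATE$, $HOSTSTATETYPE$ and $HOSTNAME$ macros passed as its three
arguments.
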
> References:
> http://nagios.sourceforge.net/docs/2_0/notifications.html
> http://nagios.sourceforge.net/docs/2_0/checkscheduling.html
> http://nagios.sourceforge.net/docs/2_0/eventhandlers.html
> http://www.nagios.org/developerinfo/externalcommands/commandlist.php
> 
> Thomas
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2008.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null
> 







