More on notifications and reboot monitoring

Joe Rhett jrhett at meer.net
Sat Jan 8 23:53:26 CET 2005


On Wed, Jan 05, 2005 at 05:48:50PM -0500, Carson Gaspar wrote:
> I think I have discovered the cause of all of my problems:
> 
> Notifications are only ever triggered by check results
> 
> Life would have been so much easier if that were documented. 

The first statement in the documentation of notifications is:

	When Do Notifications Occur?

	The decision to send out notifications is made in the service check and
	host check logic. Host and service notifications occur in the following
	instances...

	    * When a hard state change occurs.

So it both tells you that notifications are part of the check logic, and it
tells you that notifications can only occur when there is a state change.
State can only change as the result of a check.

So I get that you didn't grasp the obvious, but it's a bit of a stretch
to say that this isn't documented.
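
As a toy illustration of that reading of the documentation (a simplified
sketch, not Nagios's actual implementation; the class, its attribute names
and the max_check_attempts default are made up for the example), the
notification decision lives entirely inside the code that handles a check
result, so with no incoming results nothing can ever fire:

```python
OK, WARNING, CRITICAL = 0, 1, 2


class Service:
    """Toy model of 'notify on hard state change' (hypothetical, simplified)."""

    def __init__(self, max_check_attempts=3):
        self.max_check_attempts = max_check_attempts
        self.hard_state = OK
        self.attempt = 0          # consecutive non-OK results seen so far
        self.notifications = []   # states we would have notified about

    def process_check_result(self, state):
        # This is the only entry point: state changes and notifications
        # both happen here, and nowhere else.
        if state == OK:
            self.attempt = 0
            candidate = OK
        else:
            self.attempt += 1
            if self.attempt < self.max_check_attempts:
                return  # still a soft state, no notification yet
            candidate = state
        if candidate != self.hard_state:
            # Hard state change: this is where a notification goes out.
            self.hard_state = candidate
            self.notifications.append(candidate)
```

Feeding it three CRITICAL results in a row produces one notification, a
later OK produces a recovery, and feeding it nothing produces nothing.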

> environment, which is passive only for scalability reasons, if a host goes 
> down and stays down the only checks that will ever trigger notifications 
> are pings (as they run centrally) and freshness checks.
> 
> So in order to do reboot monitoring, my choices are limited (without 
> writing active agents). I _think_ this should work - comments?
> 
> - On shutdown, start 30 minute (tweak to taste) scheduled downtime for Ping 
> (so ping won't whine about the rebooting host being down)
> - On shutdown, send a passive Reboot_Down CRIT, but Reboot_Down doesn't 
> notify anyone
> - On startup, send a passive Reboot_Up CRIT. Reboot_Up depends on 
> Reboot_Down, so if the server was shut down cleanly, no notification will 
> go out.
> - On startup, send a passive Reboot_Up OK followed by a Reboot_Down OK
> - Reboot_Up and Reboot_Down have freshness checks disabled.
 
I'm sure you can make this work, but it's definitely the long way around.

I'd re-examine your prohibition against active checks, because you're
effectively writing active monitoring support, and it requires both more
code/complexity and more resources than just using normal ping checks
and scheduled downtime.
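
For reference, the passive submissions in the quoted plan could be sketched
roughly like this. It uses the PROCESS_SERVICE_CHECK_RESULT external command
and the Reboot_Up/Reboot_Down service names from the plan. The command file
path, the host name, the plugin output strings and the helper function names
are assumptions for illustration, and a distributed setup would more likely
relay these results through NSCA than write the command file directly:

```python
import time

# Assumption: default command file location of a source install.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"

OK, CRITICAL = 0, 2


def passive_result(host, service, code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        ts, host, service, code, output)


def shutdown_results(host, now=None):
    # Sent from the shutdown script: Reboot_Down goes CRIT (no notification).
    return [passive_result(host, "Reboot_Down", CRITICAL,
                           "clean shutdown in progress", now)]


def startup_results(host, now=None):
    # Sent from the startup script, in the order given in the plan.
    return [
        passive_result(host, "Reboot_Up", CRITICAL, "host booting", now),
        passive_result(host, "Reboot_Up", OK, "host is up", now),
        passive_result(host, "Reboot_Down", OK, "boot complete", now),
    ]


def submit(lines, path=COMMAND_FILE):
    # The command file is a named pipe that Nagios reads from.
    with open(path, "a") as cmd:
        cmd.writelines(lines)
```

A shutdown script would call submit(shutdown_results("web01")) and a startup
script submit(startup_results("web01")), where web01 is a placeholder host
name. The scheduled downtime for Ping would be a separate external command
whose exact field list depends on the Nagios version, so it is left out here.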

-- 
Joe Rhett
Senior Geek
Meer.net


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null