More on notifications and reboot monitoring

Andreas Ericsson ae at op5.se
Thu Jan 6 14:51:20 CET 2005


Carson Gaspar wrote:
> I think I have discovered the cause of all of my problems:
> 
> Notifications are only ever triggered by check results
> 
> Life would have been so much easier if that were documented. So in my 
> environment, which is passive only for scalability reasons, if a host 
> goes down and stays down the only checks that will ever trigger 
> notifications are pings (as they run centrally) and freshness checks.
> 
> So in order to do reboot monitoring, my choices are limited (without 
> writing active agents). I _think_ this should work - comments?
> 
> - On shutdown, start 30 minute (tweak to taste) scheduled downtime for 
> Ping (so ping won't whine about the rebooting host being down)
> - On shutdown, send a passive Reboot_Down CRIT, but Reboot_Down doesn't 
> notify anyone
> - On startup, send a passive Reboot_Up CRIT. Reboot_Up depends on 
> Reboot_Down, so if the server was shut down cleanly, no notification 
> will go out.
> - On startup, send a passive Reboot_Up OK followed by a Reboot_Down OK
> - Reboot_Up and Reboot_Down have freshness checks disabled.
> 
> I'd cancel the downtime if I could on startup, but there's no good way 
> to get the downtime ID remotely. I could write an agent that runs on the 
> nagios server if I decided I really cared.
> 
> So on a normal reboot, no alarms. On a reboot that never comes back, 
> ping will alarm after the downtime ends. On an abnormal reboot, 
> Reboot_Up will alarm (as Reboot_Down will be OK (or Unknown)).
> 

Ehrm. Scheduled downtime exists for exactly this sort of thing. If you 
want to add a script that submits a 5 minute (or so) downtime 
whenever you run reboot, then by all means feel free. If you make it 
clean, I'm sure lots of other users would be interested. I don't think 
it's a good idea to put that logic in the Nagios daemon though, as 
it can never know whether a host was shut down deliberately or crashed, 
so I don't quite see the point of this email. Care to clarify?
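
A minimal sketch of what such a reboot script could look like, in Python. 
It builds a SCHEDULE_SVC_DOWNTIME external command and writes it to the 
Nagios command file. Several things here are assumptions, not something 
from this thread: the command file path, the host name "web1", the author 
and comment strings, and the field layout, which is the one with a 
trigger_id field (older Nagios versions may not have that field, so check 
the external command documentation for your version). Since this setup is 
passive only, the script would have to run on, or be relayed to, the 
Nagios server.

```python
import time

# Assumed location of the Nagios external command file (a named pipe).
# The real path depends on how Nagios was installed and configured.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def downtime_command(host, service, minutes, author, comment, now=None):
    """Build a SCHEDULE_SVC_DOWNTIME line for a fixed downtime starting now."""
    start = int(time.time()) if now is None else int(now)
    end = start + minutes * 60
    # Assumed field order:
    #   host;service;start;end;fixed;trigger_id;duration;author;comment
    # fixed=1: the downtime runs exactly from start to end.
    # trigger_id=0: the downtime is not triggered by another downtime.
    return "[%d] SCHEDULE_SVC_DOWNTIME;%s;%s;%d;%d;1;0;%d;%s;%s" % (
        start, host, service, start, end, end - start, author, comment)


def submit(line, command_file=COMMAND_FILE):
    """Write one command line to the external command file."""
    with open(command_file, "w") as pipe:
        pipe.write(line + "\n")
```

Called from a shutdown hook, it might be used as 
submit(downtime_command("web1", "Ping", 5, "reboot-script", "planned reboot")). 
Keeping the string building separate from the write makes it easy to 
print the line and inspect it before pointing the script at a live 
command file.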

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




