More on notifications and reboot monitoring

Carson Gaspar carson+nagiosusers at taltos.org
Wed Jan 5 23:48:50 CET 2005


I think I have discovered the cause of all of my problems:

Notifications are only ever triggered by check results.

Life would have been so much easier if that were documented. So in my
environment, which is passive-only for scalability reasons, if a host goes
down and stays down, the only checks that will ever trigger notifications
are pings (as they run centrally) and freshness checks.

So in order to do reboot monitoring, my choices are limited (without 
writing active agents). I _think_ this should work - comments?

- On shutdown, start a 30-minute (tweak to taste) scheduled downtime for Ping
(so ping won't whine about the rebooting host being down).
- On shutdown, send a passive Reboot_Down CRIT. Reboot_Down doesn't
notify anyone.
- On startup, send a passive Reboot_Up CRIT. Reboot_Up depends on 
Reboot_Down, so if the server was shut down cleanly, no notification will 
go out.
- On startup, after that CRIT, send a passive Reboot_Up OK followed by a
Reboot_Down OK.
- Reboot_Up and Reboot_Down have freshness checks disabled.
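
To make the ordering of the passive results concrete, here is a rough sketch
that only formats them as Nagios external command lines
(PROCESS_SERVICE_CHECK_RESULT). The helper names and the plugin output
strings are made up for illustration. How the lines get from the rebooting
host to the Nagios server is not covered, and neither is the Ping downtime
step.

```python
import time

OK, CRITICAL = 0, 2  # standard Nagios plugin return codes


def passive_result(host, service, code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line.

    Hypothetical helper: it only builds the string, it does not write
    to the Nagios command file.
    """
    ts = int(time.time() if now is None else now)
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{code};{output}"


def shutdown_commands(host, now=None):
    # Clean shutdown: mark Reboot_Down CRIT so that the dependent
    # Reboot_Up service stays quiet on the next boot.
    return [passive_result(host, "Reboot_Down", CRITICAL, "clean shutdown", now)]


def startup_commands(host, now=None):
    # Order matters: Reboot_Up CRIT first (this is the result that may
    # notify), then reset both services to OK.
    return [
        passive_result(host, "Reboot_Up", CRITICAL, "host booted", now),
        passive_result(host, "Reboot_Up", OK, "host is up", now),
        passive_result(host, "Reboot_Down", OK, "reset after boot", now),
    ]
```

Calling startup_commands("web1") would give three lines to submit in that
order; "web1" is just a placeholder host name.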

I'd cancel the downtime if I could on startup, but there's no good way to 
get the downtime ID remotely. I could write an agent that runs on the 
nagios server if I decided I really cared.

So on a normal reboot, no alarms. On a reboot that never comes back, ping
will alarm after the downtime ends. On an abnormal reboot, Reboot_Up will
alarm, because Reboot_Down will be OK (or Unknown) and so the dependency
won't suppress the notification.
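
The three outcomes can be written down as a toy model. This is not how
Nagios evaluates dependencies internally. It assumes the dependency only
blocks Reboot_Up notifications while Reboot_Down is CRIT, and it treats an
unclean shutdown that never comes back as a plain ping alarm (my inference,
since no downtime would have been scheduled). The function name is invented.

```python
OK, CRITICAL = 0, 2  # standard Nagios plugin return codes


def outcome(clean_shutdown, came_back):
    """Return which alarm, if any, the scheme is expected to raise."""
    # Reboot_Down is only set to CRIT by the shutdown script, so it is
    # still OK if the host went down without running it.
    reboot_down = CRITICAL if clean_shutdown else OK

    if not came_back:
        # No startup script ran, so no Reboot_Up result arrives.
        # Only the centrally run ping check can notice.
        return "ping alarm"

    # Startup sends Reboot_Up CRIT. Assumed rule: the dependency on
    # Reboot_Down suppresses the notification only while it is CRIT.
    if reboot_down == CRITICAL:
        return "no alarm"
    return "Reboot_Up alarm"


for clean, back in [(True, True), (True, False), (False, True)]:
    print(clean, back, outcome(clean, back))
```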

-- 
Carson


