first_notification_delay for hosts

Mathias Sundman mathias at nilings.se
Thu Dec 1 12:33:55 CET 2005


On Thu, 24 Nov 2005, Andreas Ericsson wrote:

> This patch adds a variable to the host object configuration, 
> first_notification_delay, which causes notifications for a host to be put off 
> until a minimum amount of time has passed.
>
> This is intended to artificially mimic the service notification logic that 
> allows some time to pass between a detected error and the first notification 
> by forcing at least some "sleep-time" between the HARD detection of a downed 
> host and the first notification sent for it.
>
> Because of how notifications are scheduled, no host notification is sent 
> until the host has first been checked max_check_attempts times (run 
> serially) and a service (or the host) has then been checked again. If the 
> host is still down at that point, the notification is sent, provided 
> (first_notification_delay * interval_length) seconds have passed.
>
> I did the documentation update. All credits for the code should go to Mathias 
> Sundman, a Sungard employee and also a customer of ours who sent the patch to 
> me for review. I'm forwarding it to the list with his explicit consent. I've 
> tested it and found it to be in good working order.
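
As a rough illustration of the timing rule described in the quoted text 
(this is not the patch's actual C code), the check can be sketched in 
Python. The function name and its arguments are hypothetical; the sketch 
assumes the delay is measured from the moment the host entered a HARD down 
state, that the notification is allowed once exactly that many seconds have 
elapsed, and that interval_length is 60, the Nagios default.

```python
def host_notification_due(hard_down_since, now, first_notification_delay,
                          interval_length=60):
    """Return True once the first host notification may be sent.

    hard_down_since and now are Unix timestamps in seconds (hypothetical
    inputs). first_notification_delay is expressed in units of
    interval_length seconds, as in the host object configuration.
    """
    # Convert the configured delay into seconds.
    delay_seconds = first_notification_delay * interval_length
    # Assumption: reaching the delay exactly is enough to notify.
    return (now - hard_down_since) >= delay_seconds
```

With first_notification_delay set to 15 and interval_length at 60, the 
first notification would be held back for 900 seconds, and would then go 
out on the next check that still finds the host down.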

Ethan, do you think this patch has any chance of making it into Nagios 
2.0?

Some background on why I wrote this patch: many of the hosts we monitor 
can lose network connectivity for a while (say 10-30 minutes) without it 
being a problem, but if they go down permanently we want to be notified.

To achieve this we had to set up a dummy notification group for the 
host, and then use escalations to be notified after a number of 
notification_intervals had elapsed. That solution had a number of 
drawbacks and felt more like a workaround than a real solution.

Then I searched the list archive and found a number of other people with 
the same problem as me but no other solution than the escalation method.

So I decided to put this patch together, and it works very well for us in 
our production environment at least...

Cheers // Mathias

-- 
A. Because people read from top to bottom.
Q. Why should I not top-post?
_________________________________________________________
Mathias Sundman              (^)   ASCII Ribbon Campaign
NILINGS AB                    X    NO HTML/RTF in e-mail
Tel: +46-(0)8-666 32 28      / \   NO Word docs in e-mail

