Notification interval & escalation overlapping

Ethan Galstad nagios at nagios.org
Sun May 21 21:34:54 CEST 2006


Dirk De Coninck wrote:
> Hi all,
> 
> First I would like to thank the Nagios developers for providing us this 
> wonderful tool and sharing it with the community.
> 
> I am using Nagios 2.2 and there seems to be a bug when using escalation 
> overlapping with different notification intervals.
> What I want to achieve is this:
> For all host and service notifications an email is to be sent to the 
> system administrators list with a notification interval of 20 minutes.
> When a critical or down notification (not for warnings) is not 
> acknowledged within 20 minutes an email should be sent to the managers 
> list.
> The managers only want to get 1 notification to report the critical or 
> down state and a recovery notification whenever the status is recovered 
> but only for recoveries that they initially got a notification for.
> 
> To achieve this I created the following escalation definition templates:
> define serviceescalation{
>        name                    mgmt
>        first_notification      2
>        last_notification       3  ; if I put 2 here, they never get the recovery notification
>        contact_groups          mgmt
>        notification_interval   0  ; if I put 20 here, they get 2 critical notifications and no recovery
>        escalation_period       24x7
>        escalation_options      u,c,r
>        register                0
>        }
> 
> define serviceescalation{
>        name                    admins
>        first_notification      3
>        last_notification       0
>        contact_groups          admins
>        notification_interval   20
>        escalation_period       24x7
>        escalation_options      w,u,c,r
>        register                0
>        }
> 
> define hostescalation{
>        name                    host-mgmt
>        first_notification      2
>        last_notification       3
>        contact_groups          mgmt
>        notification_interval   0
>        escalation_period       24x7
>        escalation_options      d,u,r
>        register                0
>        }
> 
> define hostescalation{
>        name                    host-admins
>        first_notification      3
>        last_notification       0
>        contact_groups          admins
>        notification_interval   20
>        escalation_period       24x7
>        escalation_options      d,u,r
>        register                0
>        }
> 
> All hosts and services have the admins as contact group with a 20 
> minutes notification interval.
> First I tried only adding an escalation entry for the mgmt group (worked 
> fine in version 1.3):
> define serviceescalation{
>        use                             mgmt
>        hostgroup_name                 internet-servers
>        service_description             PING
>        }
> 
> define hostescalation{
>        use                     host-mgmt
>        hostgroup_name          internet-servers
>        }
> 
> The result is that I get the first notification to the admins and 20 
> minutes later I get the escalation notification to the mgmt and then 
> nothing anymore.
> 
> Then I tried adding an escalation for the admins:
> define serviceescalation{
>        use                             admins
>        hostgroup_name                 internet-servers
>        service_description             PING
>        }
> 
> define hostescalation{
>        use                     admins
>        hostgroup_name          internet-servers
>        }
> 
> But no result either.
> 
> The only way I can almost make it work is by changing the management 
> template notification interval to 20 and setting the last notification 
> to 2. The admins get the reminders this way, but the mgmt never gets the 
> recovery notification.
> 
> Just a thought, but what would make everything a lot easier is the 
> possibility of defining first_notification, last_notification,  
> notification_interval, escalation_period and escalation_options in the 
> contacts definition (contacts.cfg).
> Since it is advised to work with templates to make things easier, why 
> don't the service definitions inherit most of the parameters like 
> contact groups, escalations, notification interval... from the host 
> definitions unless specified in the service definition?
> Should I submit this as a feature request?
> 
> Sorry for this long email, I wanted to supply all relevant information.
> Thanks for any help you can give me to make this work.
> 
> Kind regards,
> Dirk.
> 

Nagios will only send recovery notifications to the contact(s) that last 
received a problem notification.  This means that if the problem 
persists and is escalated past the management team, the admins will 
receive recovery alerts, but the managers will not.

One possible solution would be to include the managers in the contact 
groups that get notified for subsequent (>2) alerts.  Do this by 
creating duplicate contact definitions for the managers and specifying 
only "r" for the notification options.  The managers will then receive 
only recovery alerts from those later notifications, no matter how many 
problem alerts get sent out to the admins.
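
A minimal sketch of that approach, using the templates from the original 
message.  The contact name "manager1-recovery", the contact group 
"mgmt-recovery" and the email address are hypothetical placeholders, and 
the notification commands are assumed to be the ones from the sample 
configuration; substitute whatever your installation actually defines.

```
# Duplicate of an existing manager contact that only accepts recoveries
define contact{
       contact_name                    manager1-recovery      ; hypothetical name
       alias                           Manager 1 (recoveries only)
       service_notification_period     24x7
       host_notification_period        24x7
       service_notification_options    r                      ; recoveries only
       host_notification_options       r                      ; recoveries only
       service_notification_commands   notify-by-email        ; assumed command name
       host_notification_commands      host-notify-by-email   ; assumed command name
       email                           manager1@example.com   ; placeholder address
       }

# Group holding the recovery-only duplicates
define contactgroup{
       contactgroup_name       mgmt-recovery                  ; hypothetical name
       alias                   Managers (recovery only)
       members                 manager1-recovery
       }

# Admin escalation for notification 3 onward, now also listing the
# recovery-only group so the managers are among the contacts that
# "last received" a notification when the recovery goes out
define serviceescalation{
       name                    admins
       first_notification      3
       last_notification       0
       contact_groups          admins,mgmt-recovery
       notification_interval   20
       escalation_period       24x7
       escalation_options      w,u,c,r
       register                0
       }
```

The same contact_groups change would be needed in the host-admins 
hostescalation template.  Because the duplicate contacts filter on "r", 
the problem alerts sent under this escalation should still reach only 
the admins.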


Ethan Galstad,
Nagios Developer
---
Email: nagios at nagios.org
Website: http://www.nagios.org

