Recovery notifications after escalations

Andreas Ericsson ae at op5.se
Tue Jun 9 15:47:21 CEST 2009


Marcus Rejås wrote:
> On 06/09 12:06, Ulf Karlsson wrote:
>> Hi,
>>
>> We have a situation here where we would like to notify the on-call
>> group after 60 minutes and the support group after 240 minutes. If
>> services go down and then recover, everyone who has received a
>> notification of a host problem should also receive the recovery
>> notification. See the configuration below.
>>
>> Now, our problem is that when the second escalation has been activated
>> and the support group has received the notification, only the support
>> group will receive the recovery notification - the on-call group will
>> never see the recovery notification.
>>
>> We do not want to send out multiple notifications to the on-call group
>> for the same issue, since they would then be spammed by Nagios
>> unnecessarily.
> 
> I don't (at least not yet) have a good answer. But maybe I can put some ideas
> in your head.
> 
> My first thought is that if they want the recovery notification maybe they
> would not mind the extra one either. The extra one actually tells them that
> the issue was escalated and might be useful information. If they don't want
> the issue to escalate, they should acknowledge it (sticky).
> 
> To make it work the way you ask, I have two suggestions. Neither of them
> is good.
> 
> If you do not have that many contacts, create an additional contact for each
> member of the on-call group that only receives recovery alerts, put them in a
> group, e.g. on-call-recovery, and escalate to that group. They will now get
> the recovery notification.
> 

I don't think they will. There are checks to make sure recovery notifications
are only sent to contacts who have received the previous problem notification.

> Another alternative is to modify your notification-command to take notice of
> the macros $SERVICENOTIFICATIONNUMBER$ and maybe $HOSTNOTIFICATIONNUMBER$ and
> build the logic you wish. Make sure to do it right so you don't miss
> important notifications.
> 
> But, as I said, I don't like either of the ideas. There are very smart people on
> this list and someone will probably give you some more advice.
> 
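
As a rough illustration of the second suggestion, the notification command
could call a small wrapper that decides whether to actually send anything.
This is only a sketch under assumptions: the function name should_send and
the max_problem_number threshold are hypothetical, and the wrapper is assumed
to be handed the values of $NOTIFICATIONTYPE$ and $SERVICENOTIFICATIONNUMBER$
by the command definition.

```python
def should_send(notification_type, notification_number, max_problem_number=1):
    """Decide whether this contact should actually be sent the message.

    notification_type   -- value of $NOTIFICATIONTYPE$, e.g. "PROBLEM" or "RECOVERY"
    notification_number -- value of $SERVICENOTIFICATIONNUMBER$ as an int
    max_problem_number  -- hypothetical threshold: highest problem notification
                           number this contact still wants to receive
    """
    if notification_type != "PROBLEM":
        # Recoveries (and any other non-problem types) are always passed through.
        return True
    # Suppress repeated problem notifications beyond the threshold.
    return notification_number <= max_problem_number


# Example: first problem notification goes out, later ones are dropped,
# and the recovery is delivered regardless of the notification number.
for ntype, number in (("PROBLEM", 1), ("PROBLEM", 2), ("RECOVERY", 2)):
    print(ntype, number, should_send(ntype, number))
```

With something like this in place, the on-call contacts could also be listed
in the later escalation so that they receive the recovery, while the wrapper
drops the repeated problem messages. As noted above, getting the threshold
wrong means missing notifications, so it needs testing.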

Sending a patch to make sure each problem object in the Nagios core contains a
concatenated list of "normal" and escalated contacts would be favourite, since
that would mean everyone who received the problem notification will also get
the recovery notification. This would best be implemented by building a linked
list with only unique elements to operate on. The list should probably contain
a marker indicating which contacts were added by the escalation, so the
original contacts do not get notified if they don't want to get the escalated
notifications.
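
To make the idea concrete, here is a minimal sketch of the bookkeeping in
Python. It is not Nagios code (the core is written in C and would use a linked
list); a dict stands in for the unique list, and the names NotifiedContact,
ProblemContacts and the contact names are all hypothetical. How to treat a
contact that is both a "normal" and an escalated contact is also an assumption
here: the first entry wins.

```python
from dataclasses import dataclass


@dataclass
class NotifiedContact:
    name: str
    from_escalation: bool  # marker: True if added only by an escalation


class ProblemContacts:
    """Unique contacts that have been notified about one problem."""

    def __init__(self):
        # name -> NotifiedContact; dicts keep insertion order and unique keys
        self._contacts = {}

    def add(self, name, from_escalation):
        # Keep the first entry, so a "normal" contact is never re-marked
        # as escalated when an escalation lists the same contact again.
        if name not in self._contacts:
            self._contacts[name] = NotifiedContact(name, from_escalation)

    def escalated_recipients(self):
        # Contacts that were added by an escalation only.
        return [c.name for c in self._contacts.values() if c.from_escalation]

    def recovery_recipients(self):
        # Everyone who got a problem notification also gets the recovery.
        return [c.name for c in self._contacts.values()]


contacts = ProblemContacts()
for name in ("alice", "bob"):   # hypothetical on-call contacts
    contacts.add(name, from_escalation=False)
for name in ("bob", "carol"):   # hypothetical support contacts, bob overlaps
    contacts.add(name, from_escalation=True)

print(contacts.escalated_recipients())
print(contacts.recovery_recipients())
```

The recovery then goes to the whole list, while the marker lets the escalated
problem notifications skip the contacts that were already notified normally.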

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
