Service Escalation Timing Issue

Assaf Flatto nagios at flatto.net
Tue Jun 22 17:31:46 CEST 2010


Jeff Tillotson wrote:
> On Tue, Jun 22, 2010 at 05:53:45AM -0400, Assaf Flatto wrote:
>   
>> Tillotson, Jeff wrote:
>>     
>>> I've got a service that I've set up with the following requirements: e-mail a certain group after the service has been down for 5 minutes, page when the service has been down for 10 minutes, then page again after 30 minutes.  I'm fairly certain my problem is with notification_interval in the service escalation and that I'm misunderstanding this from the documentation:
>>> "When defining notification escalations, it is important to keep in mind that any contact groups that were members of "lower" escalations (i.e. those with lower notification number ranges) should also be included in "higher" escalation definitions. This should be done to ensure that anyone who gets notified of a problem continues to get notified as the problem is escalated."
>>>   
>>>
>>> Following are the configuration options (I've snipped some options down):
>>>
>>> Nagios.cfg:
>>> interval_length=1  (One second)
>>>
>>> Template:
>>>
>>> define service{
>>>         name                            distrib-nevent-graph
>>>         check_period                    24x7
>>>         max_check_attempts              2
>>>         contact_groups                  no-one
>>>         notification_options            w,u,c,r 
>>>         notification_interval           60
>>>         notification_period             24x7
>>>         register                        0      
>>>         }
>>>
>>> Service:
>>> define service{
>>>         use                                       distrib-nevent-graph
>>>         hostgroup_name                  location-v7apache
>>>         service_description              v7apache-check
>>>         }
>>>
>>> Service Escalation:
>>> define serviceescalation {
>>>     hostgroup_name            location-v7apache
>>>     service_description       v7apache-check
>>>     first_notification        5
>>>     last_notification         0
>>>     notification_interval     1800
>>>     contact_groups            nopage, core
>>> }
>>> define serviceescalation {
>>>     hostgroup_name            location-v7apache
>>>     service_description       v7apache-check
>>>     first_notification        10
>>>     last_notification         0
>>>     notification_interval     1800
>>>     contact_groups            page, nopage, core
>>> }
>>>
>>>
>>>   
>>>       
>> If I am reading this right, your first notification is sent after
>> 2.5 hours.
>>
>> 1800 sec = 30 minutes; 30 minutes x 5 (first_notification) = 2.5 hours.
>>
>> You might want to change the interval to 300.
>>
>>     
>
> Thanks for your response.
>
> If I change the interval to 300, then core and nopage get the
> notification every 5 minutes after the 5th notification.  Then page
> won't get the first alert until 30 minutes after the host is down
> (5 at 1min interval + 5 at 5min interval).  What I really want is for nopage
> and core to get notifications after the service has been down for 5 minutes and
> then 30 minutes after.  page should get notifications after the service has been
> down for 10 minutes and 30 minutes after.
>
> I almost think the following will provide what I want but the
> documentation section I posted in my original post makes me think this
> is a bad idea.
>
>
> define serviceescalation {
>       hostgroup_name            location-v7apache
>       service_description       v7apache-check
>       first_notification        5
>       last_notification         0
>       notification_interval     1800
>       contact_groups            nopage, core
> }
> define serviceescalation {
>       hostgroup_name            location-v7apache
>       service_description       v7apache-check
>       first_notification        10
>       last_notification         0
>       notification_interval     1800
>       contact_groups            page
> }
>
> --Jeff
>
>   
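For reference, the timing of the two escalations quoted just above can be sketched in Python. This is a rough model, not Nagios code. It assumes that the interval after notification n comes from the escalations whose range contains n (the smallest one if several match), and that the template's 60-second interval applies when none match. The names timeline, matches, ESCALATIONS and DEFAULT_INTERVAL are made up for the example.

```python
# Hypothetical model of Nagios notification timing; not Nagios code.
DEFAULT_INTERVAL = 60  # template notification_interval, seconds (interval_length=1)

# (first_notification, last_notification, notification_interval, contact_groups)
ESCALATIONS = [
    (5, 0, 1800, ["nopage", "core"]),
    (10, 0, 1800, ["page"]),
]

def matches(first, last, n):
    # last_notification 0 means the range is open-ended
    return n >= first and (last == 0 or n <= last)

def timeline(count):
    """Return (notification number, seconds since notification 1, groups)."""
    rows, t = [], 0
    for n in range(1, count + 1):
        active = [e for e in ESCALATIONS if matches(e[0], e[1], n)]
        # Fall back to the template's contact group when nothing matches
        groups = sorted({g for e in active for g in e[3]}) or ["no-one"]
        rows.append((n, t, groups))
        # Assumption: the smallest interval among matching escalations applies
        t += min((e[2] for e in active), default=DEFAULT_INTERVAL)
    return rows

for n, t, groups in timeline(10):
    print(f"#{n:2d}  +{t / 60:6.1f} min  {', '.join(groups)}")
```

Under these assumptions notification 5 goes out 4 minutes after notification 1, but notification 10, the first one that includes page, is 154 minutes after it.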
I think this is what you need:


define serviceescalation {
      hostgroup_name            location-v7apache
      service_description       v7apache-check
      first_notification        1 ; after 5 minutes
      last_notification         0
      notification_interval     300
      contact_groups            nopage, core
}

define serviceescalation {
      hostgroup_name            location-v7apache
      service_description       v7apache-check
      first_notification        6 ; 5 x 5 minutes = 25 minutes after the first notification
      last_notification         0
      notification_interval     300
      contact_groups            nopage, core
}


define serviceescalation {
      hostgroup_name            location-v7apache
      service_description       v7apache-check
      first_notification        2 ; 2 x 5 minutes
      last_notification         0
      notification_interval     300
      contact_groups            page
}


define serviceescalation {
      hostgroup_name            location-v7apache
      service_description       v7apache-check
      first_notification        8 ; 6 x 5 minutes = 30 minutes after notification 2
      last_notification         0
      notification_interval     300
      contact_groups            page
}
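
To check which contact groups the four definitions above would match at each notification number, here is a small Python sketch. It is a simplified model, not Nagios code. It assumes notifications are 300 seconds apart and that last_notification 0 leaves a range open-ended, as the Nagios documentation describes. The names groups_for, ESCALATIONS and INTERVAL are made up for the example.

```python
# Hypothetical model, not Nagios code: which contact groups match each
# notification number for the four escalations above.
INTERVAL = 300  # notification_interval in seconds (interval_length=1)

# (first_notification, last_notification, contact_groups)
ESCALATIONS = [
    (1, 0, ["nopage", "core"]),
    (6, 0, ["nopage", "core"]),
    (2, 0, ["page"]),
    (8, 0, ["page"]),
]

def groups_for(n):
    """Contact groups whose escalation range contains notification n."""
    found = set()
    for first, last, groups in ESCALATIONS:
        # last_notification 0 leaves the range open-ended
        if n >= first and (last == 0 or n <= last):
            found.update(groups)
    return sorted(found)

for n in range(1, 9):
    minutes = (n - 1) * INTERVAL // 60
    print(f"#{n}  +{minutes:2d} min  {', '.join(groups_for(n))}")
```

In this model the open-ended ranges mean nopage and core match every notification, and page matches every notification from number 2 onward, so it is worth testing the definitions against the intended "once, then again after 30 minutes" schedule before relying on them.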


-- 
Never,Ever Cut A Deal With a Dragon 


I am doing a Charity Bike ride On the 27 of June for the
Capital to Coast Charity. Please help by Donating
http://www.justgiving.com/Lovefilm-capital-to-coast


