[PATCH] - 3.0.3: only send out a service recovery escalation if a service is recovering from a non-OK state listed in the escalation or only 'r' is specified as an escalation option

Thomas Guyot-Sionnest dermoth at aei.ca
Mon Jul 20 15:38:11 CEST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 19/07/09 08:13 AM, Max wrote:
> Hi,
> 
> We make heavy use of escalations in our Nagios installation; a
> behavior of Nagios that was annoying some of our users was that if
> they specified 'r' in an escalation they would receive recovery
> notifications for services recovering from states that were not
> mentioned in the escalation definition (escalation_options field).
> 
> Example:
> 
> define serviceescalation {
>     hostgroup_name          nagios_hosts
>     contact_groups          admins
>     first_notification      1     ; notify right away (1st notification)
>     last_notification       0     ; notify until service changes state
>     notification_interval   240   ; default to 4 hour re-notify
>     escalation_period       24x7
>     service_description     .*
>     escalation_options      c,r
>     register                0
> }
> 
> In this case we would receive recovery service escalations for
> services returning to OK from WARNING when we never were sent a
> WARNING escalation notification because it is not a state listed in
> escalation_options.  With the patch a recovery is only sent if the
> previous state of the service was CRITICAL and an escalation
> notification had been sent out because the escalation rule had
> previously matched on CRITICAL.
> 
> I have added code to
> 
> int is_valid_escalation_for_service_notification(service *svc,
> serviceescalation *se, int options)
> 
> in notifications.c to add this logic:
> 
> If escalation is of type recovery:
> * If the escalation rule does not specify recovery, return FALSE
> (original logic)
> * If the escalation rule specifies one or more problem states
> ** If this is the first hit on this escalation rule, return FALSE as
> no escalation was sent previously for a problem with this rule
> ** If the previous problem state of the service matching the
> escalation is NOT listed in the escalation, return FALSE as we did not
> previously send out a problem escalation for the state this service
> is recovering from
> 
> This also allows the user to specify JUST 'r' in the
> escalation_options field and have the escalation rule work as
> expected: recovery notifications will be sent whenever they match the
> rule, regardless of any previous problem states.
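
The decision steps quoted above can be sketched in isolation. This is a
minimal, self-contained illustration and not the code from the patch: the
types escalation_opts and service_info, their fields, and the function
recovery_escalation_is_valid() are hypothetical stand-ins, not the real
structures or signatures in notifications.c.

```c
#include <stdbool.h>

/* Service states, numbered as in Nagios plugin return codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Hypothetical stand-in for the flags parsed from escalation_options. */
typedef struct {
    bool on_warning;   /* 'w' */
    bool on_unknown;   /* 'u' */
    bool on_critical;  /* 'c' */
    bool on_recovery;  /* 'r' */
} escalation_opts;

/* Hypothetical stand-in for the service fields this check needs. */
typedef struct {
    int  last_problem_state;         /* state the service is recovering from */
    bool escalation_matched_before;  /* did this rule already match the problem? */
} service_info;

/* Is the given problem state listed in the escalation's options? */
static bool lists_state(const escalation_opts *e, int state) {
    switch (state) {
    case STATE_WARNING:  return e->on_warning;
    case STATE_UNKNOWN:  return e->on_unknown;
    case STATE_CRITICAL: return e->on_critical;
    default:             return false;
    }
}

/* Decide whether a recovery notification should use this escalation. */
bool recovery_escalation_is_valid(const service_info *s, const escalation_opts *e) {
    /* Rule does not specify recovery: never valid (original logic). */
    if (!e->on_recovery)
        return false;
    /* Only 'r' given: valid regardless of previous problem states. */
    if (!e->on_warning && !e->on_unknown && !e->on_critical)
        return true;
    /* No problem escalation was sent under this rule before. */
    if (!s->escalation_matched_before)
        return false;
    /* The state being recovered from must be listed in the rule. */
    return lists_state(e, s->last_problem_state);
}
```

With escalation_options set to c,r as in the example, a service recovering
from WARNING is rejected by the last check, while one recovering from
CRITICAL after a matched escalation is accepted.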

So according to the example above, if the service does not recover
immediately but returns UNKNOWN or WARNING before recovering, you will
not receive any recovery notification.

A proper patch would have to record all non-OK states seen during a
failure and use that record as the basis for determining whether or not
to send a recovery notification.
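
One possible reading of that idea, as a rough sketch only: keep a bitmask
of every non-OK state seen since the service left OK, and test it against
the problem states listed in the escalation when the service recovers.
All names here (failure_history, OPT_*, record_state, should_send_recovery,
reset_history) are hypothetical and do not exist in Nagios. The sketch also
ignores whether an escalation notification was actually sent for a recorded
state.

```c
#include <stdbool.h>

/* Service states, numbered as in Nagios plugin return codes. */
enum { STATE_OK = 0, STATE_WARNING = 1, STATE_CRITICAL = 2, STATE_UNKNOWN = 3 };

/* Hypothetical bit flags for the problem states in escalation_options. */
enum { OPT_WARNING = 1 << 0, OPT_UNKNOWN = 1 << 1, OPT_CRITICAL = 1 << 2 };

/* Hypothetical per-service record of states seen during one failure. */
typedef struct {
    unsigned seen;  /* OR of OPT_* flags for every non-OK state observed */
} failure_history;

/* Map a service state to its flag; OK and unknown values map to 0. */
static unsigned flag_for_state(int state) {
    switch (state) {
    case STATE_WARNING:  return OPT_WARNING;
    case STATE_UNKNOWN:  return OPT_UNKNOWN;
    case STATE_CRITICAL: return OPT_CRITICAL;
    default:             return 0;
    }
}

/* Call on every check result while the service is in a problem state. */
void record_state(failure_history *h, int state) {
    h->seen |= flag_for_state(state);
}

/* Call when the service returns to OK. problem_opts is the OR of the
 * OPT_* flags listed in the escalation; has_recovery_opt is true if the
 * escalation lists 'r'. */
bool should_send_recovery(const failure_history *h, unsigned problem_opts,
                          bool has_recovery_opt) {
    if (!has_recovery_opt)
        return false;
    if (problem_opts == 0)   /* only 'r' specified */
        return true;
    return (h->seen & problem_opts) != 0;
}

/* Call after the recovery has been handled, before the next failure. */
void reset_history(failure_history *h) {
    h->seen = 0;
}
```

In the c,r example, a service that goes CRITICAL, then WARNING, then OK
would have both flags recorded, so the recovery would still be sent.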

- --
Thomas
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFKZHND6dZ+Kt5BchYRAsjiAJ0dE/NVycw/XkjPdRinUYXEf7azGwCff8U+
eNX5nPLtfQlAhlSC0i8bRoM=
=BAKI
-----END PGP SIGNATURE-----
