Unexpected Service Escalation Behavior

Narum, Kyle Kyle_Narum at eLoyalty.com
Mon Dec 5 20:09:05 CET 2005


All,
 
I'm running into some unexpected behavior when using Service Escalation
rules to control alerts for errors of type UNKNOWN.  For certain Service
Checks, we don't want our Help Desk to be notified if the check returns
UNKNOWN, but we do want them to be notified of any CRITICAL or WARNING.
So I've set up an escalation rule for that service which specifies only
the "w,c,r" escalation options.

This works fine for the initial UNKNOWN alert (i.e. it only goes to a
secondary email address which acts as a general log for all alerts).
However, when the Service Check recovers, both of our contact groups get
notified: the contact group that received the UNKNOWN alert, and also the
contact group that did not, which therefore receives an "OK/RECOVERY"
message without ever having received an "UNKNOWN" alert.

Here are the details of the configuration I have:

Service A defined as:
--------------------
define service {
	service_description             Service A
	check_period                    24x7
	max_check_attempts              3
	normal_check_interval           5
	retry_check_interval            1
	active_checks_enabled           1
	passive_checks_enabled          1
	parallelize_check               1
	obsess_over_service             1
	check_freshness                 1
	notifications_enabled           1
	notification_options            w,c,r,u,f
	notification_interval           5
	notification_period             24x7
	event_handler_enabled           1
	flap_detection_enabled          1
	process_perf_data               1
	retain_status_information       1
	retain_nonstatus_information    1
	contact_groups                  Contact Group A, Contact Group B
}

Service Escalation for Service A defined as:
-------------------------------------------
define serviceescalation{
	service_description		Service A
	host					Host A
	contact_groups			Contact Group B
	first_notification		1
	last_notification			0
	notification_interval		1440
	escalation_period			24x7
	escalation_options		w,c,r
}

Expected Behavior when Service A goes into an UNKNOWN state and then
goes back to an OK state:
	* Contact Group A gets notified of the UNKNOWN alert
	* Contact Group A gets notified of the OK/RECOVERY

Actual Behavior when Service A goes into an UNKNOWN state and then goes
back to an OK state:
	* Contact Group A gets notified of the UNKNOWN alert
	* Contact Group A AND Contact Group B both get notified of the
	  OK/RECOVERY
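
To make the difference between the two outcomes concrete, here is a small Python sketch. It is only a simplified model of what I am seeing, not Nagios source code, and I am not claiming this is how Nagios decides internally. It assumes each contact group can be reduced to the set of option letters it should be notified for; the function names and the GROUP_OPTIONS table are made up for illustration. A filter that looks only at the current state reproduces the actual behavior, while a filter that also remembers who received the problem alert gives the behavior I expected.

```python
# Simplified model of the behavior described above; NOT Nagios source code.
# Assumption: each contact group is reduced to the set of option letters
# it should be notified for (hypothetical table, for illustration only).
STATE_TO_OPTION = {"WARNING": "w", "CRITICAL": "c", "UNKNOWN": "u", "OK": "r"}

GROUP_OPTIONS = {
    "Contact Group A": {"w", "c", "r", "u"},  # general log address
    "Contact Group B": {"w", "c", "r"},       # Help Desk, no UNKNOWN
}


def stateless_recipients(state):
    """Filter on the current state only (reproduces the actual behavior)."""
    option = STATE_TO_OPTION[state]
    return {group for group, opts in GROUP_OPTIONS.items() if option in opts}


def history_aware_recipients(state, previously_notified):
    """Send a recovery only to groups that received the problem alert."""
    recipients = stateless_recipients(state)
    if state == "OK":
        recipients &= previously_notified
    return recipients


problem = stateless_recipients("UNKNOWN")
observed = stateless_recipients("OK")
expected = history_aware_recipients("OK", problem)

print("UNKNOWN alert:    ", sorted(problem))
print("Actual recovery:  ", sorted(observed))
print("Expected recovery:", sorted(expected))
```

In this model the recovery matches the "r" option for Contact Group B even though that group never saw the UNKNOWN alert, which is the mismatch described above.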

I've read through all the documentation I can find regarding the use of
escalation rules, and I'm fairly sure this isn't the intended behavior.
Has anyone encountered a similar issue, or am I missing something in my
configuration?

System: Nagios 2.04b / OS Fedora Core 4

Thanks,
Kyle 



_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users