Sane (ab)use of escalations?

Wil Cooley wcooley at nakedape.cc
Mon Aug 16 00:28:27 CEST 2010


We are in the process of rebuilding our Nagios setup, an ancient 1.x
installation, on 3.2.1.

We want to re-implement the following notification policy:
 * During the work day, everyone in the contact group gets notifications
 * Outside of working hours,
   * The on-call person gets paged immediately
   * Everyone in the contact group gets paged after 5 notifications

This is currently implemented by having 2 contact objects for every
user, one for work hours and one for off hours, each with the
appropriate *_notification_period settings. Notifications go to all
N*2 contacts, and each contact's notification periods determine
whether or not it gets paged.

I believe we can implement the policy using escalations. First, all
hosts & services have a contact that delivers notifications to a shared
IMAP folder, to satisfy the requirement that these objects have
contacts.
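
A minimal sketch of what such a catch-all contact might look like. The
contact name and e-mail address are hypothetical, and it assumes the
24x7 timeperiod and the notify-*-by-email commands from the stock
sample configuration are available:

```
define contact {
  contact_name                    imap-archive            ; hypothetical name
  alias                           Shared IMAP notification folder
  email                           nagios-archive@example.com  ; hypothetical address
  host_notifications_enabled      1
  service_notifications_enabled   1
  host_notification_period        24x7
  service_notification_period     24x7
  host_notification_options       d,u,r
  service_notification_options    w,u,c,r
  host_notification_commands      notify-host-by-email
  service_notification_commands   notify-service-by-email
}
```

Hosts and services would then list imap-archive in their contacts
directive, so every object has at least one contact regardless of what
the escalations do.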

Next, to implement the work-hours piece, a service escalation such as:

define serviceescalation {
  hostgroup_name        unix-hosts
  service_description   *
  contact_groups        unix-admins
  first_notification    1            ; apply from the very first notification
  last_notification     0            ; 0 = no upper limit
  escalation_period     work-hours
}

And a corresponding hostescalation.
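
For illustration, the corresponding host escalation for the work-hours
piece might look like the sketch below. It simply mirrors the service
escalation above and is untested:

```
define hostescalation {
  hostgroup_name        unix-hosts
  contact_groups        unix-admins
  first_notification    1            ; apply from the very first notification
  last_notification     0            ; 0 = no upper limit
  escalation_period     work-hours
}
```

Host escalations take no service_description, and if
escalation_options were used here the valid values would be d,u,r
rather than the w,u,c,r used for services.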

To implement the off-hours piece, escalations such as:

define serviceescalation {
  hostgroup_name        unix-hosts
  service_description   *
  contact_groups        on-call
  first_notification    1            ; page on-call immediately
  last_notification     0            ; 0 = no upper limit
  escalation_period     off-hours
}

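define serviceescalation {
  hostgroup_name        unix-hosts
  service_description   *
  contact_groups        unix-admins
  first_notification    5            ; whole group joins at the 5th notification
  last_notification     0            ; 0 = no upper limit
  escalation_options    c            ; only while the service is CRITICAL
  escalation_period     off-hours
}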
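
The escalations above reference timeperiods named work-hours and
off-hours. A sketch of how those two could be defined, assuming (purely
for illustration) a Monday-Friday 09:00-17:00 work day:

```
define timeperiod {
  timeperiod_name   work-hours
  alias             Working hours (assumed Mon-Fri 09:00-17:00)
  monday            09:00-17:00
  tuesday           09:00-17:00
  wednesday         09:00-17:00
  thursday          09:00-17:00
  friday            09:00-17:00
}

define timeperiod {
  timeperiod_name   off-hours
  alias             Everything outside work-hours
  monday            00:00-09:00,17:00-24:00
  tuesday           00:00-09:00,17:00-24:00
  wednesday         00:00-09:00,17:00-24:00
  thursday          00:00-09:00,17:00-24:00
  friday            00:00-09:00,17:00-24:00
  saturday          00:00-24:00
  sunday            00:00-24:00
}
```

The two periods are written out explicitly as exact complements of each
other, since the policy relies on every moment of the week falling into
exactly one of them.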

With this setup, all "real" notifications happen through these
escalations, in effect using them as a "servicenotification" object,
which kinda seems better from a data-normalization perspective (but
IANADBA).

Does this seem like it should work as I expect? Can anyone see any
problems? I have not actually tried this, so I don't know if it works.

As an aside, does anyone have any ideas of how to test this kind of
thing? Perhaps submitting passive check results and having timeperiods
of "oddminutes" and "evenminutes"?

Wil


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null




