nagios on call schedule w/ escalations?

Charlie Reddington charlie.reddington at gmail.com
Thu Oct 2 23:42:41 CEST 2008


Jon, thanks. I got things figured out.

I set up 2 sets of contacts for the same users. The first is the
regular set: these contacts are in the 'admins' group and are only
contacted during their on-call schedule.

I then did almost exactly as you wrote and made a completely
separate set of contacts that can be contacted 24x7.

I have 2 groups: Admins and Escalations.

Escalations uses the second set of 24x7 contacts, and Admins uses
the contacts on the on-call schedule.

Inheritance wasn't really necessary, just the separate groups.

Oh, and I made a separate contact template that uses the proper
contact time period.
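
A minimal sketch of that layout, for illustration only. The names
escalation-contact, user1_24x7 and escalations are made up here, the
generic-contact template and 24x7 timeperiod are assumed to come from
the stock sample config, and only user1 is written out (user2 and
user3 would follow the same pattern):

# Template for the escalation contacts: reachable at any time
define contact{
        name                            escalation-contact
        use                             generic-contact
        host_notification_period        24x7
        service_notification_period     24x7
        register                        0
        }

# Regular contact, only notified during the on-call rotation
define contact{
        contact_name                    user1
        use                             generic-contact
        alias                           user1
        email                           user1
        host_notification_period        user1_oncall
        service_notification_period     user1_oncall
        }

# Same person again as an escalation contact, notified 24x7
define contact{
        contact_name                    user1_24x7
        use                             escalation-contact
        alias                           user1 (escalation)
        email                           user1
        }

# Group used for the normal notifications
define contactgroup{
        contactgroup_name               admins
        alias                           On-call admins
        members                         user1,user2,user3
        }

# Group referenced by the service escalations
define contactgroup{
        contactgroup_name               escalations
        alias                           Escalation contacts
        members                         user1_24x7,user2_24x7,user3_24x7
        }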

Thanks again, works perfectly.

charlie

On Oct 2, 2008, at 3:13 AM, Jon Angliss wrote:

> On Tue, 30 Sep 2008 16:22:16 -0500, Charlie Reddington
> <charlie.reddington at gmail.com> wrote:
>
>> Hi guys / gals,
>>
>> I am working on the final stages of my nagios setup, but I'm entering
>> territory I haven't been in before and could use some guidance.
>
> I'm sure you've probably taken a peek at the "On Call Rotations"
> details in the documentation:
>
>  http://nagios.sourceforge.net/docs/3_0/oncallrotation.html
>
> There are plenty of examples to get a good idea.
>
>> Here's what I'm trying to achieve. We have a team of 3 admins, where
>> we rotate weekly who is on call. Of course, it isn't strictly every
>> 3rd week, because of people having vacation time, etc. So sometimes
>> people are on call for 2 weeks in a row, or every 2 weeks, etc.
>>
>> What we'd like is to have a schedule set up where the primary guy gets
>> woken up first. But if he doesn't answer his call after an hour, it
>> drops down to the rest of us admins. No matter if you're just at home
>> sleeping, or if you're on vacation, you get pinged. After that it goes
>> up to our manager.
>
>> I can figure out the setting of people's initial schedule, as I have
>> it looking something like this....
>>
>> # contacts
>>
>> define contact{
>>        contact_name                    user1
>>        use                             generic-contact
>>        alias                           user1
>>        email                           user1
>>        host_notification_period        user1_oncall
>>        service_notification_period     user1_oncall
>>        }
>>
>> define contact{
>>        contact_name                    user2
>>        use                             generic-contact
>>        alias                           user2
>>        email                           user2
>>        host_notification_period        user2_oncall
>>        service_notification_period     user2_oncall
>>        }
>>
>> define contact{
>>        contact_name                    user3
>>        use                             generic-contact
>>        alias                           user3
>>        email                           user3
>>        host_notification_period        user3_oncall
>>        service_notification_period     user3_oncall
>>        }
>>
>> define contact{
>>        contact_name                    manager1
>>        use                             generic-contact
>>        email                           manager1
>>        }
>>
>> # groups
>>
>> define contactgroup{
>>        contactgroup_name               admins
>>        members                         user1,user2,user3
>>        }
>>
>> define contactgroup{
>>        contactgroup_name               managers
>>        members                         manager1
>>        }
>>
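>> # Time periods
>> # (month names written out in full, lowercase, as in the Nagios 3
>> # timeperiod documentation)
>>
>> define timeperiod{
>>        timeperiod_name user1_oncall
>>        alias           user1 on-call weeks
>>        september 29 - october 5 00:00-24:00
>>        october 20 - october 26 00:00-24:00
>>        november 17 - november 23 00:00-24:00
>>        december 1 - december 7 00:00-24:00
>>        december 15 - december 21 00:00-24:00
>> }
>>
>> define timeperiod{
>>        timeperiod_name user2_oncall
>>        alias           user2 on-call weeks
>>        october 6 - october 12 00:00-24:00
>>        november 3 - november 9 00:00-24:00
>>        november 24 - november 30 00:00-24:00
>>        december 22 - december 23 00:00-24:00
>> }
>>
>> define timeperiod{
>>        timeperiod_name user3_oncall
>>        alias           user3 on-call weeks
>>        october 13 - october 19 00:00-24:00
>>        october 27 - november 2 00:00-24:00
>>        november 10 - november 16 00:00-24:00
>>        december 8 - december 14 00:00-24:00
>> }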
>
>> Do escalations trump the initial contacts?
>>
>> # First escalations
>> define serviceescalation{
>>        hostgroup_name          Servers
>>        service_description     *
>>        first_notification      2
>>        last_notification       3
>>        notification_interval   30
>>        contact_groups          admins
>> }
>>
>> # Second escalations
>> define serviceescalation{
>>        hostgroup_name          Servers
>>        service_description     *
>>        first_notification      3
>>        last_notification       8
>>        notification_interval   60
>>        contact_groups          admins,managers
>> }
>>
>> So I know this isn't quite right, as our admins are part of the admins
>> group, but I'm also trying to restrict when they get contacted. So I'm
>> not really sure how to proceed with this.
>
> You might want to read up on notifications and serviceescalations,
> too... Looking at the time stuff you've got, only 1 of the admins
> will be reachable by notifications at any one point.  This is because
> the "timeperiods" stop nagios from sending notifications to a user
> that is outside their timeperiod.  For example, if a host goes down at
> 2100 on Oct 15th, only user3 will be notified, even after the
> escalations kick in.  Up to the 3rd notification, user3 is the only
> recipient.  It'll only get to another person when the 3rd notification
> goes out and engages the "managers" contact group.
>
> Depending on how many users/admins you're looking at, you could use a
> trick with templating and inheritance: keep your base users as you
> have above, then build escalation users and groups.
>
> define timeperiod {
>    timeperiod_name                AllTimes
>    alias                          All Times
>    sunday                         00:00-24:00
>    monday                         00:00-24:00
>    tuesday                        00:00-24:00
>    wednesday                      00:00-24:00
>    thursday                       00:00-24:00
>    friday                         00:00-24:00
>    saturday                       00:00-24:00
> }
>
> # Template only (register 0); "use" refers to templates by "name"
> define contact {
>    name                           disable_times
>    host_notification_period       AllTimes
>    service_notification_period    AllTimes
>    register                       0
> }
>
> # "name" lets the escalation contact below inherit from user1
> define contact {
>    name                           user1
>    contact_name                   user1
>    use                            generic-contact
>    alias                          user1
>    email                          user1
>    host_notification_period       user1_oncall
>    service_notification_period    user1_oncall
> }
>
> define contact {
>    name                           user2
>    contact_name                   user2
>    use                            generic-contact
>    alias                          user2
>    email                          user2
>    host_notification_period       user2_oncall
>    service_notification_period    user2_oncall
> }
>
> # disable_times is listed first so its notification periods win
> define contact {
>    use                            disable_times,user1
>    contact_name                   user1_esc
> }
>
> define contact {
>    use                            disable_times,user2
>    contact_name                   user2_esc
> }
>
> define contactgroup {
>    contactgroup_name              admins
>    members                        user1,user2
> }
>
> define contactgroup {
>    contactgroup_name              admins_esc
>    members                        user1_esc,user2_esc
> }
>
> Then your service escalations use admins_esc instead of just admins.
> I've not tested it, but looking at the way inheritance works, you
> should be OK.
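>
> As a sketch of that last step (equally untested, and assuming the
> "Servers" hostgroup and the notification numbers from your own first
> escalation), it would look something like:
>
> # Notifications 2-3 go to the 24x7 copies of the admins
> define serviceescalation {
>    hostgroup_name                 Servers
>    service_description            *
>    first_notification             2
>    last_notification              3
>    notification_interval          30
>    contact_groups                 admins_esc
> }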
>
> - --
> Jon Angliss
>
> -------------------------------------------------------------------------
> This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
> Build the coolest Linux based applications with Moblin SDK & win great prizes
> Grand prize is a trip for two to an Open Source event anywhere in the world
> http://moblin-contest.org/redirect.php?banner_id=100&url=/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null

