Nagios Sending Notifications to Contacts not in Escalation Config (Bug?)

Rai Ricafrente maillist at ricafrente.com
Tue Nov 15 02:56:46 CET 2011


Hi guys!

I am now officially baffled by how Nagios handles service escalations and
notifications. I'm using Nagios 3.2.3 on SLES 10 SP3 and my current setup
is this:

service_escalation.cfg:

define serviceescalation {
       service_description     http_80
       host_name               apache02
       first_notification      1
       last_notification       5
       notification_interval   60
       escalation_period       Office_Hours
       contact_groups          unix-sms, dba-email, dev-email
}

define serviceescalation {
       service_description     http_80
       host_name               apache02
       first_notification      6
       last_notification       8
       notification_interval   90
       escalation_period       Office_Hours
       contact_groups          unix-sms, dba-email, dev-email, unix-supervisor, dev-supervisor
}

define serviceescalation {
       service_description     http_80
       host_name               apache02
       first_notification      1
       last_notification       0
       notification_interval   60
       escalation_period       24x7
       contact_groups          unix-admins-email
}

The contacts referenced in service_escalation.cfg are defined in contacts.cfg
like this:

define contact{
        contact_name                            unix-sms
        alias                                   Team UNIX
        host_notification_period                Early_Morning
        service_notification_period             Early_Morning
        host_notification_options               u,d,r
        service_notification_options            w,c,u,r
        host_notification_commands              host-notify-by-epager
        service_notification_commands           notify-by-epager
        email                                   unix at email.org
}

define contact{
        contact_name                            unix-supervisor
        alias                                   Team UNIX Supervisor
        host_notification_period                Early_Morning
        service_notification_period             Early_Morning
        host_notification_options               u,d,r
        service_notification_options            w,c,u,r
        host_notification_commands              host-notify-by-epager
        service_notification_commands           notify-by-epager
        email                                   unixsupervisor at email.org
}

timeperiod.cfg looks like this:

define timeperiod{
        timeperiod_name         Office_Hours
        alias                   Office_Hours
        sunday                  09:00-20:00
        monday                  09:00-20:00
        tuesday                 09:00-20:00
        wednesday               09:00-20:00
        thursday                09:00-20:00
        friday                  09:00-20:00
        saturday                09:00-20:00
}

define timeperiod{
        timeperiod_name         Early_Morning
        alias                   Early_Morning
        sunday                  07:00-22:10
        monday                  07:00-22:10
        tuesday                 07:00-22:10
        wednesday               07:00-22:10
        thursday                07:00-22:10
        friday                  07:00-22:10
        saturday                07:00-22:10
}

With these configurations in place, the http_80 service goes down at 10pm every
night (a scheduled outage). I expected notifications from 10pm onwards to go
*only* to unix-admins-email, since the other two escalations in
service_escalation.cfg are limited to Office_Hours. And they did, at least for
the critical notifications.
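
To make that expectation concrete, here is a minimal Python sketch of how I
assumed escalations are matched. It is only a model of my reading of the
docs, not Nagios's actual code: the helper names (to_minutes, in_period,
expected_groups) are made up for illustration, the 24x7 period is assumed to
be the usual 00:00-24:00 definition (it is not shown in this post), and every
day of the week is treated the same because the timeperiods above use the
same range every day.

```python
# Timeperiods copied from timeperiod.cfg (same range on every day of the week).
TIMEPERIODS = {
    "Office_Hours": ("09:00", "20:00"),
    "Early_Morning": ("07:00", "22:10"),
    "24x7": ("00:00", "24:00"),  # assumption: not shown in the post
}

# Escalations copied from service_escalation.cfg:
# (first_notification, last_notification, escalation_period, contact_groups)
ESCALATIONS = [
    (1, 5, "Office_Hours", ["unix-sms", "dba-email", "dev-email"]),
    (6, 8, "Office_Hours", ["unix-sms", "dba-email", "dev-email",
                            "unix-supervisor", "dev-supervisor"]),
    (1, 0, "24x7", ["unix-admins-email"]),
]


def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight ('24:00' becomes 1440)."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


def in_period(period_name, hhmm):
    """True if the time of day falls inside the named period (end exclusive)."""
    start, end = TIMEPERIODS[period_name]
    return to_minutes(start) <= to_minutes(hhmm) < to_minutes(end)


def expected_groups(notification_number, hhmm):
    """Contact groups I would expect to be notified, under the assumed model."""
    groups = []
    for first, last, period, contact_groups in ESCALATIONS:
        if notification_number < first:
            continue  # escalation has not started yet
        if last != 0 and notification_number > last:
            continue  # escalation already over (last_notification 0 = no limit)
        if not in_period(period, hhmm):
            continue  # outside the escalation_period
        for group in contact_groups:
            if group not in groups:
                groups.append(group)
    return groups


if __name__ == "__main__":
    print(expected_groups(1, "22:00"))
    print(expected_groups(1, "10:00"))
```

Under this model, the first notification at 22:00 matches only the 24x7
escalation, while the same notification at 10:00 would also match the first
Office_Hours escalation.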

Now the fun part. The recovery notification was sent to the
unix-sms, dba-email, dev-email, unix-supervisor and dev-supervisor groups at
7:03am, when the service returned to OK. That is weird, because the critical
notifications from 10pm to 6am (next day) were sent only to the
unix-admins-email group.
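
For reference, a small Python sketch that checks which of the two timeperiods
defined above cover the times involved. It is just arithmetic on the ranges
from timeperiod.cfg; the function name active_periods is made up for
illustration, and it says nothing about how Nagios itself evaluates periods.

```python
# Daily windows from timeperiod.cfg, as (start, end) in minutes since midnight.
PERIODS = {
    "Office_Hours": (9 * 60, 20 * 60),        # 09:00-20:00
    "Early_Morning": (7 * 60, 22 * 60 + 10),  # 07:00-22:10
}


def active_periods(hour, minute):
    """Return the names of the periods that contain the given time of day."""
    now = hour * 60 + minute
    return [name for name, (start, end) in PERIODS.items() if start <= now < end]


if __name__ == "__main__":
    events = {"outage starts": (22, 0), "last critical": (6, 0), "recovery": (7, 3)}
    for label, (hour, minute) in events.items():
        print(f"{label:14} {hour:02d}:{minute:02d} -> {active_periods(hour, minute)}")
```

So 7:03am falls inside Early_Morning (the contacts' notification period) but
outside Office_Hours (the escalation_period of the first two escalations).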

Plus, I read in the Nagios docs that it will not send recovery
notifications to contacts who did not receive the critical/warning/unknown
notifications in the first place.

So my questions are:
Why did Nagios send the recovery alert to the supervisors, who did not know
that the service was down in the first place because they did not receive
the critical alert?
Did Nagios take their defined timeperiods into consideration when it sent
the recovery alert?

TIA!
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users