Nagios escalations - help needed

Gregg Strickland gstrickland at live365.com
Mon Sep 6 05:15:41 CEST 2004


I'm not sure if this is it or not, but if your host was down then Nagios
should stop checking the services on that host until the host comes back up.
If that's the case then the service would never get to notification #2, and
therefore the service warning would never escalate.  Because webdev (e-mail)
is on the hostgroup they would get e-mail that the host was down, but not a
page.

You could put in a hostgroup escalation to make sure they get paged when the
host is down.
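
A minimal sketch of such a hostgroup escalation, assuming the Nagios 1.x
hostgroupescalation object type and reusing the hostgroup and contact group
names from the config quoted below.  The file name, notification range and
interval are illustrative assumptions, not tested values.  Contact groups
named in an escalation stand in for the hostgroup's defaults on the
notifications it covers, so online and webdev are listed again to keep the
e-mail going alongside the page.  The webdev-oncall contact is set to
nonworkhours, so it would only be paged outside work hours.

HostGroupEscalation.cfg (hypothetical file name)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Assumed values: escalate from host notification 2 onwards
# (last_notification 0 = no upper limit), repeating every 360 time units
# (minutes with the default interval_length) while the host stays down.
define hostgroupescalation {
    hostgroup_name    Macquarie_internet_zone
    first_notification    2
    last_notification    0
    notification_interval    360
    contact_groups    online,webdev,webdevoncall
}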

Now if you're getting e-mail for the services after the host is down then
everything I said above wouldn't apply ;o)  Anyway... hope it at least helps
think through the problem some.

-greggs

-----Original Message-----
From: nagios-users-admin at lists.sourceforge.net
[mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Collins,
Steve
Sent: Sunday, September 05, 2004 7:13 PM
To: nagios-users at lists.sourceforge.net
Cc: linux at lists.samba.org
Subject: [Nagios-users] Nagios escalations - help needed

I have the following need:

1. Monitor several services 24x7 on a group of hosts.
2. During workhours, notify a limited group of users every 30 or 60 minutes
   (dependent on the service) by email if any of the services aren't working,
   and subsequent host notifications (which works just fine).
3. 24x7, escalate service notifications so that the following occur:
	* on notification 2 and subsequent, SMS our oncall person every 6 hours
	* on notification 2 ONLY, email our main client and our service desk

I get service and host notifications for what I need by email just fine, but
the escalations don't seem to work.  I had a server die on the weekend and
got no SMS.  Below are the (I believe) appropriate bits of my config files.
I'd like to get the bits for webdev and webdev-oncall working.  After that,
I should be able to add in things like management, etc.

I'd greatly appreciate any advice on what and where I've gone wrong (which
is surely the case).  I know that notifications are very config dependent,
and it's just the figuring out of where I've cruelled the config so that
they are mucked up that I need help with.  What I'd initially like to do is
set it all up with a really close set of notification periods so I can test
it all, and then push the periods back to reality.

I'm using Nagios 1.2 and the latest NagMIN for config editing.

Host.cfg (typical entry)
~~~~~~~~~~~~~~~~~~~~~~~~
define host {
    use    generic-host
    host_name    DONKEY
    alias    DONKEY Server
    address    172.16.2.30
    parents    MacquarieRack
    check_command    check-host-alive
    max_check_attempts    3
    notification_interval    60
    notification_period    24x7
    notification_options    d,u,r
}

HostGroup.cfg
~~~~~~~~~~~~~
define hostgroup {
    hostgroup_name    Macquarie_internet_zone
    alias    MCT Internet Zone
    contact_groups    online,webdev
    members    DONKEY (plus several others)
}

Contact.cfg
~~~~~~~~~~~
define contact {
    use    generic-contact
    contact_name    scollins
    alias    Stephen Collins
    email    steve.collins at industry.gov.au
    service_notification_period    24x7
    host_notification_period    24x7
    service_notification_options    w,u,c,r
    host_notification_options    d,u,r
    service_notification_commands    notify-by-email
    host_notification_commands    host-notify-by-email
}
define contact {
    use    generic-contact
    contact_name    servicedesk
    alias    Service Desk
    email    servicedesk at industry.gov.au
    service_notification_period    24x7
    host_notification_period    24x7
    service_notification_options    w,u,c,r
    host_notification_options    d,u,r
    service_notification_commands    notify-by-email
    host_notification_commands    host-notify-by-email
}
define contact {
    use    generic-contact
    contact_name    webdev-oncall
    alias    Web Development Team Oncall Member
    pager    0421054024 at streetdata.com.au
    service_notification_period    nonworkhours
    host_notification_period    nonworkhours
    service_notification_options    w,u,c,r
    host_notification_options    d,u,r
    service_notification_commands    notify-by-epager
    host_notification_commands    host-notify-by-epager
}

Contactgroup.cfg
~~~~~~~~~~~~~~~~
define contactgroup {
    contactgroup_name    webdev
    alias    Web Development Team Staff
    members    brobinson,imacintosh,mwalsh,rbuerckner,scollins,sjanssens
}
define contactgroup {
    contactgroup_name    webdevoncall
    alias    Web Development Team Staff - Oncall
    members    imacintosh-sms,webdev-oncall
}

Service.cfg
~~~~~~~~~~~
define service {
    use    NM-HTTP
    hostgroup_name    Macquarie_internet_zone
    service_description    Check HTTP [MCT]
    contact_groups    webdev
    check_period    24x7
    notification_interval    30
    notification_options    w,u,c,r
    notification_period    24x7
    check_command    check_http_mct
    max_check_attempts    3
    normal_check_interval    5
    retry_check_interval    1
}

ServiceEscalation.cfg
~~~~~~~~~~~~~~~~~~~~~

define serviceescalation {
    hostgroup_name    Macquarie_internet_zone
    service_description    Check HTTP [MCT]
    first_notification    2
    last_notification    0
    notification_interval    360
    contact_groups    webdevoncall
}
define serviceescalation {
    hostgroup_name    Macquarie_internet_zone
    service_description    Check HTTP [MCT]
    first_notification    2
    last_notification    2
    notification_interval    60
    contact_groups    servicedesk,webpub
}
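
For the close-spaced test run mentioned earlier, one possible sketch is to
shorten the notification intervals on the service and on both escalations.
The values 2 and 5 below are arbitrary test assumptions; every other
directive is copied unchanged from the definitions above.  Since the
webdev-oncall contact only accepts notifications during nonworkhours, a test
run during work hours would not be expected to produce an SMS for that
contact.

Test timings (hypothetical values)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Assumed test value: re-notify every 2 time units instead of 30, so that
# notification 2 is reached quickly.
define service {
    use    NM-HTTP
    hostgroup_name    Macquarie_internet_zone
    service_description    Check HTTP [MCT]
    contact_groups    webdev
    check_period    24x7
    notification_interval    2
    notification_options    w,u,c,r
    notification_period    24x7
    check_command    check_http_mct
    max_check_attempts    3
    normal_check_interval    5
    retry_check_interval    1
}
# Assumed test value: on-call escalation repeats every 5 time units
# instead of 360.
define serviceescalation {
    hostgroup_name    Macquarie_internet_zone
    service_description    Check HTTP [MCT]
    first_notification    2
    last_notification    0
    notification_interval    5
    contact_groups    webdevoncall
}
# Assumed test value: the notification-2-only escalation uses the same
# short interval as the service.
define serviceescalation {
    hostgroup_name    Macquarie_internet_zone
    service_description    Check HTTP [MCT]
    first_notification    2
    last_notification    2
    notification_interval    2
    contact_groups    servicedesk,webpub
}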

Thanks!

Steve
--
Stephen Collins
Web Development Section
eBusiness Division
__________________________________________________
Department of Industry, Tourism and Resources Level 12, 20 Allara Street,
Canberra City ACT 2600 GPO Box 9839, Canberra ACT 2601

E steve.collins at industry.gov.au
P +61 2 62137193
C +61 410 680722
F +61 2 62136227










