Host check notification problem

Jim Stosick jws at lindy.stanford.edu
Tue Mar 30 00:32:21 CEST 2004


Joe Rhett wrote:
> 
> Actually, I don't think (memory late at night?) that you should be getting
> service notifications at all during the host downtime.   This tells me that
> your host is going down, then coming back up or going into an unknown
> status.
> 
> Use the web interface, bring the host down and then find out what status it
> is going into.  I suspect that it's going into an unknown status and that's
> why service checks are still happening and why you're not getting host
> notifications.
> 
> (configuration looks fine, fyi)
 
I don't get service notifications during the host downtime, but I also don't
get repeat host notifications during a long host downtime.  If a service is
down while the host is up, repeat service notifications are sent for as long as
the service is down.  If the host is down, host notifications are sent only
when the host is detected as down and again when it is detected as up.  No
repeat notifications are sent during the host downtime.

Below is the Nagios log for the host across a shutdown and restart.  The host
was down for about an hour (roughly 51 minutes by the log timestamps), and that
gap separates the two sets of log entries.  This case also shows a second
notification anomaly: when the host first came up the web server was still
down, so a host up notification was sent, followed by a service down
notification.  When the web server came up, Nagios recognized this but never
sent a service up notification, though it did call the global service event
handler and the service is listed as up on the web page.  No
notification_options or notification_period setting would prevent sending
that notification.


[1080581300] HOST ALERT: sherpa;DOWN;SOFT;1;CRITICAL - Plugin timed out after 10 seconds
[1080581300] GLOBAL HOST EVENT HANDLER: sherpa;DOWN;SOFT;1;notify-smarts
[1080581312] HOST ALERT: sherpa;DOWN;SOFT;2;CRITICAL - Plugin timed out after 10 seconds
[1080581312] GLOBAL HOST EVENT HANDLER: sherpa;DOWN;SOFT;2;notify-smarts
[1080581323] HOST ALERT: sherpa;DOWN;HARD;3;CRITICAL - Plugin timed out after 10 seconds
[1080581323] HOST NOTIFICATION: forsythe-sun-admin-mail;sherpa;DOWN;notify-forsythe-admin;CRITICAL - Plugin timed out after 10 seconds
[1080581324] GLOBAL HOST EVENT HANDLER: sherpa;DOWN;HARD;3;notify-smarts
[1080581325] SERVICE ALERT: sherpa;service test;CRITICAL;HARD;1;Socket timeout after 20 seconds
[1080581325] GLOBAL SERVICE EVENT HANDLER: sherpa;service test;CRITICAL;HARD;1;notify-smarts

[1080584402] HOST ALERT: sherpa;UP;HARD;1;PING OK - Packet loss = 0%, RTA = 2.00 ms
[1080584402] HOST NOTIFICATION: forsythe-sun-admin-mail;sherpa;UP;notify-forsythe-admin;PING OK - Packet loss = 0%, RTA = 2.00 ms
[1080584403] GLOBAL HOST EVENT HANDLER: sherpa;UP;HARD;1;notify-smarts
[1080584404] SERVICE NOTIFICATION: forsythe-sun-admin-mail;sherpa;service test;CRITICAL;notify-forsythe-admin;Connection refused by host
[1080584467] SERVICE ALERT: sherpa;service test;OK;HARD;1;HTTP ok: HTTP/1.1 200 OK -   3.540 second response time
[1080584467] GLOBAL SERVICE EVENT HANDLER: sherpa;service test;OK;HARD;1;notify-smarts
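
To put a number on the missing repeats, here is a rough Python sketch (not part
of Nagios) that reads the two HOST NOTIFICATION lines from the log above and
works out how many repeat notifications would fit in the outage.  It assumes
notification_interval is counted in minutes, i.e. the default interval_length
of 60 seconds.  The helper names are made up for illustration.

```python
import re

# The two host notification lines from the log above (wrapped lines rejoined).
LOG = """\
[1080581323] HOST NOTIFICATION: forsythe-sun-admin-mail;sherpa;DOWN;notify-forsythe-admin;CRITICAL - Plugin timed out after 10 seconds
[1080584402] HOST NOTIFICATION: forsythe-sun-admin-mail;sherpa;UP;notify-forsythe-admin;PING OK - Packet loss = 0%, RTA = 2.00 ms
"""

# notification_interval from the host definition.
# Assumption: the unit is minutes (interval_length left at 60 seconds).
NOTIFICATION_INTERVAL_MIN = 15

# Fields after the tag are: contact;host;state;command;plugin output
LINE_RE = re.compile(r"^\[(\d+)\] HOST NOTIFICATION: ([^;]*);([^;]*);([^;]*);")


def host_notifications(log_text, host):
    """Return (epoch_seconds, state) for each HOST NOTIFICATION line of `host`."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group(3) == host:
            found.append((int(match.group(1)), match.group(4)))
    return found


def expected_repeats(down_ts, up_ts, interval_min):
    """Count the whole notification intervals that fit inside the outage."""
    return (up_ts - down_ts) // (interval_min * 60)


if __name__ == "__main__":
    notes = host_notifications(LOG, "sherpa")
    (down_ts, _), (up_ts, _) = notes
    print("outage seconds:", up_ts - down_ts)
    print("repeat notifications expected:",
          expected_repeats(down_ts, up_ts, NOTIFICATION_INTERVAL_MIN))
```

With these timestamps the outage is 3079 seconds, so under that assumption
three repeat DOWN notifications would have been expected between the two
entries, and the log shows none.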


This is Nagios 1.2 on Solaris 8.  The relevant definitions are listed below.

define hostgroup {
        hostgroup_name  forsythe-testing-group
        alias           Forsythe Sun Test System
        contact_groups  forsythe-sun-admin-mail
        members         sherpa
}

define host {
        host_name               sherpa
        alias                   sherpa
        address                 sherpa.stanford.edu
        check_command           check-host-alive
        max_check_attempts      3
        notification_interval   15
        notification_period     24x7
        notification_options    d,u,r
}

define service {
        host_name               sherpa
        service_description     service test
        check_command           check_http
        max_check_attempts      3
        normal_check_interval   1
        retry_check_interval    1
        check_period            24x7
        notification_interval   15
        notification_options    w,u,c,r
        notification_period     24x7
        contact_groups          forsythe-sun-admin-mail
}

