event handler runs for timeouts, not state changes, why?

Lewis Getschel lgetschel at denver.westerngeco.slb.com
Thu Feb 24 17:34:14 CET 2005


All,
    I didn't set this behaviour up deliberately, and I don't know whether
Nagios is even supposed to be this selective.

Short description:
   My event handler seems to be called ONLY when an 'error' occurs, such as
"connection refused" or "CHECK_NRPE: Socket timeout after 25 seconds.".
I know this because the first thing the script does is echo its parameters
into a file.
   But when a state change occurs, the handler doesn't echo anything,
which suggests that it isn't being called, even though the event log says
it was.

Longer description:
I have an event handler defined for a service. Its first lines echo the
parameters into a text file.

Here is an event that was a timeout and the event handler 
(event_diskmail) is called:
[02-24-2005 04:22:16] SERVICE EVENT HANDLER: dvfs004;linux-fsdisk1;CRITICAL;SOFT;2;event_diskmail
[02-24-2005 04:22:16] SERVICE ALERT: dvfs004;linux-fsdisk1;CRITICAL;SOFT;2;CHECK_NRPE: Socket timeout after 25 seconds.
and here is the text from the /tmp/nagios_event_debug.txt (echo of $1 - $9)
Thu Feb 24 04:22:16 MST 2005
1 -CRITICAL -SERVICESTATE
2 -SOFT -STATETYPE
3 -2 -SERVICEATTEMPT
4 -dvfs004 -HOSTNAME
5 -linux-fsdisk1 -SERVICEDESC
6-9 -CHECK_NRPE: Socket timeout after -OUTPUT
nagios
----------------
OK, that part's fine, the script checks for the timeout errors and exits 
properly.
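
The check itself is not shown in this post. As a minimal sketch of what such an early exit could look like, assuming the handler matches on the plugin output text (the function name is_transport_error and the matched strings are illustrative, not taken from the real script):

```shell
#!/bin/sh
# Hypothetical helper: succeed when the plugin output looks like an NRPE
# transport problem rather than a real disk-space result.
is_transport_error() {
    case $1 in
        *"Socket timeout"*|*"onnection refused"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example with the output from the timeout event above.
if is_transport_error "CHECK_NRPE: Socket timeout after 25 seconds."; then
    echo "transport error, nothing to mail"
fi
```

In a real handler the branch would exit 0 instead of echoing, so that no mail goes out for checks that never reached the disk plugin.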

Here the nagios event log shows that the handler is called for a state 
change (and a 'regular' notification was sent):
[02-24-2005 05:42:35] SERVICE EVENT HANDLER: dvfs001;linux-fsdisk1;WARNING;HARD;4;event_diskmail
[02-24-2005 05:42:34] SERVICE NOTIFICATION: deop00;dvfs001;linux-fsdisk1;WARNING;notify-by-email;DISK WARNING [182453808 kB (10%) free on /dev/sdb1]
[02-24-2005 05:42:34] SERVICE ALERT: dvfs001;linux-fsdisk1;WARNING;HARD;4;DISK WARNING [182453808 kB (10%) free on /dev/sdb1]
BUT the debug text file does NOT show it being called at all (no
text, date, or parameters are present).
(Don't confuse notification email with my diskmail routine, my diskmail 
routine sends mail to the users who are running low on disk space)
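
One difference between the two events is that the disk output contains parentheses and the timeout output does not. The sketch below is only a way to test that idea, assuming Nagios hands the expanded command line to /bin/sh (not verified here for 1.2). The stand-in handler is hypothetical and just reports its argument count.

```shell
#!/bin/sh
# Hypothetical stand-in for the real handler: reports how many arguments
# it received.
handler=$(mktemp)
printf '#!/bin/sh\necho "handler ran with $# args"\n' > "$handler"

# Plugin output copied from the WARNING log entry above.
output='DISK WARNING [182453808 kB (10%) free on /dev/sdb1]'

# Unquoted: the shell meets a bare "(" and reports a syntax error before
# the handler is started at all.
unquoted_status=0
unquoted_result=$(sh -c "sh $handler WARNING HARD 4 dvfs001 linux-fsdisk1 $output" 2>&1) || unquoted_status=$?

# Quoted: the whole output arrives as one sixth argument.
quoted_status=0
quoted_result=$(sh -c "sh $handler WARNING HARD 4 dvfs001 linux-fsdisk1 \"$output\"" 2>&1) || quoted_status=$?

echo "unquoted: status=$unquoted_status"
echo "quoted:   status=$quoted_status result=$quoted_result"
rm -f "$handler"
```

If the unquoted case fails the same way on the Nagios server, that would be consistent with the event log recording the handler while the debug file stays empty, and putting double quotes around $OUTPUT$ in the command definition would be the thing to try.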

Specifics:
Nagios 1.2
1160 hosts, 1396 services

in services.cfg: (I define the service that will use the event handler)
define service{
        use                             linux-service
        name                            linux-fsdisk1
        service_description             linux-fsdisk1
        check_command                   check_nrpe!check_fsdisk1
        event_handler_enabled           1
        event_handler                   event_diskmail
        register                        0
        }
(and I define the event handler itself)
# Service definition for sending email on dvfs00x systems
define command{
        command_name                    event_diskmail
        command_line    /usr/lib/nagios/plugins/event_handler_diskmail $SERVICESTATE$ $STATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$ $SERVICEDESC$ $OUTPUT$
        }

# Service definition

define service{
        use                             generic-service
        name                            linux-service
        is_volatile                     0
        check_period                    24x7
        max_check_attempts              4
        normal_check_interval           20
        retry_check_interval            3
        contact_groups                  ops-escalation-group , dets-escalation-group
        notification_interval           60
        notification_period             24x7
        notification_options            w,c,r
        register                        0
        }
 

in hosts.cfg  (I assign the service to a host system)
# service definition
define service{
     use           linux-fsdisk1
     host_name     dvfs001,dvfs002, (and so on for the rest of the servers)
}

Finally, the head of the event_handler_diskmail file: (extra file 
comments removed here)

#!/bin/sh
# Whenever I test this, I forget to run as NAGIOS, NOT root or myself, check it!
CURRENT_USER=`whoami`
if [ "$CURRENT_USER" != "nagios" ] ;
then
   echo ==========================================  >> /tmp/nagios_event_debug.txt
   echo "_WRONG_ you are running this as  $CURRENT_USER  , you should be _nagios_"
   echo --- WRONG user. it is not _nagios_ user >> /tmp/nagios_event_debug.txt
   echo `whoami`  >> /tmp/nagios_event_debug.txt
   echo =========== Bailing out of script! ===============================  >> /tmp/nagios_event_debug.txt
   exit 255
fi
# Echo parameters that were passed for debugging purposes.
# echo "-------------------------------------------"  >> /tmp/nagios_event_debug.txt
echo `date` >> /tmp/nagios_event_debug.txt
# echo Passed Parameters are:
echo 1 -$1 -SERVICESTATE  >> /tmp/nagios_event_debug.txt
echo 2 -$2 -STATETYPE  >> /tmp/nagios_event_debug.txt
echo 3 -$3 -SERVICEATTEMPT  >> /tmp/nagios_event_debug.txt
echo 4 -$4 -HOSTNAME  >> /tmp/nagios_event_debug.txt
echo 5 -$5 -SERVICEDESC >> /tmp/nagios_event_debug.txt
echo 6-9 -$6 $7 $8 $9 -OUTPUT >> /tmp/nagios_event_debug.txt
echo `whoami`  >> /tmp/nagios_event_debug.txt
echo --------------------------------------------- >> /tmp/nagios_event_debug.txt

(BTW, the script runs perfectly fine if I sudo su nagios, and run the 
script manually with the parameters)
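
The debug header above only records $1 through $9, so anything after the ninth word of the output is lost. A possible variant that logs every argument is sketched below. The function name log_handler_args and the separate log path are assumptions for illustration, not part of the real script.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamp, the calling user, and every
# positional argument to the log file named by the first parameter.
log_handler_args() {
    logfile=$1
    shift
    {
        date
        echo "user: $(id -un)"
        echo "argc: $#"
        i=1
        for arg in "$@"; do
            echo "$i -$arg"
            i=$((i + 1))
        done
        echo "---------------------------------------------"
    } >> "$logfile"
}

# Example call with the values from the timeout event above.
DEBUG_LOG=${DEBUG_LOG:-/tmp/nagios_event_debug_sketch.txt}
log_handler_args "$DEBUG_LOG" CRITICAL SOFT 2 dvfs004 linux-fsdisk1 \
    "CHECK_NRPE: Socket timeout after 25 seconds."
```

Inside the handler the call would be log_handler_args /tmp/nagios_event_debug.txt "$@", which also shows whether the output arrived as one argument or was split into several words.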

HELP, I can't figure out how to get the event handler to be called for 
state changes.
Any help would be greatly appreciated. Thanks.

-- 
Lewis Getschel             | Today is done...
WesternGeco                |     Today was fun...
1625 Broadway              |         Tomorrow is another one.
Denver, CO 80202           |
Direct Phone - 303-389-4407|        -- Dr. Seuss --


