Event Handlers Problem

Lewis Getschel lgetschel at denver.westerngeco.slb.com
Thu May 5 01:25:38 CEST 2005


I'll chip in my 3 cents on this topic...

I don't see which version of Nagios you are running. My comments here
are based on my Nagios 1.2 server.

I found 2 different issues that prevented my event handlers from running:
1)
When I tackled event handlers that I had written as tcsh scripts, I got a
result similar to yours: it LOOKED like the handler was being called
(according to the event log), but even a simple
"echo $1 >/tmp/nagios_debug.txt" didn't do anything. I finally tracked it
down: on MY system at least, Nagios wouldn't execute /bin/tcsh scripts!
I rewrote the handler for /bin/sh instead, and it started to work.
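
A minimal sketch of a /bin/sh handler skeleton along these lines. The
function name log_args and the DEBUG_LOG variable are made up for
illustration; the default path is the one from the echo test above. It
records every argument before doing any real work, so a later failure
still leaves a trace:

```shell
#!/bin/sh
# Hypothetical debug helper: record every argument the handler received.
# DEBUG_LOG defaults to the file used in the echo test above.
DEBUG_LOG="${DEBUG_LOG:-/tmp/nagios_debug.txt}"

log_args() {
    # Write the argument count, then each argument on its own line,
    # so empty or missing parameters are easy to spot.
    printf 'argc=%s\n' "$#" >> "$DEBUG_LOG"
    i=1
    for arg in "$@"; do
        printf 'arg%s=[%s]\n' "$i" "$arg" >> "$DEBUG_LOG"
        i=$((i + 1))
    done
}

# Log first, before any real work is attempted.
log_args "$@"
```

Running it by hand as the nagios user and then reading the debug file
should show whether the script itself is the problem.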

2)
This one had me baffled for 3 months (!).
My event handler command passed 6 macros (command_line
$USER1$/event_handler_diskmail.original  $SERVICESTATE$ $STATETYPE$
$SERVICEATTEMPT$ $HOSTNAME$ $SERVICEDESC$ $OUTPUT$). I got a similar
result: it didn't run.
I "solved" it like this:
I changed it to 1 parameter, and the script ran.
I added all 6, and it stopped again.
I changed it to just 2, and it ran again.
I changed it to all 6, and it stopped (see the pattern here <smirk>).
I ended up building it up 1 parameter at a time until I got to 5. It ran.
When I tried adding the output ($OUTPUT$), it stopped again. I gave up at
that point and left it at 5. (Originally, 6 months ago, it did work with
all 6; then it mysteriously stopped.) My event handler doesn't handle 1
particular case now, but I can live with that for now.
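
One possible explanation, not confirmed anywhere in this thread: the
redirect test quoted below shows that the command line is interpreted by
a shell, and the plugin output in the quoted log lines contains
parentheses and <br>. Passed unquoted, text like that can make the shell
reject the whole line before the handler starts. The sketch below
reproduces the effect with plain sh; the sample output string and the
other argument values are made up, and echo stands in for the handler:

```shell
#!/bin/sh
# Sample plugin output with shell metacharacters (made up for illustration).
output='DISK CRITICAL (/var 98% full)'

# Unquoted: the shell sees bare parentheses and rejects the whole line,
# so the command (here just echo) never runs.
if sh -c "echo CRITICAL HARD 3 myhost disk $output" >/dev/null 2>&1; then
    unquoted=ran
else
    unquoted=failed
fi

# Double-quoted: the same text arrives as a single argument.
if sh -c "echo CRITICAL HARD 3 myhost disk \"$output\"" >/dev/null 2>&1; then
    quoted=ran
else
    quoted=failed
fi

echo "unquoted: $unquoted, quoted: $quoted"
```

If this is what is happening, putting double quotes around $OUTPUT$ in
the command_line may help. It is not a complete fix, since output that
itself contains quote characters, backticks, or dollar signs can still
be misread by the shell.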

Try changing your parameters to constants such as 1 2 3 (that was my
first troubleshooting hint) and see whether the script at least receives
the constants.
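
The build-it-up-one-parameter-at-a-time test can also be run outside
Nagios. This is only a sketch: the stand-in handler, the temporary
directory, and calls.log are hypothetical, and running the command
through sh -c merely approximates how a shell-interpreted command_line
would be started:

```shell
#!/bin/sh
# Hypothetical stand-in for the real handler: it only records how many
# arguments it received, one line per call.
workdir=$(mktemp -d)
handler="$workdir/handler.sh"
cat > "$handler" <<'EOF'
#!/bin/sh
echo "$#" >> "$(dirname "$0")/calls.log"
EOF

# Call the handler with 1, 2, ... 6 constant arguments, the way the
# command_line would look with the macros replaced by 1 2 3 ...
args=""
n=1
while [ "$n" -le 6 ]; do
    args="$args $n"
    sh -c "sh $handler$args" || echo "handler failed with $n argument(s)"
    n=$((n + 1))
done

# Each line is the argument count seen by one successful call.
cat "$workdir/calls.log"
```

If every count shows up with constants but a call disappears once a real
macro is substituted, the content of that macro is the place to look.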

My 3 cents, FWIW
Lewis


Thomas Beecher wrote:

> I am restarting the service after every change, so that's not an 
> issue. Learned that the hard way about 6 months ago...;-p
>
> When I change to this:
>
> command_line    /bin/echo "$HOSTNAME$ $HOSTSTATE$" > /tmp/testing
>
> I do get the appropriate variables passed to the temp file. So it 
> would appear that the event handler is at least being called, that's 
> one less thing to look at!!
>
> This is where I'm at now. If I do this:
>
> /usr/local/nagios/libexec/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$ >> 
> /tmp/testing
>
> from a command line, I get the proper output for the script. (normally 
> won't be any, but I inserted some for troubleshooting.) If I call this 
> from the event_handler, I only get the macro variables, nothing from 
> the  script.
>
>
> Thomas Beecher II
> Network Administrator
> LocalNet, Inc
> tbeecher at localnet.com
>
> Marc Powell wrote:
>
>>
>>> -----Original Message-----
>>> From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
>>> admin at lists.sourceforge.net] On Behalf Of Thomas Beecher
>>> Sent: Wednesday, May 04, 2005 10:57 AM
>>> To: nagios-users at lists.sourceforge.net
>>> Subject: Re: [Nagios-users] Event Handlers Problem
>>>
>>> Well, that was a serious brain fart on my part!!
>>>
>>> I moved the script to /usr/local/nagios/libexec/, and changed
>>> checkcommands.cfg to show:
>>>
>>> define command{
>>>         command_name    restart_pm3
>>>         command_line    $USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$
>>>         }
>>>
>>> $USER1$ is defined in resource.cfg as
>>>
>>> $USER1$=/usr/local/nagios/libexec
>>>
>>> Permissions on the file are:
>>>
>>> -rwxr-xr-x  1 nagios   nagios   1701 2005-05-04 10:47 restart_pm3.pl
>>>
>>> I've changed the ownership to nagios/nagios to prevent any other
>>> potential permission issues.
>>
>>
>>
>> All Good.
>>
>>
>>> This returns the following:
>>>
>>> [1115221615] HOST ALERT: buftest;DOWN;SOFT;1;Telnet: CRITICAL - Socket
>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1 (
>>>  )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>> [1115221615] HOST EVENT HANDLER: buftest;DOWN;SOFT;1;restart_pm3
>>> [1115221625] HOST ALERT: buftest;DOWN;SOFT;2;Telnet: CRITICAL - Socket
>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1 (
>>>  )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>> [1115221625] HOST EVENT HANDLER: buftest;DOWN;SOFT;2;restart_pm3
>>> [1115221634] HOST ALERT: buftest;DOWN;SOFT;3;Telnet: CRITICAL - Socket
>>> timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1 (
>>>  )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>> [1115221634] HOST EVENT HANDLER: buftest;DOWN;SOFT;3;restart_pm3
>>>
>>> It doesn't error out, and seems to call the script, but it still doesn't
>>> run.
>>
>>
>>
>> Did you remember to restart Nagios after making the config changes?
>>
>>
>>> I have not tested the script as the Nagios user, however the front of
>>> the script is set to dump whatever args get passed to it out to a file
>>> before doing anything else, so if it was choking somewhere in the script
>>> it would still be logged that it ran.
>>
>>
>>
>> Testing as the user the script is running as should be done anyway.
>> Often times it shows other, less obvious issues like required libraries
>> not being readable or executable by that user. I'd also suggest
>> simplifying things greatly by making your event handler  --
>>
>> command_line    /bin/echo "$HOSTNAME$ $HOSTSTATE$" > /tmp/testing
>>
>> Just to make sure that it's getting run (I'm 99% sure that'll work as I
>> expect ;) ).
>>
>> -- 
>> Marc
>

-- 
Lewis Getschel             | Today is done...
WesternGeco                |     Today was fun...
1625 Broadway              |         Tomorrow is another one.
Denver, CO 80202           |
Direct Phone - 303-389-4407|        -- Dr. Seuss --


