Event Handlers Problem

Thomas Beecher tbeecher at localnet.com
Thu May 19 14:54:28 CEST 2005


Problem solved!!!!

For whatever reason, I had to add /usr/bin/perl to the command definition. So, 
instead of this:

$USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$

it has to be:

/usr/bin/perl $USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$

This works flawlessly.
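
A minimal sketch of how the working call could be wrapped so that a failing handler leaves a trace. The function name run_handler and its log-file argument are hypothetical, not part of Nagios; the idea is only to record the arguments, everything the script prints (stderr included), and its exit status:

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): run an event handler through an
# explicit interpreter and record what happened in a log file.
# Usage: run_handler LOGFILE INTERPRETER SCRIPT [ARGS...]
run_handler() {
    log=$1
    interp=$2
    script=$3
    shift 3

    # Record the arguments first, so the log shows the call even if the
    # script itself never produces output.
    printf 'args: %s\n' "$*" >> "$log"

    # Capture stderr as well as stdout; an interpreter error would
    # otherwise be lost.
    "$interp" "$script" "$@" >> "$log" 2>&1
    status=$?

    printf 'exit: %s\n' "$status" >> "$log"
    return "$status"
}
```

Called from a handler script as, for example, run_handler /tmp/testing /usr/bin/perl /usr/local/nagios/libexec/restart_pm3.pl "$1" "$2", an interpreter error would end up in /tmp/testing instead of disappearing.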

It doesn't make a whole lot of sense to me why this is required. The nagios user 
can run any other perl script on the system that it has permissions for, and all 
have /usr/bin/perl defined as the interpreter path, so it can't be a 
permissions issue with the perl interpreter. Maybe it has something to do with this 
copy of Nagios having been compiled with the embedded perl module.
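
One way to double-check the interpreter side is to read the path off the script's first line and test it while logged in as the nagios user. A rough sketch, with shebang_interpreter being a made-up helper name; it says nothing about how an embedded-perl build of Nagios picks its interpreter, it only confirms what a direct execution of the script would use:

```shell
#!/bin/sh
# Hypothetical check: print the interpreter named on a script's shebang
# line; the return status tells whether the current user can execute it.
# Usage: shebang_interpreter SCRIPT
shebang_interpreter() {
    first=$(head -n 1 "$1") || return 1

    case $first in
        '#!'*) ;;
        *) echo "no shebang line in $1" >&2; return 1 ;;
    esac

    # Drop the "#!" prefix, then keep only the first word, which is the
    # interpreter path (any options such as -w follow it).
    rest=${first#'#!'}
    set -- $rest
    printf '%s\n' "$1"
    [ -x "$1" ]
}
```

Run against restart_pm3.pl as the nagios user, this should print /usr/bin/perl and return success if direct execution is not the problem.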

Either way, I thought I'd post this here in case someone else has similar issues.

Thomas Beecher II
Network Administrator
LocalNet Corp.
tbeecher at localnet.com

Thomas Beecher wrote:
> Well, to answer some questions that have been posed.
> 
> 1. My testing was taking place on a 2.0b3 installation of Nagios. I was, 
> however, able to replicate the behavior on a 1.1 install, our production 
> instance, on a separate box.
> 
> 2. I tested the perl script as the nagios user to make sure it would 
> actually run as expected, and it did.
> 
> 3. I have tried to call the script from inside a shell wrapper, to no 
> avail. The shell script works fine if I call it directly (again, tested 
> as myself and as the nagios user). I tested it naked, with no params, 
> with constants, and with macro variables, all nada. In all cases, if I 
> tee the output to a dummy file to see what the output is, it's ONLY the 
> arguments that come out, never anything from the script.
> 
> So, in essence, I'm still stuck. I'm cheating right now, taking the only 
> output I can get (the macros), dumping that to a file, and running 
> another script as a cron job that parses the file holding the macro 
> variables, then calls the perl script. I know, it's a roundabout way 
> of doing it, but it accomplishes the same task. I don't plan to try that 
> on our production copy of nagios, only because there are already enough 
> cron jobs running every 5 minutes, but thought I'd toss it out there in 
> case someone else wants to try that for themselves.
> 
> Thanks to everyone for all your help, if I ever get it working the 
> correct way I'll be sure to post again and let you know.
> 
> 
> Thomas Beecher II
> Network Administrator
> LocalNet, Inc
> tbeecher at localnet.com
> 
> Lewis Getschel wrote:
> 
>> I'll chip in my 3 cents on this topic...
>>
>> I don't see which version of Nagios you are running. My comments here 
>> are based on my Nagios 1.2 server.
>>
>> I found 2 different issues that prevented my event_handlers from running:
>> 1)
>> When I tackled event_handlers I had written as a tcsh script, I found a 
>> similar result to yours: it LOOKED like it was being called (from the event 
>> log), but even a simple "echo $1 >/tmp/nagios_debug.txt" didn't do 
>> anything. I finally tracked it down to (on MY system at least) Nagios 
>> not executing /bin/tcsh scripts! I rewrote it for /bin/sh instead, and 
>> it started to work.
>>
>> 2)
>> This one had me baffled for 3 months (!).
>> My event handler command called for sending 6 variables 
>> (command_line    $USER1$/event_handler_diskmail.original  
>> $SERVICESTATE$ $STATETYPE$ $SERVICEATTEMPT$ $HOSTNAME$ $SERVICEDESC$ 
>> $OUTPUT$). I had similar result, it didn't run.
>> I "solved" it when:
>> I changed it to 1 parameter, the script ran.
>> I added all 6, it stopped again.
>> I changed it to just 2, it ran again
>> I changed it to all 6, it stopped (see the pattern here <smirk>)
>> I ended up building it up 1 parameter at a time until I got to 5. It ran.
>> When I tried adding the output, it stopped again. I gave up at that 
>> point and left it at 5 (originally 6 months ago it did work with all 
>> 6, then it mysteriously stopped), my event handler doesn't handle 1 
>> particular case now (I can live with that for now).
>>
>> Try changing your parameters to constants (that was my first hint when 
>> troubleshooting), e.g. 1 2 3, and see if the script gets the constants at 
>> least.
>>
>> My 3 cents, FWIW
>> Lewis
>>
>>
>> Thomas Beecher wrote:
>>
>>> I am restarting the service after every change, so that's not an 
>>> issue. Learned that the hard way about 6 months ago...;-p
>>>
>>> When I change to this:
>>>
>>> command_line    /bin/echo "$HOSTNAME$ $HOSTSTATE$" > /tmp/testing
>>>
>>> I do get the appropriate variables passed to the temp file. So it 
>>> would appear that the event handler is at least being called, that's 
>>> one less thing to look at!!
>>>
>>> This is where I'm at now. If I do this:
>>>
>>> /usr/local/nagios/libexec/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$ >> /tmp/testing
>>>
>>> from a command line, I get the proper output from the script. 
>>> (Normally there won't be any, but I inserted some for troubleshooting.) If 
>>> I call this from the event_handler, I only get the macro variables, 
>>> nothing from the script.
>>>
>>>
>>> Thomas Beecher II
>>> Network Administrator
>>> LocalNet, Inc
>>> tbeecher at localnet.com
>>>
>>> Marc Powell wrote:
>>>
>>>>
>>>>> -----Original Message-----
>>>>> From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
>>>>> admin at lists.sourceforge.net] On Behalf Of Thomas Beecher
>>>>> Sent: Wednesday, May 04, 2005 10:57 AM
>>>>> To: nagios-users at lists.sourceforge.net
>>>>> Subject: Re: [Nagios-users] Event Handlers Problem
>>>>>
>>>>> Well, that was a serious brain fart on my part!!
>>>>>
>>>>> I moved the script to /usr/local/nagios/libexec/, and changed
>>>>> checkcommands.cfg to show:
>>>>>
>>>>> define command{
>>>>>         command_name    restart_pm3
>>>>>         command_line    $USER1$/restart_pm3.pl $HOSTNAME$ $HOSTSTATE$
>>>>>         }
>>>>>
>>>>> $USER1$ is defined in resource.cfg as
>>>>>
>>>>> $USER1$=/usr/local/nagios/libexec
>>>>>
>>>>> Permissions on the file are:
>>>>>
>>>>> -rwxr-xr-x  1 nagios   nagios   1701 2005-05-04 10:47 restart_pm3.pl
>>>>>
>>>>> I've changed the ownership to nagios/nagios to prevent any other
>>>>> potential permission issues.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> All Good.
>>>>
>>>>
>>>>> This returns the following:
>>>>>
>>>>> [1115221615] HOST ALERT: buftest;DOWN;SOFT;1;Telnet: CRITICAL - Socket timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1 ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>>>> [1115221615] HOST EVENT HANDLER: buftest;DOWN;SOFT;1;restart_pm3
>>>>> [1115221625] HOST ALERT: buftest;DOWN;SOFT;2;Telnet: CRITICAL - Socket timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1 ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>>>> [1115221625] HOST EVENT HANDLER: buftest;DOWN;SOFT;2;restart_pm3
>>>>> [1115221634] HOST ALERT: buftest;DOWN;SOFT;3;Telnet: CRITICAL - Socket timeout after 1 seconds<br>SNMP: CRITICAL: snmpget returned errors: 1 ( )<br>PING: CRITICAL - Host Unreachable (10.0.2.152)
>>>>> [1115221634] HOST EVENT HANDLER: buftest;DOWN;SOFT;3;restart_pm3
>>>>>
>>>>> It doesn't error out, and seems to call the script, but it still doesn't
>>>>> run.
>>>>
>>>> Did you remember to restart Nagios after making the config changes?
>>>>
>>>>
>>>>> I have not tested the script as the Nagios user, however the front of
>>>>> the script is set to dump whatever args get passed to it out to a file
>>>>> before doing anything else, so if it was choking somewhere in the script
>>>>> it would still be logged that it ran.
>>>>
>>>> Testing as the user the script is running as should be done anyway.
>>>> Often times it shows other, less obvious issues like required libraries
>>>> not being readable or executable by that user. I'd also suggest
>>>> simplifying things greatly by making your event handler  --
>>>>
>>>> command_line    /bin/echo "$HOSTNAME$ $HOSTSTATE$" > /tmp/testing
>>>>
>>>> Just to make sure that it's getting run (I'm 99% sure that'll work as I
>>>> expect ;) ).
>>>>
>>>> -- 
>>>> Marc
>>>
>>>
>>>
>>
> 
> 
> 
> -------------------------------------------------------
> This SF.Net email is sponsored by Oracle Space Sweepstakes
> Want to be the first software developer in space?
> Enter now for the Oracle Space Sweepstakes!
> http://ads.osdn.com/?ad_id=7393&alloc_id=16281&op=click
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when 
> reporting any issue. ::: Messages without supporting info will risk 
> being sent to /dev/null


