Event handler not being called at all (?) - Solved

Lewis Getschel lgetschel at denver.westerngeco.slb.com
Wed Nov 24 17:49:21 CET 2004


Ok, I feel a little foolish. It came down to a typo.

My service is named "linux-fsdisk1", but the case statement in my 
event_handler script tested for "check_fsdisk1" (the name of the nrpe 
command). That never matches when Nagios calls the script, because 
Nagios passes the service description. From the command line I had typed 
the "incorrect" name (the one the case did look for), so it worked 
manually.
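
For anyone hitting the same thing, here is a minimal sketch of the fix. It 
assumes the fifth argument is $SERVICEDESC$, as in the command definition 
quoted below. The function name handler_action and the strings it prints 
are made up for illustration; this is not the actual script from this 
thread.

```shell
#!/bin/sh
# Hypothetical sketch: pick an action from the service description that
# Nagios passes as $SERVICEDESC$. All names here are illustrative only.
handler_action() {
    servicedesc="$1"
    case "$servicedesc" in
        linux-fsdisk1)
            # Matches service_description in the service definition.
            echo "diskmail fsdisk1"
            ;;
        linux-fsdisk2)
            echo "diskmail fsdisk2"
            ;;
        *)
            # Anything else (for example the nrpe command name
            # "check_fsdisk1") lands here, so a mismatch is visible.
            echo "no match for '$servicedesc'" >&2
            return 1
            ;;
    esac
}
```

A catch-all branch that complains loudly would have exposed the mismatch 
on the first run under Nagios.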

The second issue was that my debugging echo commands couldn't write to 
their /tmp file, because that file was owned by root and the handler 
runs as the nagios user.
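
A debug line is only useful if the user running the handler can write to 
the target file. A small sketch that checks this before writing; the 
function name debug_log and the way the path is passed in are assumptions 
for illustration only.

```shell
#!/bin/sh
# Hypothetical debug helper: append a timestamped message to a log file,
# but only if the current user (nagios, when run by the daemon) may
# write to it.
debug_log() {
    logfile="$1"
    shift
    if [ -e "$logfile" ] && [ ! -w "$logfile" ]; then
        # For example a file created earlier by root during manual tests.
        echo "cannot write to $logfile as $(id -un)" >&2
        return 1
    fi
    echo "$(date '+%Y-%m-%d %H:%M:%S') $*" >> "$logfile"
}
```

Removing the root-owned file, or handing it over to the nagios user, 
lets the echo lines succeed.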

I noticed the mistake halfway through re-writing the whole thing.

Sorry, Lewis

-----
Lewis Getschel  wrote:

All-
After a couple of months of using Nagios, I decided to tackle
event_handlers. I have multiple file servers, and (of course) they get
filled. Using a variation of check_disk with nrpe execution, I've
developed some services for each server's externally attached
drives. (in services.cfg)

# Service definition
define service{
use linux-service ; Name of service template to use
name linux-fsdisk1
service_description linux-fsdisk1
check_command check_nrpe!check_fsdisk1
event_handler_enabled 1
event_handler event_dvfs-diskmail
register 0
}

I specify that my servers use this service (in hosts.cfg)

# service definition
define service{
use linux-fsdisk1
host_name dvfs001,dvfs002,dvfs003
}

This all works fine to check and report the disk drives' usage. Now I
want to define an event handler for the fileserver that sends mail to
the users who have files on the (filling-up) disk (in services.cfg)

define command{
command_name event_dvfs-diskmail
command_line /usr/lib/nagios/plugins/event_handler_dvfs-diskmail $SERVICESTATE$ $STATETYPE$ $SERVICEATTEMPT$ $HOSTADDRESS$ $SERVICEDESC$
}

The event handler itself is a shell script (based on the restart_http).
I added 2 extra fields, the host address and the service description,
so that the "case" decisions act on the correct host (by calling nrpe
commands on the affected host system). This part works fine too, as my
output shows (I'm running as the nagios user):

[nagios@dvws001 plugins]$ whoami
nagios
[nagios@dvws001 plugins]$ /usr/lib/nagios/plugins/event_handler_dvfs-diskmail WARNING SOFT 3 dvfs002 check_fsdisk1
Starting the diskmail routine on dvfs002
OK - Mail sent
[nagios@dvws001 plugins]$
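
The argument handling in a handler like this can be sketched roughly as 
follows. The argument order follows the command_line above. The policy 
(act on the third SOFT attempt and on HARD) is an assumption borrowed from 
the usual sample handler pattern, and the function name should_act is 
hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of the state handling. Arguments, in order:
# service state, state type, attempt number.
should_act() {
    state="$1"
    statetype="$2"
    attempt="$3"
    case "$state" in
        WARNING|CRITICAL)
            case "$statetype" in
                SOFT)
                    # Assumed policy: act once, on the third soft attempt.
                    [ "$attempt" -eq 3 ] && return 0
                    return 1
                    ;;
                HARD)
                    # Problem confirmed after all retries.
                    return 0
                    ;;
                *)
                    return 1
                    ;;
            esac
            ;;
        *)
            # OK, UNKNOWN and anything unexpected: nothing to do.
            return 1
            ;;
    esac
}
```

With this policy the handler fires on SOFT attempt 3 and again on the 
HARD state, so a script that sends mail may want to pick only one of the 
two.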

When I change the warning/critical levels for the disk check so that
Nagios signals warnings, the nagios.log file (newest entries first)
shows the system going from an "OK;HARD" condition into WARNING;SOFT,
and finally WARNING;HARD on the 4th check. (This shows 2 partitions; I
only listed code for 1 above.)

[11-22-2004 13:31:26] SERVICE ALERT:
dvfs002;linux-fsdisk2;OK;HARD;4;DISK OK [176922576 kB (10%) free on
/dev/sdc1]
[11-22-2004 13:31:26] SERVICE ALERT:
dvfs002;linux-fsdisk1;OK;HARD;4;DISK OK [1017545920 kB (59%) free on
/dev/sdb1]
[11-22-2004 13:21:26] SERVICE ALERT:
dvfs002;linux-fsdisk2;WARNING;HARD;4;DISK WARNING [176922576 kB (10%)
free on /dev/sdc1]
[11-22-2004 13:21:26] SERVICE ALERT:
dvfs002;linux-fsdisk1;WARNING;HARD;4;DISK WARNING [1017545920 kB (59%)
free on /dev/sdb1]
[11-22-2004 13:18:26] SERVICE ALERT:
dvfs002;linux-fsdisk2;WARNING;SOFT;3;DISK WARNING [176922576 kB (10%)
free on /dev/sdc1]
[11-22-2004 13:18:26] SERVICE ALERT:
dvfs002;linux-fsdisk1;WARNING;SOFT;3;DISK WARNING [1017545920 kB (59%)
free on /dev/sdb1]
[11-22-2004 13:15:26] SERVICE ALERT:
dvfs002;linux-fsdisk2;WARNING;SOFT;2;DISK WARNING [176922576 kB (10%)
free on /dev/sdc1]
[11-22-2004 13:15:26] SERVICE ALERT:
dvfs002;linux-fsdisk1;WARNING;SOFT;2;DISK WARNING [1017545920 kB (59%)
free on /dev/sdb1]
[11-22-2004 13:12:26] SERVICE ALERT:
dvfs002;linux-fsdisk2;WARNING;SOFT;1;DISK WARNING [176922576 kB (10%)
free on /dev/sdc1]
[11-22-2004 13:12:26] SERVICE ALERT:
dvfs002;linux-fsdisk1;WARNING;SOFT;1;DISK WARNING [1017545920 kB (59%)
free on /dev/sdb1]
[11-22-2004 13:10:26] SERVICE ALERT:
dvfs002;linux-fsdisk1;OK;HARD;4;DISK OK [1017545920 kB (59%) free on
/dev/sdb1]

The notifications for this host show only these notify-by-email entries:
dvfs002 linux-fsdisk2 CRITICAL 11-22-2004 14:27:54 dets05
notify-by-email DISK CRITICAL [176922576 kB (10%) free on /dev/sdc1]
dvfs002 linux-fsdisk2 OK 11-22-2004 12:38:26 dets05 notify-by-email
DISK OK [176922576 kB (10%) free on /dev/sdc1]
dvfs002 linux-fsdisk2 ACKNOWLEDGEMENT (CRITICAL) 11-22-2004 11:45:50
dets05 notify-by-email monitor disk usage

I found some lines in the nagios.log file like this:
nagios.log:[1101158874] SERVICE EVENT HANDLER:
dvfs002;linux-fsdisk2;CRITICAL;HARD;4;event_dvfs-diskmail

Does this say that my "event_handler_dvfs-diskmail" is being executed?
I added some "echo" lines that redirect to a /tmp file, but nothing
shows up there. It really seems that the event_handler is NOT being
executed.
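
As far as I know, a SERVICE EVENT HANDLER entry is what Nagios writes 
when it launches the handler command, so a line like the one above 
suggests the script was started even if it produced no visible output. A 
quick way to list such entries for one service is sketched here; the 
function name handler_runs is hypothetical, and the log path is whatever 
your installation uses.

```shell
#!/bin/sh
# Hypothetical helper: print the event-handler log entries for one
# host/service pair. Arguments: log file, host name, service description.
handler_runs() {
    logfile="$1"
    host="$2"
    service="$3"
    # -F treats the pattern as a fixed string, not a regular expression.
    grep -F "SERVICE EVENT HANDLER: $host;$service;" "$logfile"
}
```

If entries show up here but the script seems to do nothing, the problem 
is more likely inside the script (arguments, permissions) than in the 
Nagios configuration.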

All the references in prior postings keep telling people "don't run
scripts as root, run it as your nagios user". I _THINK_ I've taken that
into account, but still can't get anything to come out from my scripts.

Needless to say, I need Help (!) Please.

Thanks, Lewis

-- 
Lewis Getschel             | Today is done...
WesternGeco                |     Today was fun...
1625 Broadway              |         Tomorrow is another one.
Denver, CO 80202           |
Direct Phone - 303-389-4407|        -- Dr. Seuss --


