Service event handler script not activating

Darren Spruell phatbuckett at gmail.com
Sat Nov 15 02:22:59 CET 2008


I had success with the first service eventhandler we implemented and
am now trying to duplicate it for a second service (with a slightly
modified eventhandler script) and failing. The eventhandler that is
failing is for the "Argus Daemon" service on host quagmire, and the
failure is that although the logs show the event handler being called
with expected arguments, the command executed from the eventhandler
script (an SSH connection to the target host) is never observed. If we
call the eventhandler script manually with the same expected
arguments, it operates properly (SSH connection occurs and remote
service is started).

Running Nagios 2.10 (nagios-2.10-3.fc7) on Fedora 7 GNU/Linux. SELinux
is enabled but in permissive mode (not enforcing).

# /etc/nagios/nagios.cfg
log_event_handlers=1
event_handler_timeout=30
enable_event_handlers=1

# /etc/nagios/definitions.cfg (object configuration file)
define service{
        name                            generic-service         ; The 'name' of this service template
        ...
        event_handler_enabled           1                       ; Service event handler is enabled
        register                        0
        }

define service{
        use                             ti-service
        host_name                       quagmire
        service_description             Argus Daemon
        check_command                   check_nrpe!check_proc_argus
        event_handler                   handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
        }

# /etc/nagios/commands.cfg
define command{
        command_name    handler_restart_service_openbsd
        command_line    $USER1$/eventhandlers/service-restart-openbsd.sh $HOSTADDRESS$ $ARG2$ $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $ARG1$
        }
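
To make the argument order explicit: Nagios splits the event_handler
value on "!", so the first field is the command name, the second becomes
$ARG1$ and the third becomes $ARG2$. The sketch below only mimics that
split in plain shell; split_handler is a made-up name for illustration
and is not part of Nagios.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nagios): mimic how an event_handler
# value of the form name!arg1!arg2 maps to command name, $ARG1$, $ARG2$.
split_handler() {
    old_ifs=$IFS
    IFS='!'
    # Unquoted on purpose so the value is split on "!" only.
    set -- $1
    IFS=$old_ifs
    echo "command_name=$1"
    echo "ARG1=$2"
    echo "ARG2=$3"
}

split_handler 'handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe'
```

With the command_line above, the script therefore receives $ARG2$ (the
SSH user) as its second argument and $ARG1$ (the remote script path) as
its sixth.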

# service-restart-openbsd.sh:
$ ls -lZ /usr/lib64/nagios/plugins/eventhandlers/
-rwxr-xr-x  root root system_u:object_r:bin_t          service-restart-linux.sh
-rwxr-xr-x  root root system_u:object_r:bin_t          service-restart-openbsd.sh

----- snip -----
#!/bin/sh
#
#    $Id$
#
# Event handler script for restarting a service. The idea of a "service" on
# OpenBSD doesn't really work as it doesn't use a SysV init but a monolithic
# rc. For this reason we call a script on the remote server and don't
# parameterize paths to an init script in this handler.
#
# [Attribution] taken from example in Nagios documentation at:
# http://nagios.sourceforge.net/docs/2_0/eventhandlers.html
#
# Note: This script will only restart the service if the service is
#       retried 3 times (in a "soft" state) or if the service somehow
#       manages to fall into a "hard" error state.
#
# Host to connect to
DST_HOST="$1"
# User to connect as via SSH
DST_USER="$2"
# Service state (OK, WARNING, etc.)
SVC_STATE="$3"
# Service type (SOFT, HARD, etc.)
SVC_STATE_TYPE="$4"
# Service attempt (3, 4, etc.)
SVC_STATE_ATTEMPT="$5"
# Script name (full path.)
SVC_NAME="$6"

case "$SVC_STATE" in
    # Only deal with services that have dropped to CRITICAL state.
    CRITICAL)
        case "$SVC_STATE_TYPE" in
            # SOFT failures we deal with once it becomes apparent that
            # the failure is definite (on the third failure, before
            # notifications are sent out.)
            SOFT)
                case "$SVC_STATE_ATTEMPT" in
                    3)
                        /usr/bin/ssh -tt -l "$DST_USER" "$DST_HOST" sudo "$SVC_NAME"
                        ;;
                esac
                ;;
            HARD)
                # If we hit a HARD failure, attempt to deal with it one
                # last time.
                /usr/bin/ssh -tt -l "$DST_USER" "$DST_HOST" sudo "$SVC_NAME"
                ;;
        esac
        ;;
esac

# Eventhandlers should always exit successfully, apparently.
exit 0
----- /snip -----
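
For clarity, the decision logic of the handler above can be isolated
from the SSH call. This is only a sketch for checking which
state/type/attempt combinations lead to a restart attempt; would_restart
is a hypothetical name and nothing here talks to the remote host.

```shell
#!/bin/sh
# Hypothetical helper: report whether the handler's case logic would
# attempt a restart for a given state, state type, and attempt number.
would_restart() {
    state=$1
    state_type=$2
    attempt=$3
    case "$state" in
        CRITICAL)
            case "$state_type" in
                # SOFT: act only on the third check attempt.
                SOFT) [ "$attempt" = 3 ] && { echo yes; return 0; } ;;
                # HARD: always make one last attempt.
                HARD) echo yes; return 0 ;;
            esac
            ;;
    esac
    echo no
}

for entry in "CRITICAL SOFT 1" "CRITICAL SOFT 2" "CRITICAL SOFT 3" "CRITICAL HARD 4"; do
    # Unquoted on purpose so the entry splits into three arguments.
    echo "$entry -> $(would_restart $entry)"
done
```

By this logic only the SOFT/3 and HARD invocations should reach the
ssh command; the SOFT/1 and SOFT/2 invocations are expected to do
nothing.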

Here are the logs showing detection of the service reaching the two
states the eventhandler script should activate on (SOFT/3 and HARD):

[1226710816] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;1;PROCS CRITICAL: 0 processes with command name 'argus'
[1226710816] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;1;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
[1226710876] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;2;PROCS CRITICAL: 0 processes with command name 'argus'
[1226710876] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;2;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
[1226710936] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;3;PROCS CRITICAL: 0 processes with command name 'argus'
[1226710936] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;3;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
[1226710996] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;HARD;4;PROCS CRITICAL: 0 processes with command name 'argus'
[1226710996] SERVICE NOTIFICATION: ti;quagmire;Argus Daemon;CRITICAL;notify-by-email;PROCS CRITICAL: 0 processes with command name argus
[1226710996] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;HARD;4;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe

I can manually invoke it and have it succeed:

$ su - nagios
$ sh -x /usr/lib64/nagios/plugins/eventhandlers/service-restart-openbsd.sh quagmire.local _nrpe CRITICAL HARD 4 start-argus
+ DST_HOST=quagmire.local
+ DST_USER=_nrpe
+ SVC_STATE=CRITICAL
+ SVC_STATE_TYPE=HARD
+ SVC_STATE_ATTEMPT=4
+ SVC_NAME=start-argus
+ case "$SVC_STATE" in
+ case "$SVC_STATE_TYPE" in
+ /usr/bin/ssh -tt -l _nrpe quagmire.local sudo /usr/local/bin/start-argus
Connection to quagmire.local closed.
+ exit 0

When invoked manually I see an sshd log entry for the connection on
the remote end and a sudo log entry for the command execution. When
Nagios kicks off the eventhandler, neither log entry appears on
the remote side.
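
One way to narrow this down might be to record what the handler
actually does when Nagios runs it, since the daemon's environment (user,
HOME, no controlling terminal) can differ from an interactive
"su - nagios" shell. A minimal sketch, assuming a log path the nagios
user can write to; run_logged and the /tmp path are made-up names for
illustration, not anything Nagios provides.

```shell
#!/bin/sh
# Hypothetical debugging helper: run a command and append its
# stdout/stderr and exit status to a log file, along with who ran it.
HANDLER_LOG=${HANDLER_LOG:-/tmp/eventhandler-debug.log}   # assumed path

run_logged() {
    {
        echo "=== $(date) user=$(id -un) HOME=${HOME:-unset} args: $*"
        "$@"
        echo "exit status: $?"
    } >>"$HANDLER_LOG" 2>&1
}

# Possible use inside the handler, in place of the bare ssh call:
#   run_logged /usr/bin/ssh -tt -l "$DST_USER" "$DST_HOST" sudo "$SVC_NAME"
```

Comparing the log from a Nagios-triggered run against a manual run
should at least show whether ssh is being executed at all and what it
prints on stderr.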

Any clue where else to look?

-- 
Darren Spruell
phatbuckett at gmail.com
