Service event handler script not activating

Dennis Hünseler dennis at huenseler.net
Sat Nov 15 08:08:20 CET 2008


Hi Darren,

Did you test the commands on the command line as root or as the nagios user?
I suspect your problem lies with the user that runs the commands.
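
A minimal sketch of how you could test this, under my assumption (not
checked against the Nagios source) that the daemon starts event handlers
without a terminal and without your login environment, so variables such as
SSH_AUTH_SOCK from an interactive session are missing. The helper name
run_like_daemon is made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: run a command roughly the way a daemon would,
# with an almost empty environment and no terminal on stdin.
run_like_daemon() {
    env -i PATH=/usr/bin:/bin "$@" </dev/null
}

# Harmless demonstration: show whether an SSH agent socket is visible
# to the command once the environment has been stripped.
run_like_daemon sh -c 'echo "agent socket: ${SSH_AUTH_SOCK:-unset}"'
```

As the nagios user you would then pass the handler script and the arguments
from your log to run_like_daemon instead of calling the script directly. If
the SSH connection only fails this way, the difference is in the environment
and not in the script.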

Regards, Dennis

On Fri, 14 Nov 2008 18:22:59 -0700, "Darren Spruell" <phatbuckett at gmail.com> wrote:
> I had success with the first service eventhandler we implemented and
> am now trying to duplicate it for a second service (with a slightly
> modified eventhandler script) and failing. The eventhandler that is
> failing is for the "Argus Daemon" service on host quagmire, and the
> failure is that although the logs show the event handler being called
> with expected arguments, the command executed from the eventhandler
> script (an SSH connection to the target host) is never observed. If we
> call the eventhandler script manually with the same expected
> arguments, it operates properly (SSH connection occurs and remote
> service is started).
> 
> Running Nagios 2.10 (nagios-2.10-3.fc7) on Fedora 7 GNU/Linux. SELinux
> enabled but set to not enforce.
> 
> # /etc/nagios/nagios.cfg
> log_event_handlers=1
> event_handler_timeout=30
> enable_event_handlers=1
> 
> # /etc/nagios/definitions.cfg (object configuration file)
> define service{
>         name                            generic-service         ; The 'name' of this service template
>         ...
>         event_handler_enabled           1                       ; Service event handler is enabled
>         register                        0
>         }
> 
> define service{
>         use                             ti-service
>         host_name                       quagmire
>         service_description             Argus Daemon
>         check_command                   check_nrpe!check_proc_argus
>         event_handler                   handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
>         }
> 
> # /etc/nagios/commands.cfg
> define command{
>         command_name    handler_restart_service_openbsd
>         command_line    $USER1$/eventhandlers/service-restart-openbsd.sh $HOSTADDRESS$ $ARG2$ $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$ $ARG1$
>         }
> 
> # service-restart-openbsd.sh:
> $ ls -lZ /usr/lib64/nagios/plugins/eventhandlers/
> -rwxr-xr-x  root root system_u:object_r:bin_t          service-restart-linux.sh
> -rwxr-xr-x  root root system_u:object_r:bin_t          service-restart-openbsd.sh
> 
> ----- snip -----
> #!/bin/sh
> #
> #    $Id$
> #
> # Event handler script for restarting a service. The idea of a "service"
> # on OpenBSD doesn't really work as it doesn't use a SysV init but a
> # monolithic rc. For this reason we call a script on the remote server
> # and don't parameterize paths to an init script in this handler.
> #
> # [Attribution] taken from example in Nagios documentation at:
> # http://nagios.sourceforge.net/docs/2_0/eventhandlers.html
> #
> # Note: This script will only restart the service if the service is
> #       retried 3 times (in a "soft" state) or if the service somehow
> #       manages to fall into a "hard" error state.
> #
> # Host to connect to
> DST_HOST="$1"
> # User to connect as via SSH
> DST_USER="$2"
> # Service state (OK, WARNING, etc.)
> SVC_STATE="$3"
> # Service type (SOFT, HARD, etc.)
> SVC_STATE_TYPE="$4"
> # Service attempt (3, 4, etc.)
> SVC_STATE_ATTEMPT="$5"
> # Script name (full path.)
> SVC_NAME="$6"
> 
> case "$SVC_STATE" in
>     # Only deal with services that have dropped to CRITICAL state.
>     CRITICAL)
>         case "$SVC_STATE_TYPE" in
>             # SOFT failures we deal with once it becomes apparent that
>             # the failure is definite (on the third failure, before
>             # notifications are sent out.)
>             SOFT)
>                 case "$SVC_STATE_ATTEMPT" in
>                     3)
>                         /usr/bin/ssh -tt -l $DST_USER $DST_HOST sudo $SVC_NAME
>                         ;;
>                 esac
>                 ;;
>             HARD)
>                 # If we hit a HARD failure, attempt to deal with it one
>                 # last time.
>                 /usr/bin/ssh -tt -l $DST_USER $DST_HOST sudo $SVC_NAME
>                 ;;
>         esac
>         ;;
> esac
> 
> # Eventhandlers should always exit successfully, apparently.
> exit 0
> ----- /snip -----
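
Since the script discards nothing but also records nothing, one way to see
what ssh does when Nagios calls it is to log the command and its output. A
rough sketch; the log path and the function name log_run are my own
placeholders, not part of Nagios:

```shell
#!/bin/sh
# Hypothetical debug log; any path writable by the nagios user would do.
HANDLER_LOG="${HANDLER_LOG:-/tmp/eventhandler-debug.log}"

# Run a command and append to the log: who ran it, what was run,
# everything it printed (stdout and stderr) and its exit status.
log_run() {
    {
        echo "--- $(date) uid=$(id -u) cmd: $*"
        "$@"
        status=$?
        echo "--- exit status: $status"
    } >>"$HANDLER_LOG" 2>&1
    return "$status"
}
```

In the handler you would put log_run in front of both /usr/bin/ssh lines and
leave the rest unchanged. After the next CRITICAL state the log should show
whether ssh ran at all, under which uid, and what it complained about.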
> 
> Here's the logs showing detection of the service reaching the two
> states the eventhandler script should activate on (SOFT/3 and HARD):
> 
> [1226710816] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;1;PROCS CRITICAL: 0 processes with command name 'argus'
> [1226710816] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;1;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
> [1226710876] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;2;PROCS CRITICAL: 0 processes with command name 'argus'
> [1226710876] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;2;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
> [1226710936] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;SOFT;3;PROCS CRITICAL: 0 processes with command name 'argus'
> [1226710936] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;SOFT;3;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
> [1226710996] SERVICE ALERT: quagmire;Argus Daemon;CRITICAL;HARD;4;PROCS CRITICAL: 0 processes with command name 'argus'
> [1226710996] SERVICE NOTIFICATION: ti;quagmire;Argus Daemon;CRITICAL;notify-by-email;PROCS CRITICAL: 0 processes with command name argus
> [1226710996] SERVICE EVENT HANDLER: quagmire;Argus Daemon;CRITICAL;HARD;4;handler_restart_service_openbsd!/usr/local/bin/start-argus!_nrpe
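
To double-check which of these four calls should actually open an SSH
connection, here is a small dry run of the same case logic as in your
script. It only prints a decision and never calls ssh; the function name
would_restart is made up for this example:

```shell
#!/bin/sh
# Same decision logic as the handler, but it only reports what it
# would do instead of opening an SSH connection.
would_restart() {
    state="$1" state_type="$2" attempt="$3"
    case "$state" in
        CRITICAL)
            case "$state_type" in
                SOFT) [ "$attempt" = "3" ] && { echo restart; return 0; } ;;
                HARD) echo restart; return 0 ;;
            esac
            ;;
    esac
    echo skip
}

# The four handler invocations from the log excerpt.
for args in "CRITICAL SOFT 1" "CRITICAL SOFT 2" "CRITICAL SOFT 3" "CRITICAL HARD 4"; do
    # Word splitting of $args is intentional here.
    echo "$args -> $(would_restart $args)"
done
```

This reports "skip" for the first two calls and "restart" for SOFT/3 and
HARD/4, so with the script as posted you should expect two SSH connections
on the remote side, at 1226710936 and 1226710996.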
> 
> I can manually invoke it and have it succeed:
> 
> $ su - nagios
> $ sh -x /usr/lib64/nagios/plugins/eventhandlers/service-restart-openbsd.sh quagmire.local _nrpe CRITICAL HARD 4 start-argus
> + DST_HOST=quagmire.local
> + DST_USER=_nrpe
> + SVC_STATE=CRITICAL
> + SVC_STATE_TYPE=HARD
> + SVC_STATE_ATTEMPT=4
> + SVC_NAME=start-argus
> + case "$SVC_STATE" in
> + case "$SVC_STATE_TYPE" in
> + /usr/bin/ssh -tt -l _nrpe quagmire.local sudo /usr/local/bin/start-argus
> Connection to quagmire.local closed.
> + exit 0
> 
> When invoked manually I see the sshd log indicating the connection on
> the remote end and a sudo log indicating the command execution. When
> nagios kicks off the eventhandlers, neither of these log entries is
> seen on the remote side.
> 
> Any clue where else to look?

