SIGSEGV when trying to use eventhandler

nagios nagios at chadmail.com
Tue May 18 17:42:15 CEST 2010


Hi guys,
I am new to Nagios, but so far it's working well for me and is monitoring a 
number of real and virtual hosts. Nagios 3.0.6 is installed on an 
OpenSolaris 2009.06 host and monitors routers, other devices, and VMs in 
VirtualBox.

My issue is that when I try to add an event handler, I get a SIGSEGV and 
Nagios restarts.


I have posted the details of the code I am using and the error at 
http://pastebin.com/vBb7xTND and also below (though it reads better on 
pastebin).

I have tried several different scripts and code combinations (even empty 
scripts and commands like ls) and all give the same error.

Can anyone help me work out why it's happening?

Thanks.

hosts.cfg
<snip>
define host{
        use             windows-server  ; Inherit default values from a template
        host_name       Server6         ; The name we're giving to this host
        max_check_attempts              4
        event_handler   vboxmanage-restart ; Restart the vm
        alias           Server 6 - Win2008 Server       ; A longer name associated with the host
        address         192.168.0.6     ; IP address of the host
        }
<snip>
 
commands.cfg - note I have tried various scripts here incl. ones from the 
nagios guides/books and all give the same error.
<snip>
# 'vboxmanage-restart' command definition
define command{
        command_name vboxmanage-restart
#        command_line ls
        command_line sudo -u nas $USER1$/eventhandler/event_vboxmanage_restart -S $SERVICESTATE$ -T $SERVICESTATETYPE$ -A $SERVICEATTEMPT$ -H Server6
        }
<snip>
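
A note on the command above, offered as something to check rather than a 
confirmed diagnosis: the event handler is attached to a host definition, but 
the command line passes $SERVICESTATE$, $SERVICESTATETYPE$ and 
$SERVICEATTEMPT$. The Nagios macro documentation lists service macros as 
unavailable in host event handlers, so a host-level variant would use the 
host macros instead. The sketch below assumes the same script path and sudo 
user as above; whether this has any bearing on the SIGSEGV is an assumption.

```cfg
# Sketch only: host-macro variant of the command above (assumed, untested)
define command{
        command_name vboxmanage-restart
        command_line sudo -u nas $USER1$/eventhandler/event_vboxmanage_restart -S $HOSTSTATE$ -T $HOSTSTATETYPE$ -A $HOSTATTEMPT$ -H $HOSTNAME$
        }
```

Note that $HOSTSTATE$ expands to UP, DOWN or UNREACHABLE rather than OK, 
WARNING or CRITICAL, so the state matching in the script would need 
adjusting to suit.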
 
nagios.log
[1274193005] HOST ALERT: Server6;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
[1274193005] Caught SIGSEGV, shutting down...
[1274193005] Nagios 3.0.6 starting... (PID=5231)
[1274193005] Local time is Wed May 19 00:30:05 EST 2010
[1274193005] LOG VERSION: 2.0
[1274193005] Finished daemonizing... (New PID=5232)
 
The scripts (yes, I know they should not be mode 777, but this is just to 
show it is not a permissions problem):
-rwxrwxrwx 1 nagios nagios 1580 2010-05-18 00:52 event_vboxmanage_restart
-rwxrwxrwx 1 nagios nagios 3815 2010-05-18 23:07 filename.out
-rwxrwxrwx 1 nagios nagios 2211 2010-05-19 00:23 restart-httpd
nas@nas:/usr/nagios/libexec/eventhandler#
 
The script works fine when run as the nagios user via sudo (I added nagios 
to /etc/sudoers):
nas@nas:…sr/nagios/libexec/eventhandler$ whoami
nagios
nas@nas:…sr/nagios/libexec/eventhandler$ sudo -u nas ./event_vboxmanage_restart -S CRITICAL -T HARD -A 1 -H Server6
CRITICAL(C) 2005-2010 Sun Microsystems, Inc.
 
The event_vboxmanage_restart script. Not that this is likely to be at fault 
(I do not think so anyway, as I get the error with other very simple scripts 
too).
#!/usr/bin/perl
 
use Getopt::Long;
use Net::Telnet ();
use Switch;
my ($state,$type,$attempt,$cmd,$hostname);
open(MYOUTFILE, ">>/usr/nagios/libexec/eventhandler/filename.out");
 
&processargs;
print "$state"; 
switch ($state) {
    case "OK"          { &state_OK }
    case "WARNING"     { &state_WARNING }
    case "UNKNOWN"     { &state_UNKNOWN }
    case "CRITICAL"    { &state_CRITICAL }
    else               { print "unrecognised state>$state" }
}
print MYOUTFILE">$state<";
print MYOUTFILE">$hostname<";
close(MYOUTFILE);
exit 0;
 
sub processargs {
 
GetOptions (
    "S|state=s" => \$state,
    "T|type=s" => \$type,
    "A|attempt=i" => \$attempt,
    "H|hostname=s" => \$hostname,
    "C|command=s" => \$cmd,
);
}
 
### Stub handlers: these states currently need no action
sub print_state {
}
sub state_OK {
}
sub state_WARNING {
}
sub state_UNKNOWN {
}
### FUNC: on a HARD state, or the 3rd SOFT attempt, restart the VM
sub state_CRITICAL {
    if ("$type" eq "HARD" or ("$type" eq "SOFT" and $attempt == 3)) {
        # Ask the guest to shut down cleanly and log the output
        @result = `VBoxManage controlvm $hostname acpipowerbutton`;
        foreach (@result) {
            print MYOUTFILE "$_\n";
        }
        sleep(60);
        # Force a power-off in case the guest ignored the ACPI request
        @result = `VBoxManage controlvm $hostname poweroff`;
        foreach (@result) {
            print MYOUTFILE "$_\n";
        }
        # Start the VM again and print the second line of its output
        @result = `VBoxManage startvm $hostname`;
        print "$result[1]";
    }
    else { }
}

As the log below shows, it all works fine (i.e. no SIGSEGVs) if I comment 
out the event_handler line in the hosts.cfg file.
[05-19-2010 01:33:50] SERVICE ALERT: Server6;Explorer;OK;HARD;1;Explorer.EXE: Running
[05-19-2010 01:32:50] SERVICE ALERT: Server6;Uptime;OK;HARD;1;System Uptime - 0 day(s) 0 hour(s) 9 minute(s)
[05-19-2010 01:32:40] SERVICE ALERT: Server6;C:\ Drive Space;OK;HARD;1;c:\ - total: 39.90 Gb - used: 9.19 Gb (23%) - free 30.71 Gb (77%)
[05-19-2010 01:32:10] SERVICE ALERT: Server6;CPU Load;OK;HARD;1;CPU Load 3% (5 min average)
[05-19-2010 01:25:00] HOST ALERT: Server6;UP;SOFT;4;PING OK - Packet loss = 0%, RTA = 0.44 ms
[05-19-2010 01:23:50] SERVICE ALERT: Server6;Explorer;CRITICAL;HARD;1;Connection refused
[05-19-2010 01:23:50] HOST ALERT: Server6;DOWN;SOFT;3;PING CRITICAL - Packet loss = 100%
[05-19-2010 01:23:00] SERVICE ALERT: Server6;Uptime;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:22:50] SERVICE ALERT: Server6;C:\ Drive Space;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:22:30] HOST ALERT: Server6;DOWN;SOFT;2;PING CRITICAL - Packet loss = 100%
[05-19-2010 01:22:20] SERVICE ALERT: Server6;CPU Load;CRITICAL;HARD;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:21:10] HOST ALERT: Server6;DOWN;SOFT;1;PING CRITICAL - Packet loss = 100%
[05-19-2010 01:21:00] SERVICE ALERT: Server6;Uptime;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:20:50] SERVICE ALERT: Server6;C:\ Drive Space;CRITICAL;SOFT;1;CRITICAL - Socket timeout after 10 seconds
[05-19-2010 01:02:10] SERVICE ALERT: Server6;CPU Load;OK;SOFT;1;CPU Load 0% (5 min average)
[05-19-2010 01:00:50] SERVICE ALERT: Server6;Uptime;OK;SOFT;1;System Uptime - 0 day(s) 0 hour(s) 57 minute(s)
[05-19-2010 01:00:40] SERVICE ALERT: Server6;C:\ Drive Space;OK;SOFT;1;c:\ - total: 39.90 Gb - used: 9.19 Gb (23%) - free 30.71 Gb (77%)