Failover configuration - "no output"

Lehman, John LEHMANJ at us.panasonic.com
Thu Oct 5 22:15:31 CEST 2006


Let me start off by saying that I am sorry this is so long, but I wanted to
include all of the necessary information.

OK. I am going absolutely crazy at this point. I am getting "no output"
as a result of the following configuration for Nagios "failover" as
defined in the manual.

The following is a file I have created called failover.cfg 

#################### Host Group Listing ################ 
define hostgroup{ 
hostgroup_name Nagios_Master 
alias Nagios-Master 
contact_groups Emails_to_GNPC_Staff 
members Nagios-Master 
} 

################### Host Definition Listing #################

# Generic host definition template 
define host{ 
name generic-host10 ; The name of this host template - referenced in other host definitions, used for template recursion/resolution
check_command check-nagios 
max_check_attempts 10 
notification_interval 10 
notification_period 24x7 
notification_options d,r 
notifications_enabled 1 ; Host notifications are enabled 
event_handler_enabled 1 ; Host event handler is enabled 
flap_detection_enabled 1 ; Flap detection is enabled 
process_perf_data 1 ; Process performance data 
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL HOST, JUST A TEMPLATE!
} 

define host{ 
use generic-host10 ; Name of host template to use 
host_name Nagios-Master 
alias Nagios Master 
address 10.130.4.80 
} 


################### Service Definition Listing #################

# Generic service definition template 
define service{ 
name generic-service10 ; The 'name' of this service template, referenced in other service definitions
service_description NAGIOS 
is_volatile 0 
check_period 24x7 
max_check_attempts 3 
normal_check_interval 5 
retry_check_interval 1 
contact_groups Emails_to_GNPC_Staff 
notification_interval 15 
notification_period 24x7 
notification_options c,r 
check_command check_nagios 
active_checks_enabled 1 ; Active service checks are enabled 
passive_checks_enabled 1 ; Passive service checks are enabled/accepted 
parallelize_check 1 ; Active service checks should be parallelized (disabling this can lead to major performance problems)
obsess_over_service 1 ; We should obsess over this service (if necessary)
check_freshness 0 ; Default is to NOT check service 'freshness' 
notifications_enabled 1 ; Service notifications are enabled 
event_handler_enabled 1 ; Service event handler is enabled 
flap_detection_enabled 1 ; Flap detection is enabled 
process_perf_data 1 ; Process performance data 
retain_status_information 1 ; Retain status information across program restarts
retain_nonstatus_information 1 ; Retain non-status information across program restarts
register 0 ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
} 

define service{ 
use generic-service10 
host_name Nagios-Master 
service_description HOST 
check_command handle-master-host-event 
normal_check_interval 5 
notification_interval 5 
} 

define service{ 
use generic-service10 
host_name Nagios-Master 
service_description PROCESS 
check_command handle-master-proc-event 
normal_check_interval 5 
notification_interval 5 
} 


Here are the command definitions from checkcommands.cfg:

define command{ 
command_name handle-master-host-event 
command_line /usr/lib/nagios/plugins/eventhandlers/redundancy-scenario1/handle-master-host-event $HOSTSTATE$ $STATETYPE$
} 

define command{ 
command_name handle-master-proc-event 
command_line /usr/lib/nagios/plugins/eventhandlers/redundancy-scenario1/handle-master-proc-event $SERVICESTATE$ $STATETYPE$
} 


In the nagios.log I have the following: 

[1159979791] HOST NOTIFICATION: emailuser;Nagios-Master;DOWN;host-notify-by-email;check_nagios: Unknown argument - (null)


Here is what I get from the GUI for the services identified above:

Host           Service  Status    Last Check           Duration       Attempt  Status Information
Nagios-Master  HOST     OK        10-04-2006 12:39:54  1d 22h 5m 17s  1/3      (No output!)
               PROCESS  CRITICAL  10-04-2006 12:41:26  1d 22h 8m 41s  1/3      (No output!)
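
For anyone trying to reproduce the "(No output!)" text, here is a rough sketch of what Nagios records for a check command: the status text comes from the first line of the command's standard output and the state from its exit code, and an empty output is displayed as "(No output!)". The helper name show_check_result and the stand-in silent_handler are made up for illustration and are not part of Nagios; the mapping of any other exit code to UNKNOWN is a simplification for this sketch.

```shell
#!/bin/sh
# Sketch only: show_check_result is a hypothetical helper, not a Nagios tool.
# It runs a command and reports the state and status text a check would yield.
show_check_result() {
    # Capture stdout only; stderr is discarded for this sketch.
    output=$("$@" 2>/dev/null)
    status=$?

    # Only the first line of stdout is used as the status text.
    first_line=$(printf '%s\n' "$output" | head -n 1)
    if [ -z "$first_line" ]; then
        first_line="(No output!)"
    fi

    # Standard plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL.
    # Anything else is reported as UNKNOWN in this sketch.
    case "$status" in
        0) state="OK" ;;
        1) state="WARNING" ;;
        2) state="CRITICAL" ;;
        *) state="UNKNOWN" ;;
    esac

    echo "$state: $first_line"
}

# Stand-in (assumption) for a script that works silently and exits 0.
silent_handler() {
    return 0
}

show_check_result silent_handler DOWN HARD   # prints "OK: (No output!)"
```

To try this against a real script, replace silent_handler with the full path from the command_line and pass literal values in place of the macros.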




I hope this helps someone point me in the right direction.

This is the directory and file contents for handle-master-host-event:

/usr/lib/nagios/plugins/eventhandlers/redundancy-scenario1/handle-master-host-event


#!/bin/sh 

# REDUNDANCY EVENT HANDLER SCRIPT 
# Written By: Ethan Galstad (nagios at nagios.org) 
# Last Modified: 02-18-2002 
# 
# This is an example script for implementing redundancy. 
# Read the HTML documentation on redundant monitoring for more 
# information on what this does. 

# Location of the echo and mail commands 
echocmd="/bin/echo" 
mailcmd="/bin/mail" 

# Location of the event handlers 
eventhandlerdir="/usr/lib/nagios/plugins/eventhandlers" 


# Only take action on hard host states... 
case "$2" in 
HARD) 

case "$1" in 
DOWN) 
# The master host has gone down! 
# We should now become the master host and take 
# over the responsibilities of monitoring the 
# network, so enable notifications... 

`$eventhandlerdir/enable_notifications` 


# Notify someone of what has happened with the original 
# master server and our taking over the monitoring 
# responsibilities. No one was notified of the master 
# host going down, since the notification would have 
# occurred while we were in standby mode, so this is a good idea... 

#`$echocmd "Master Nagios host is down!" | /bin/mail -s "Master Nagios Host Is Down" admin@somedomain.com`
#`$echocmd "Slave Nagios host has entered ACTIVE mode and taken over network monitoring responsibilities!" | $mailcmd -s "Slave Nagios Host Has Entered ACTIVE Mode" admin@somedomain.com`

;; 

UP) 
# The master host has recovered! 
# We should go back to being the slave host and 
# let the master host do the monitoring, so 
# disable notifications... 

`$eventhandlerdir/disable_notifications` 


# Notify someone of what has happened. Users were 
# already notified of the master host recovery because we 
# were in active mode at the time the recovery happened. 
# However, we should let someone know that we're switching 
# back to standby mode... 

#`$echocmd "The master Nagios host has recovered, so the slave Nagios host has returned to standby mode..." | $mailcmd -s "Slave Nagios Host Has Returned To STANDBY Mode" admin@somedomain.com`

;; 

esac 
;; 

esac 
exit 0
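
The decision logic of the script above can be summarized in a small sketch. The function name handler_action is hypothetical, and it only prints the name of the helper the script would call instead of running it. It shows that the script acts only on HARD states, and that if the two arguments arrive empty or unrecognized it does nothing.

```shell
#!/bin/sh
# Sketch only: handler_action is a hypothetical name used for illustration.
# It mirrors the nested case statements of the handler, but prints the
# action instead of running enable_notifications/disable_notifications.
handler_action() {
    state=$1
    state_type=$2

    case "$state_type" in
        HARD)
            case "$state" in
                DOWN) echo "enable_notifications" ;;
                UP)   echo "disable_notifications" ;;
                *)    echo "none" ;;
            esac
            ;;
        *)
            # SOFT states, empty arguments, or anything unrecognized.
            echo "none"
            ;;
    esac
}

handler_action DOWN HARD   # prints "enable_notifications"
handler_action UP HARD     # prints "disable_notifications"
handler_action DOWN SOFT   # prints "none"
handler_action "" ""       # prints "none"
```

In the real script every one of these paths ends at "exit 0" without writing anything to standard output.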



I am fairly new so "any" advice is appreciated.

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users