notification_interval seems to be ignored

Zembower, Kevin kzembowe at jhuccp.org
Thu Nov 15 17:53:36 CET 2007


I've written a custom plug-in to monitor the ambient temperature probe
on my Dell PowerEdge server. It's a wrapper around the standard
check_snmp plugin. It's been working correctly for months now. I've
pasted the code at the end of this message.

Yesterday at 2:00 the compressor failed in my server room and the
temperature went up to 90F. My notification was correctly sent, and I
was happy. However, I've received a notification every 5 minutes since
then that the temperature is over 80.

In the service definition, I have notification_interval set to 0:
# Check that ambient temperature from Dell sensor is less than 80 and 90
# degrees F.
# Notify the 'temp' group only.
define service {
        hostgroup_name                  temp_sensor
        service_description             Ambient Temperature 
        check_command                   check_ambtempF!80!90
#For testing, set the temperature too low
#       check_command                   check_ambtempF!60!70
        use                             generic-service
        notification_interval           0 ; set > 0 if you want to be renotified
        contact_groups                  temp
}

In generic-service.nagios2.cnf, the notification_interval is also set to
0. Furthermore, I don't have any escalations defined.

And, I just discovered that even though I disabled notifications for
this service using the nagios2 GUI, I'm still getting notified every
five minutes.

Can anyone suggest anything I can try to fix this behavior? Did I
overlook something in how I wrote the plugin?

My system is Nagios 2.6, as installed by the Debian 3.0 package system.

Thanks for any advice or suggestions.

-Kevin


Kevin Zembower
Internet Services Group manager
Center for Communication Programs
Bloomberg School of Public Health
Johns Hopkins University
111 Market Place, Suite 310
Baltimore, Maryland  21202
410-659-6139 
========================================================
nagios at cn2://etc/nagios2/conf.d$ cat /usr/lib/nagios/plugins/check_ambtempF
#! /usr/bin/perl -w

# check_ambtempF is a perl wrapper around the Nagios check_snmp plugin
#  to check the ambient temperature sensor in Dell PowerEdge servers.

# Written by Kevin Zembower, 13-Sep-2007

use strict;
use Getopt::Std;

my %opts;
getopts('dhc:w:H:',\%opts);

# Use this line below to dump the environment variables for debugging
system "env|sort >/tmp/plugins_env.$$" if defined($opts{d});

my %NAGIOS_ENV = map { $_ => $ENV{$_} } grep /^NAGIOS_/, keys %ENV;

if ( defined($opts{h}) ) {
  print <<EOF;
    Test ambient temperature in Fahrenheit plugin for Nagios
    Copyright (c) 2007 Kevin Zembower

    This plugin is used to check the ambient temperature in degrees
    Fahrenheit on Dell PowerEdge servers using SNMP.

    Requirements:
    This plugin requires /usr/lib/nagios/plugins/check_snmp.

    usage: $0 [-h] [-d] [-H hostaddress] [-w warn] [-c crit]

    -h              print this short help message
    -H hostaddress  address of host to check
    -w warn         Warning threshold value in degrees Fahrenheit
    -c crit         Critical threshold value in degrees Fahrenheit
    -d              Turn on debugging output

EOF
  exit;                                             
}

# for debugging
open(DMP, ">/tmp/temperature.dmp") if defined($opts{d});

my $warn= $opts{w};
my $crit= $opts{c};
my $debug = $opts{d};
my $hostaddress;
if (defined $opts{H}) { 
   $hostaddress=$opts{H}; 
   } elsif (defined($NAGIOS_ENV{NAGIOS_HOSTADDRESS})) { 
   $hostaddress = $NAGIOS_ENV{NAGIOS_HOSTADDRESS};
   } else {
   $hostaddress="127.0.0.1" 
   };

print DMP "hostaddress is $hostaddress.\n" if defined($opts{d});

my $output = "Temperature ";

$_ = `/usr/lib/nagios/plugins/check_snmp -H $hostaddress -o .1.3.6.1.4.1.674.10892.1.700.20.1.6.1.3`;

print DMP $_ if defined($opts{d});
close(DMP) if defined($opts{d});

if ($? != 0) {  #There was an error calling the check_snmp routine...
   $output .= "UNKNOWN: CCP server room temperature could not be determined with host $hostaddress. Probable communications or host failure.\n";
   print $output;
   exit 3;
   }
print "Error code: $?\n" if $debug;

print $_ if $debug;

# All the digits after the equals sign are the temperature in tenths of
# a degree Celsius
(my $tempC) = /=(\d+)/;
$tempC /= 10; #Divide the returned value by 10
print "${tempC}C\n" if $debug;
my $tempF = (9/5*$tempC + 32);
print "${tempF}F\n" if $debug;


if ( defined $crit && $tempF >= $crit ) {
     $output .= "CRITICAL: CCP server room temperature of ${tempF}F exceeds critical temperature of $crit\n";
     $output .= "Probable air conditioning failure.\n";
     print $output;
     exit 2;
   } elsif ( defined $warn && $tempF >= $warn ) {
     $output .= "WARNING: CCP server room temperature of ${tempF}F exceeds warning temperature of $warn\n";
     $output .= "Probable air conditioning failure.\n";
     print $output;
     exit 1;
   } else {
     $output .= "OK: CCP server room temperature is ${tempF}F\n";
     print $output;
     exit 0;
   }


nagios at cn2://etc/nagios2/conf.d$
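
For anyone who wants to exercise the threshold logic without the SNMP
call, here is a minimal sketch that isolates the conversion and the
exit-code mapping used above. The helper names tenths_c_to_f and
classify, and the sample raw readings, are assumptions for illustration
only; they are not part of the plugin as installed.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in check_ambtempF): convert a raw SNMP
# reading, in tenths of a degree Celsius, to degrees Fahrenheit.
sub tenths_c_to_f {
    my ($raw) = @_;
    return 9 / 5 * ($raw / 10) + 32;
}

# Hypothetical helper: map a Fahrenheit reading to a Nagios exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL), testing the critical threshold
# first, in the same order as the plugin.
sub classify {
    my ($temp_f, $warn, $crit) = @_;
    return 2 if defined $crit && $temp_f >= $crit;
    return 1 if defined $warn && $temp_f >= $warn;
    return 0;
}

# Made-up raw readings, checked against the 80/90 thresholds from the
# service definition.
for my $raw (200, 270, 325) {
    my $f = tenths_c_to_f($raw);
    printf "%d -> %.1fF -> exit %d\n", $raw, $f, classify($f, 80, 90);
}
```

This only checks that a given reading maps to the expected state; it
says nothing about how Nagios schedules notifications.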


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users