Alternate check interval when state becomes CRITICAL

Justin Pasher justinp at newmediagateway.com
Tue Feb 10 22:23:24 CET 2009


Thomas Guyot-Sionnest wrote:
>> What I would like to do is have the check interval change to one
>> minute when the state becomes CRITICAL, but keep the notifications at 5
>> minute intervals.
>>     
>
> It's simple: use an eventhandler.
>
> You can look at this for inspiration, although you would likely need
> some more details to understand what I'm trying to do there...
> http://solaris.beaubien.net/~dermoth/media/nagios/handle_stall_counter
>   

Alrighty. I took the script above as the base and tweaked it to my 
setup. The theory behind the code is working, but there is still one 
caveat. When the service goes into a HARD CRITICAL state, the event 
handler is called and it correctly sends the command to Nagios to update 
the check interval. The problem is that when the command is sent to 
Nagios, Nagios has already set the next scheduled check (which defaults 
to five minutes out). This means the next service check still won't 
happen for another five minutes. After the next check occurs, if the 
service is still in a HARD CRITICAL state, the NEXT scheduled check will 
follow the new check interval that was set by the event handler (one 
minute). At that time, it will continue to perform checks at one minute 
intervals until the service is normal again.
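
To make the timing concrete, here is a toy model of what I'm seeing. It is 
not Nagios code, and the function name check_times is made up for 
illustration. Times are minutes after the check that flipped the state; 
the only assumption is that a new interval applies when the next check 
gets scheduled, not to the one already queued.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Toy model only: returns the times (in minutes) of the next $count checks
# after an event handler changes the interval from $old to $new.
sub check_times {
    my ($old, $new, $count) = @_;
    my @times;
    my $next = $old;          # already queued before the handler ran
    for (1 .. $count) {
        push @times, $next;
        $next += $new;        # later checks pick up the new interval
    }
    return @times;
}

# Going CRITICAL: interval changes from 5 to 1.
print join(', ', check_times(5, 1, 4)), "\n";   # prints "5, 6, 7, 8"
# Recovering: interval changes from 1 back to 5.
print join(', ', check_times(1, 5, 3)), "\n";   # prints "1, 6, 11"
```

The first line of output is the problem case: the first recheck still 
lands five minutes out.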

Once the service is back to a normal state, the event handler is called
again, which sends the command to Nagios to change the check interval
back to five minutes. However, like before, the next scheduled check has
already been set (one minute out), so the check happens again in one
minute. If the service is still up, it applies the check interval set by
the event handler.

In the latter instance, it's not that big of a deal since it just causes
another check a little sooner than usual. However, in the first
instance, because the next scheduled check is still five minutes out the
first time around, it defeats the whole purpose of having the custom
event handler.

Do you know any way around this? I've attached the service info and 
event handler for reference.


Justin Pasher


==============================
define service {
    host_name                       myhost
    service_description             www.myhost.com
    check_command                   check_http2!www.myhost.com!25!50
    contact_groups                  admins
    event_handler                   change_check_interval
    use                             nmg-service
}

define command {
    command_name    change_check_interval
    command_line    /etc/nagios3/change_check_interval $HOSTNAME$ $HOSTADDRESS$ $SERVICEDESC$ $SERVICESTATE$ $SERVICESTATETYPE$ $SERVICEATTEMPT$
}


==============================
/etc/nagios3/change_check_interval:


#!/usr/bin/perl

use strict;
use warnings;

# Fork to let Nagios keep on working...
if (fork != 0) {
    # Nobody cares if fork failed...
    warn("Daemonizing... Thanks for calling me.");
    exit(0);
}

die("Usage: $0 <hostname> <hostaddress> <service desc> <state> <statetype> <stateattempt>")
    unless (@ARGV == 6);

my $commandfile     = '/var/lib/nagios3/rw/nagios.cmd';
my $hostname        = $ARGV[0];
my $hostaddress     = $ARGV[1];
my $servicedesc     = $ARGV[2];
my $state           = $ARGV[3];
my $statetype       = $ARGV[4];
my $stateattempt    = $ARGV[5];

# If state becomes HARD CRITICAL, shorten the check interval to one
# minute.
if ($state eq 'CRITICAL' && $statetype eq 'HARD')
{
    # Append the external command to the Nagios command file.
    open(my $cmd, '>>', $commandfile)
        or die("Cannot open $commandfile: $!");
    printf $cmd "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;%s;%s;1\n",
        time, $hostname, $servicedesc;
    close($cmd);
    die("Check interval for $hostname set to 1 minute");
}

# If state becomes HARD OK, revert the check interval to the normal
# five minutes.
if ($state eq 'OK' && $statetype eq 'HARD')
{
    # Append the external command to the Nagios command file.
    open(my $cmd, '>>', $commandfile)
        or die("Cannot open $commandfile: $!");
    printf $cmd "[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;%s;%s;5\n",
        time, $hostname, $servicedesc;
    close($cmd);
    die("Check interval for $hostname set to 5 minutes");
}
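
One idea I have not tested: have the handler also queue a 
SCHEDULE_FORCED_SVC_CHECK external command, so the check that is already 
queued gets pulled forward instead of waiting out the old interval. The 
sketch below only builds the command lines. The helper name 
build_commands is hypothetical, and it assumes the default 
interval_length of 60 seconds, so an interval of 1 means 60 seconds.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the external command lines for the handler
# to append to the command file. $now is a parameter so the output can be
# checked without depending on the clock.
sub build_commands {
    my ($now, $hostname, $servicedesc, $interval) = @_;
    my @cmds;
    # Same command the handler above already sends.
    push @cmds, sprintf("[%lu] CHANGE_NORMAL_SVC_CHECK_INTERVAL;%s;%s;%d\n",
                        $now, $hostname, $servicedesc, $interval);
    # Assumption: reschedule the next check one new interval from now
    # (interval is in minutes, given interval_length=60).
    push @cmds, sprintf("[%lu] SCHEDULE_FORCED_SVC_CHECK;%s;%s;%lu\n",
                        $now, $hostname, $servicedesc, $now + $interval * 60);
    return @cmds;
}

print build_commands(time, 'myhost', 'www.myhost.com', 1);
```

Whether forcing the check is appropriate (it bypasses the check period 
and the active checks setting) is something I would want confirmed 
before relying on it.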

