Retry interval on hard states

Tom Sommer mail at tomsommer.dk
Fri Mar 7 18:24:05 CET 2008


Marc Powell wrote:
>> Hi,
>>
>> I wish to set up the following check interval:
>>
>> Check the service every 5 minutes
>>   -> If down then check the service every 1 minute for 3 minutes/times
>>    -> If still down, notify and continue to check the service every 1
>> minute until it recovers.
>>
>> I'm having a few problems with the last condition. Basically once the
>> notification is sent, Nagios seems to revert to the "normal" check
>> interval, which is 5 minutes - resulting in a substantial delay for
>> the recovery notification to be sent.
>
> This is expected behavior. I'm curious, what kind of environment are you
> in where a delay of up to 5 minutes in notification of recovery is
> 'substantial'?
>   
Well, the current environment/system we run has the above behavior,
and to be honest, I don't understand why it's not the default behavior.
Normally you would want to know that a service has recovered as soon as
possible; I would have it check every 30 seconds if I could.
It's especially important for people who are on call: they receive a
notification, resolve the issue, and then await confirmation of
recovery. 5 minutes is a long wait.

A simple setting to control this interval sounds trivial to add, and I
would think it is almost required for a monitoring system.
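
To make the difference concrete, here is a rough sketch in Python. It is
not Nagios code and does not claim to match the Nagios scheduler; the
function name, its parameters and the retry_when_hard switch are all
made up for illustration. It assumes checks run at exact minute marks,
that the third consecutive failed check triggers the notification, and
that a single outage happens within the simulated window.

```python
def simulate(down_from, down_until, normal=5, retry=1,
             max_attempts=3, retry_when_hard=False, horizon=60):
    """Simulate one outage; times are in minutes.

    Returns (notified_at, recovery_seen_at). Hypothetical model only.
    """
    t = 0               # time of the next check
    failures = 0        # consecutive failed checks so far
    notified_at = None  # time the problem notification was sent
    while t <= horizon:
        ok = not (down_from <= t < down_until)
        if ok:
            if notified_at is not None:
                # First OK check after the notification: recovery is seen.
                return notified_at, t
            failures = 0
            t += normal
        else:
            failures += 1
            if failures >= max_attempts:
                # Problem is confirmed: notify once, then keep checking.
                if notified_at is None:
                    notified_at = t
                # The behavior in question: which interval applies now?
                t += retry if retry_when_hard else normal
            else:
                # Still retrying before the notification is sent.
                t += retry
    return notified_at, None


# Service goes down at minute 1 and is fixed at minute 7.5.
print(simulate(1, 7.5))                        # (7, 12)
print(simulate(1, 7.5, retry_when_hard=True))  # (7, 8)
```

In this model the notification goes out at minute 7 either way, but the
recovery is only seen at minute 12 when the check falls back to the
5 minute interval, versus minute 8 when it keeps the 1 minute interval.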
--
Tom Sommer
