notification_interval < normal_check_interval

Andreas Ericsson ae at op5.se
Tue Apr 19 10:30:25 CEST 2011


On 04/19/2011 03:35 AM, Paul M Dubuc wrote:
> Mike Chesnut wrote:
>> On 04/18/2011 12:08 PM, Paul M. Dubuc wrote:
>>>
>>> Mike Chesnut wrote:
>>>> I have a check that I only want to occur once a day, so I do this in the
>>>> service definition:
>>>>
>>>>              normal_check_interval   1440
>>>>
>>>> However, when it fails, I want it to retry every 10 minutes, so I do this:
>>>>
>>>>              retry_check_interval    10
>>>>
>>>> My default notification_interval is set to 15.  When I run a pre-flight
>>>> check, I get this:
>>>>
>>>> Warning: Service '<service>' on host '<host>' has a notification
>>>> interval less than its check interval!  Notifications are only re-sent
>>>> after checks are made, so the effective notification interval will be
>>>> that of the check interval.
>>>>
>>>> Is that warning telling me that notifications are only sent when a
>>>> normal check occurs?  What I want, in the event of a failure, is for
>>>> notifications to keep being sent (every 15 minutes) until the
>>>> service recovers.  Will that be the case?
>>>>
>>>> Thanks,
>>>> Mike
>>>>
>>>
>>> What is the value of max_check_attempts?  It's at the end of that number of
>>> checks that the service enters a hard state and a notification is sent.  If
>>> the value is 1, then the warning makes perfect sense because no retry checks
>>> will be done.
>>
>> max_check_attempts is 2.  Is that a sensible number here?
>>
>> Thanks,
>> Mike
>>
> 
> OK, I think it will work this way:  You will get a notification if there
> is still a problem after the retry check.  After that, the check
> interval reverts to the normal interval and, if the problem persists,
> you will not get another notification until after the next normal
> interval check.  If the problem clears up, you will not get a recovery
> notification until then either, unless you rerun the check manually.
> This doesn't sound like what you want.  I don't think you can do what
> you want without shortening the normal check interval.
> 

That's wrong, unfortunately. It will work exactly as the warning states.

First the max check attempts have to be reached, and only then do
notifications start being sent out. From that point on you get one
notification each time the check is run (once every 1440 * interval_length
seconds) for as long as the service remains in a non-OK state.

That follows from these rules:
1. No notifications are sent until the max check attempts are reached.
2. No notifications are sent unless the service has just been checked.
3. The service gets re-checked at the normal check interval once the
   max check attempts have been reached.
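
To make the three rules concrete, here is a rough sketch in Python. It is
not Nagios code, only a simplified model of the rules above; the function
name and its parameters are made up for illustration. Times are in minutes,
which assumes the default interval_length of 60, and the service is assumed
to stay non-OK for the whole period.

```python
def notification_times(check_interval, retry_interval,
                       notification_interval, max_check_attempts, horizon):
    """Return the minutes (counted from the first failed check) at which a
    notification would go out for a service that never recovers.

    Hypothetical helper: a simplified model of the three rules, not the
    actual Nagios scheduling logic.
    """
    sent = []
    t = 0             # time of the current check
    attempt = 1       # the first failing check counts as attempt 1
    last_sent = None  # time of the most recent notification, if any

    while t <= horizon:
        if attempt >= max_check_attempts:
            # Rule 1: max check attempts reached, notifications are allowed.
            # Rule 2: a notification is only considered right after a check.
            if last_sent is None or t - last_sent >= notification_interval:
                sent.append(t)
                last_sent = t
            # Rule 3: from here on, use the normal check interval.
            t += check_interval
        else:
            # Max check attempts not reached yet: retry sooner, send nothing.
            attempt += 1
            t += retry_interval
    return sent


# Mike's settings: one check per day, 10 minute retries, 15 minute
# notification interval, max_check_attempts 2, watched for about two days.
print(notification_times(1440, 10, 15, 2, 3000))
```

In this model the gap between notifications is 1440 minutes, not 15, because
the 15 minute interval can only take effect when a check actually runs.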

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
