retry_check_interval issue/confusion

Marc Powell marc at ena.com
Wed Mar 1 20:28:29 CET 2006



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net
> [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of
> prosolutions at gmx.net
> Sent: Wednesday, March 01, 2006 1:14 PM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] retry_check_interval issue/confusion
> 
> 
> In http://nagios.sourceforge.net/docs/2_0/xodtemplate.html it states:
> 
> retry_check_interval: This directive is used to define the number of
> "time units" to wait before scheduling a re-check of the service.
> Services are rescheduled at the retry interval when the have changed to
> a non-OK state. Once the service has been retried max_attempts times
> without a change in its status, it will revert to being scheduled at its
> "normal" rate as defined by the check_interval value. Unless you've
> changed the interval_length directive from the default value of 60, this
> number will mean minutes.
> 
> 
> "is used to define the number of "time units" to wait before scheduling a
> re-check of the service"  What exactly does this mean?  If a service check
> fails, then there is clearly a need to recheck the service.  One would
> normally want to do a recheck of the service in question at a higher rate
> than the normal scheduled (a.k.a. "normal_check_interval" rate).  Am I
> correct in this?  If so, and if that is the purpose of

Yes. This is no different from Nagios (re-)scheduling the next active
check as part of its normal routine. Because you want checks to run more
frequently than normal in an error state, Nagios switches from
normal_check_interval to retry_check_interval when it calculates the
time of the next check.

> retry_check_interval, then the above definition is wrong, because it
> states "number of "time units" to wait before scheduling a re-check". It
> does not say "number of "time units" to wait before re-checking". The
> difference here is that as defined it will wait retry_check_interval units
> of time and then only reschedule the check.

The definition is correct. Under normal circumstances, Nagios performs a
check. If that check returns OK, it schedules the next check
normal_check_interval time units later. If the check returns non-OK, it
schedules the next check retry_check_interval time units later.
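
To make that concrete, here is a rough Python sketch of the decision as
I've described it, combined with the max_attempts rule from the docs you
quoted. It is only a simplified model, not Nagios code: the function
name next_check_delay and the current_attempt counter are made up for
illustration.

```python
def next_check_delay(state, current_attempt, max_attempts,
                     normal_check_interval, retry_check_interval,
                     interval_length=60):
    """Return the seconds to wait before the next check.

    Hypothetical helper modelling the rule described in this thread;
    it is not taken from the Nagios source.
    """
    # Retry at the faster rate only while the service is non-OK and
    # has not yet been retried max_attempts times.
    retrying = state != "OK" and current_attempt < max_attempts
    units = retry_check_interval if retrying else normal_check_interval
    # interval_length converts "time units" to seconds (default 60).
    return units * interval_length


# normal_check_interval=5, retry_check_interval=1, max_attempts=3
print(next_check_delay("OK", 1, 3, 5, 1))        # 300
print(next_check_delay("CRITICAL", 1, 3, 5, 1))  # 60
print(next_check_delay("CRITICAL", 3, 3, 5, 1))  # 300
```

With the default interval_length of 60, a retry_check_interval of 1
means the re-check runs one minute after the failed check completes.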

> Perhaps the confusion has to do with what it means to "schedule a
> check".  I would think that in essence all future checks are "scheduled"
> in the sense that they are defined to take place at certain future
> times, as determined by their normal_check_interval and check_period
> definitions.  But I am guessing that "schedule" in nagiosspeak actually

This is a bad assumption. Nagios does not plan ahead when scheduling
checks. The next check time is calculated only when the current check
has completed. The only planning it does is for the initial checks at
startup, and that is merely to spread them out so that all of the first
checks don't happen at the same time.
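
If it helps, here is a toy Python simulation of that behaviour. It is a
sketch under my own assumptions, not how Nagios is implemented: the
simulate function, the fixed 10-second startup spread and the results
callback are all invented for the example, and it ignores max_attempts
and check_period to keep things short. The point to notice is that the
queue never holds more than one pending check per service, and the next
one is pushed only after the current one has returned.

```python
import heapq


def simulate(services, results, end_time, interval_length=60, spread=10):
    """Run a toy check loop and return a list of (time, name, state).

    services: dict of name -> (normal_check_interval, retry_check_interval)
    results:  callable(name, t) returning "OK" or a non-OK state string
    spread:   hypothetical gap in seconds between initial checks
    """
    queue = []
    # Startup: the only "planning ahead" is spreading out first checks.
    for i, name in enumerate(sorted(services)):
        heapq.heappush(queue, (i * spread, name))

    log = []
    while queue:
        t, name = heapq.heappop(queue)
        if t > end_time:
            break
        state = results(name, t)  # the check runs and completes
        log.append((t, name, state))
        normal, retry = services[name]
        # Only now is the next check time calculated.
        units = normal if state == "OK" else retry
        heapq.heappush(queue, (t + units * interval_length, name))
    return log


# One service checked every 5 units normally, every 1 unit on failure.
# The pretend outage lasts from t=300 up to (not including) t=420.
def fake_result(name, t):
    return "CRITICAL" if 300 <= t < 420 else "OK"


for entry in simulate({"http": (5, 1)}, fake_result, end_time=800):
    print(entry)
```

In that run the checks land at 0, 300, 360, 420 and 720 seconds: five
minutes apart while OK, one minute apart during the outage, and back to
five minutes once the check returns OK again.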

--
Marc



