fun with (silent) change from HARD to SOFT state

Ethan Galstad egalstad at nagios.org
Fri Jan 23 17:22:39 CET 2009


Michal Svoboda wrote:
> Hello,
> 
> I've discovered a weird behavior, which can be replicated thus:
> 
> 1. Let a service be configured for max attempts N before going to HARD
>    non-ok state
> 
> 2. Make the service fail and wait for N checks to pass (i.e. until the
>    service enters N/N HARD non-ok state); at this point notifications
>    are sent, etc.
> 
> 3. Change the configuration of the service to have M > N max attempts
>    and restart nagios
> 
> 4. Now the state of the service is N/M _HARD_ non-ok
> 
> 5. If the N+1th check results in non-ok, then the service state goes to
>    N+1/M _SOFT_
> 
> 6. If some future check results in ok, then the service performs a SOFT
>    recovery; at the very least, this means no recovery notifications
>    are sent
> 
> 6a. If the condition in (5) does not occur, i.e. the N+1th check results
>     immediately in ok, the service still performs a SOFT recovery from
>     an apparently HARD state (even according to the logs)
> 
> Now, one way to look at this behavior is that it is logical, because
> I've fiddled with the config, and I can expect anomalies and blah blah.
> 
> Another way to look at it is that there have been notifications sent in
> step (2), yet there are no recovery notifications; in other words, once
> the sirens have been sounded (and the fire brigade is on the way, and
> the president is being woken up), they should also be properly shut off.
> 
> So the question is whether or not to introduce a patch that prevents
> entering a SOFT state once a service (or a host) is already in a HARD
> non-ok state.
> 
> 
> With regards,
> Michal Svoboda

Nice catch.  I just added some code that will readjust the current check
attempt at startup if the host/service was in a hard problem state.
That will accommodate config changes related to max check attempts that
are made before (re)start.
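
For illustration only, here is a toy Python model of the reported behavior
and of a startup readjustment along the lines described above. It is not the
Nagios source: the Service class, process_result(), readjust_on_startup()
and scenario() are hypothetical names, the check logic is reduced to the
attempt counter, and setting the current attempt equal to max attempts is an
assumption about what "readjust" means here.

```python
from dataclasses import dataclass

OK, CRITICAL = 0, 2            # simplified state codes for this sketch
SOFT, HARD = "SOFT", "HARD"


@dataclass
class Service:
    # Hypothetical stand-in for a service's retained state.
    max_attempts: int
    current_attempt: int = 1
    state: int = OK
    state_type: str = HARD


def process_result(svc: Service, result: int) -> str:
    """Toy attempt/state-type bookkeeping (an assumption, not Nagios code)."""
    if result != OK:
        if svc.state == OK:
            svc.current_attempt = 1            # first failure of a new problem
        elif svc.current_attempt < svc.max_attempts:
            svc.current_attempt += 1           # still retrying
        svc.state = result
        hard = svc.current_attempt >= svc.max_attempts
        svc.state_type = HARD if hard else SOFT
        return f"{svc.current_attempt}/{svc.max_attempts} {svc.state_type}"
    if svc.state == OK:
        return "OK"
    # Recovery type is derived from the counter, not from the stored state
    # type, which is what lets a HARD problem recover SOFT in steps 6 and 6a.
    kind = HARD if svc.current_attempt >= svc.max_attempts else SOFT
    svc.state, svc.state_type, svc.current_attempt = OK, HARD, 1
    return f"{kind} recovery"


def readjust_on_startup(svc: Service) -> None:
    """Assumed fix: pin the attempt counter for retained HARD problems."""
    if svc.state != OK and svc.state_type == HARD:
        svc.current_attempt = svc.max_attempts


def scenario(apply_fix: bool, next_results: list) -> list:
    svc = Service(max_attempts=3)              # N = 3
    for _ in range(3):
        process_result(svc, CRITICAL)          # ends at 3/3 HARD
    svc.max_attempts = 5                       # M = 5, state retained over restart
    if apply_fix:
        readjust_on_startup(svc)
    return [process_result(svc, r) for r in next_results]


if __name__ == "__main__":
    print(scenario(False, [CRITICAL, OK]))     # steps 5 and 6 of the report
    print(scenario(False, [OK]))               # step 6a of the report
    print(scenario(True, [CRITICAL, OK]))      # same sequence with the readjustment
```

Without the readjustment the model goes from 3/5 HARD to 4/5 SOFT and then
recovers SOFT; with it, the service stays HARD at 5/5 and the recovery is a
HARD one, which is the case where a recovery notification would be expected.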

- Ethan Galstad
