soft state turns to hard state after a delay

Marc Powell marc at ena.com
Mon Mar 14 20:03:15 CET 2005



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
> admin at lists.sourceforge.net] On Behalf Of Matthias Bertschy
> Sent: Monday, March 14, 2005 11:57 AM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] soft state turns to hard state after a delay
> 
> Hello,
> 
> Right now I am evaluating Nagios for extensive use within our
> company. I have a question about the soft/hard state transition.
> 
> On the page:
> http://nagios.sourceforge.net/docs/2_0/statetypes.html
> I read:
> Hard states occur (...) (w)hen a service check results in a non-OK state
> and it has been (re)checked the number of times specified by the
> max_check_attempts option in the service definition.
> 
> Would it be possible to specify a time period allowed for the service to
> recover instead of a number of checks? For example, I would like to let
> a service (such as Apache) try to recover on its own for 30 minutes
> before warning the administrator. Is that possible (ideally without
> tweaking max_check_attempts and the delay between checks)?


You can't do that specifically with config options but you could
probably submit a scheduled downtime external command for the service.
That definitely feels like a hack though.

The combination of max_check_attempts and retry_check_interval is there
specifically for this purpose. What is Nagios supposed to be doing
during this recovery time in your scenario? Is it checking the service,
or are you going to just show it down for 30 minutes without checking to
see if it has recovered during that period? Using the combination of
normal_check_interval, retry_check_interval and max_check_attempts
allows for a lot of flexibility. For example, there are several ways to
meet your scenario, each giving the service about 30 minutes between the
first non-OK result and the alert:

normal_check_interval  5   ; check every 5 minutes by default
retry_check_interval   30  ; wait 30 minutes between checks when not OK
max_check_attempts     2   ; alert after the second non-OK check result

normal_check_interval  5   ; check every 5 minutes by default
retry_check_interval   10  ; wait 10 minutes between checks when not OK
max_check_attempts     4   ; alert after the 4th non-OK check result

normal_check_interval  5   ; check every 5 minutes by default
retry_check_interval   3   ; wait 3 minutes between checks when not OK
max_check_attempts     11  ; alert after the eleventh non-OK check result
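
As a quick sanity check on the three variants above, the small sketch below works out how long a service stays in a soft state before the alert. It assumes the first non-OK result counts as attempt 1, that each further attempt happens one retry_check_interval later, and that interval_length is left at 60 seconds; the function name is just for illustration.

```python
def minutes_until_hard_state(retry_check_interval, max_check_attempts, interval_length=60):
    """Minutes between the first non-OK result and the hard state.

    The first failed check is attempt 1, so only (max_check_attempts - 1)
    retries remain, each spaced retry_check_interval time units apart.
    interval_length is the number of seconds per time unit (assumed 60).
    """
    retries = max_check_attempts - 1
    return retries * retry_check_interval * interval_length / 60


if __name__ == "__main__":
    # (retry_check_interval, max_check_attempts) pairs from the examples above
    for retry, attempts in [(30, 2), (10, 4), (3, 11)]:
        window = minutes_until_hard_state(retry, attempts)
        print("retry={} attempts={} -> {} minutes".format(retry, attempts, window))
```

The shorter the retry interval, the sooner Nagios notices a recovery within that window, at the cost of running more checks.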

--
Marc

