Host down, still doing active checks, causing multiple unwanted service failures

Marc Powell marc at ena.com
Tue Dec 9 15:17:00 CET 2008


On Dec 9, 2008, at 5:35 AM, Toussaint OTTAVI wrote:

> I agree with you. Parenting / unreachable logic is a very good
> thing. But I think it should allow a service to be declared as a
> child of its host. This parent/child logic can suppress
> 'notifications'. I think it could also suppress the display of
> inaccurate 'status' on the console window.

Our ideas of accuracy would seem to differ ;)

> We do not use email notifications, because we are only 2 guys, and
> this would generate too many messages.

It shouldn't. In your scenario of 1 host down with X number of  
services on it, you should only receive 1 down message and 1 recovery  
message per host event (unless you want more).
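
As a rough illustration of why that is, here is a simplified sketch in
Python. It is not Nagios's actual code; the function name and the state
strings are made up for the example, and it only models the problem
notification, not the recovery one. The assumption is that service
notifications are held back while the host itself is not UP.

```python
def notifications_for_event(host_state, service_states):
    """Return the notifications for one state snapshot (simplified model)."""
    if host_state != "UP":
        # Host problem: one host notification, service notifications suppressed.
        return [f"HOST {host_state}"]
    # Host is fine: each failing service is reported on its own.
    return [
        f"SERVICE {name} {state}"
        for name, state in sorted(service_states.items())
        if state != "OK"
    ]


if __name__ == "__main__":
    services = {"HTTP": "CRITICAL", "SMTP": "CRITICAL", "SSH": "CRITICAL"}
    print(notifications_for_event("DOWN", services))
```

With the host down, the three failing services produce a single host
message rather than three service messages.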

> In fact, the parent/child mechanism seems to be the right way to
> handle hosts located over WANs or routers. In my opinion, it should
> be possible to consider services as children of their parent host.
> This may be a feature request for future versions...

Possibly, but with the additional requirement that regularly scheduled
host checks are enabled for those hosts. Those are still considered
optional and were undesirable in all versions of Nagios before the
current one. If someone were to code the patch, they would need to
ensure scheduled checks were enabled for every host using this new
feature; otherwise the host would never be checked again and would
never leave its critical state.

> Following this idea, I will investigate the following:
> - Hosts associated with each other through parent/child
>   relationships according to WAN topology (already working)
> - For each host, I will create a "parent" service with only a
>   check_alive command
> - Every other service will be a child of this parent service

This is promising.
http://nagios.sourceforge.net/docs/3_0/objecttricks.html#same_host_dependency
will help with the config if you haven't seen it.
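
A minimal sketch of what that config might look like for your layout,
written as a small Python generator so it can be reused per host. The
function name and the host and service names (branch-router, ALIVE,
HTTP, SMTP) are made up for illustration, and I'm assuming, per the page
above, that leaving out dependent_host_name makes the dependency apply
within the same host. Check the output against your own object
definitions before using it.

```python
def render_same_host_dependency(host_names, parent_service, child_services,
                                failure_states="w,u,c"):
    """Render one Nagios servicedependency object as config text."""
    if not host_names or not child_services:
        raise ValueError("need at least one host and one child service")
    directives = [
        ("host_name", ",".join(host_names)),
        ("service_description", parent_service),
        # No dependent_host_name: assumed to mean "same host as host_name".
        ("dependent_service_description", ",".join(child_services)),
        ("execution_failure_criteria", failure_states),
        ("notification_failure_criteria", failure_states),
    ]
    width = max(len(name) for name, _ in directives) + 2
    body = "\n".join(
        f"    {name.ljust(width)}{value}" for name, value in directives
    )
    return "define servicedependency{\n" + body + "\n    }\n"


if __name__ == "__main__":
    # Hypothetical host and services; ALIVE would run the check_alive command.
    print(render_same_host_dependency(["branch-router"], "ALIVE",
                                      ["HTTP", "SMTP"]))
```

The w,u,c criteria mean the child services are neither checked nor
notified on while the parent service is WARNING, UNKNOWN or CRITICAL;
adjust to taste.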

> Am I the only one having this problem?

I don't consider it a problem myself, just that Nagios doesn't work as
you want it to in your environment. I personally prefer the current
behavior since it provides more accurate information over a wider
variety of outage scenarios.

--
Marc


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users