Too many service dependencies - pre-flight check stalled

Andreas Ericsson ae at op5.se
Fri Sep 30 15:21:36 CEST 2011


On 09/30/2011 01:40 AM, Mohit Chawla wrote:
> Hello,
> 
> On Fri, Sep 30, 2011 at 3:06 AM, Andreas Ericsson<ae at op5.se>  wrote:
> 
>> A) Implement the algorithm for dependencies that Jean Gabès hacked up for
>> parent/child detection.
>>
>> B) Implement service parents. It's one of the (not that many) TODO items
>> for Nagios 4. That way you'd get the above check logic for free.
>>
>> C) Rework how dependencies are organized in Nagios core memory and how
>> they link to other objects. That patch would have to go in Nagios 4 as
>> well, but I wouldn't be opposed to it.
>>
>> D) Remove the service dependencies which are for ping. Especially if you
>> use ICMP (ping) for checking if the host is up, since you're otherwise
>> using redundant dependency logic.
> 
> But generally, to narrow down the problem, is this really just a
> simple case of too many dependencies, or would there be any further
> investigation patterns I should be exploring ?

It's not a case of too many dependencies as such. It's a case of having
far more dependencies than the code was originally tested for, so you're
running into an algorithm that has a high asymptotic complexity.
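
To illustrate what a lower-complexity check can look like, here is a rough
sketch of cycle detection over a dependency graph using an iterative
depth-first search, which visits each object and each dependency once. This
is not the Nagios code and not Jean Gabès's algorithm; the function name and
the (dependent, master) pair format are assumptions made for the example.

```python
from collections import defaultdict

def find_cycle(edges):
    """Return one dependency cycle as a list of nodes, or None.

    edges: iterable of (dependent, master) pairs (hypothetical format).
    """
    graph = defaultdict(list)
    for dependent, master in edges:
        graph[dependent].append(master)

    WHITE, GREY, BLACK = 0, 1, 2      # unvisited, on current path, finished
    color = defaultdict(int)          # every node starts out WHITE

    for start in list(graph):
        if color[start] != WHITE:
            continue
        color[start] = GREY
        path = [start]                # nodes on the current DFS path
        stack = [iter(graph[start])]  # one neighbour iterator per path node
        while stack:
            for nxt in stack[-1]:
                if color[nxt] == GREY:
                    # nxt is already on the current path: that is a cycle
                    return path[path.index(nxt):] + [nxt]
                if color[nxt] == WHITE:
                    color[nxt] = GREY
                    path.append(nxt)
                    stack.append(iter(graph[nxt]))
                    break
            else:
                # all neighbours handled, this node is done
                color[path.pop()] = BLACK
                stack.pop()
    return None
```

Because every node is coloured at most twice and every edge is followed once,
the running time grows linearly with the number of objects plus dependencies.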

Changing the algorithm is the Right Way(tm), but removing the thousands
of servicedependencies that make services depend on the ping check
on the same host will make the problem a lot smaller. Going from 90000
to 50000 dependencies won't just cut CPU time in half; it will most
likely cut it by a factor of 10, or even 100.

When you do look into the suggestions, start with removing dependencies
as that's by far the cheapest trick (even though it's the least effective).
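
A rough sketch of how the redundant definitions could be located before
removing them. It assumes the ping check is a service whose
service_description is "PING" (adjust to your config), that each
servicedependency names a single host rather than a hostgroup or a
comma-separated list, and that the object config has been read into one
string. The helper names are made up for the example.

```python
import re

PING_NAME = "PING"  # assumption: service_description of the ping check

BLOCK_RE = re.compile(r"define\s+servicedependency\s*\{(.*?)\}", re.DOTALL)

def parse_blocks(text):
    """Yield each servicedependency definition as a dict of directive -> value."""
    for body in BLOCK_RE.findall(text):
        directives = {}
        for line in body.splitlines():
            line = line.split(";", 1)[0].strip()  # drop inline ';' comments
            if not line or line.startswith("#"):
                continue
            parts = line.split(None, 1)
            if len(parts) == 2:
                directives[parts[0]] = parts[1].strip()
        yield directives

def redundant_ping_dependencies(text, ping_name=PING_NAME):
    """Return definitions where a service depends on the ping check of its own host."""
    hits = []
    for d in parse_blocks(text):
        host = d.get("host_name")
        same_host = host is not None and host == d.get("dependent_host_name")
        if same_host and d.get("service_description") == ping_name:
            hits.append(d)
    return hits
```

Printing dependent_host_name and dependent_service_description for each hit
gives a list to review by hand before deleting anything.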

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users