Was: Nagios arch to improve performance Re: Re: Nagios-devel digest, Vol 1 #807 - 8 msgs

Andreas Ericsson ae at op5.se
Mon May 23 14:14:53 CEST 2005


Stanley Hopcroft wrote:
> Dear Folks,
> 
> I am writing to thank you for an interesting and informative thread
> 
> <pretty useless remarks>
> 
> Nagios scales pretty darn well despite the limitations of fork/exec().
> 
> There is one Nag 2.x user with ~ 15k services, whose check latencies 
> hover around 25 seconds (I have asked permission to release the 
> details and will do so if possible) with an active service check profile 
> of
> 

.au governmental organ? It'd be nice to see hardware specs of that server.

> 1 check_ping                                  @ 5 min  intervals
> 2 custom checks of perf data (RAM/cycles/etc (SNMP + RRD ?)
>                                               @ 15 min intervals
> 3 file system checks (SNMP ?)                 @ 60 min intervals
> 
> Since this number of service checks is handled OK (in this isolated 
> knowledgeable-user case) and those checking many more services are going 
> to be under heavy PHB pressure to buy a name brand product, is this 
> worth pursuing ?
> 

Perhaps. The idea of devising a different method of multiplexing isn't 
bad, provided it
a) scales better (obviously)
b) consumes less resources

It's worth noting that resources can't always be traded successfully for 
speed. One out of two of the above isn't bad, so it is worth striving 
for (although not with any great rush, considering the current 
implementation works fairly well).

That said, the current bottleneck in Nagios appears to be the fact that 
it runs checks in chunks rather than as standalone units which can be 
picked up as they become eligible for checking. If that little snag 
could be overcome, I'm confident that the aforementioned average check 
latency of 25 seconds could be done away with.
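
As a minimal sketch of the "standalone units" idea, assuming a binary 
min-heap keyed on each check's next run time: the scheduler pops a check 
the moment it becomes eligible instead of waiting for a chunk. The names 
below (struct check, queue_push, queue_pop_due, MAX_CHECKS) are 
hypothetical and are not taken from the Nagios source.

```c
#include <stddef.h>
#include <time.h>

#define MAX_CHECKS 1024            /* arbitrary capacity for this sketch */

struct check {
    time_t next_run;               /* when the check becomes eligible */
    int    service_id;             /* hypothetical identifier */
};

struct check_queue {
    struct check items[MAX_CHECKS];
    size_t       count;            /* must start at 0 */
};

static void swap_checks(struct check *a, struct check *b)
{
    struct check tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Insert a check; returns 0 on success, -1 if the queue is full. */
int queue_push(struct check_queue *q, struct check c)
{
    size_t i;

    if (q->count >= MAX_CHECKS)
        return -1;

    i = q->count++;
    q->items[i] = c;

    /* Sift up until the parent is not later than the new entry. */
    while (i > 0) {
        size_t parent = (i - 1) / 2;
        if (q->items[parent].next_run <= q->items[i].next_run)
            break;
        swap_checks(&q->items[parent], &q->items[i]);
        i = parent;
    }
    return 0;
}

/* If the earliest check is due at `now`, remove it into *out and
 * return 1; otherwise leave the queue untouched and return 0. */
int queue_pop_due(struct check_queue *q, time_t now, struct check *out)
{
    size_t i = 0;

    if (q->count == 0 || q->items[0].next_run > now)
        return 0;

    *out = q->items[0];
    q->items[0] = q->items[--q->count];

    /* Sift down to restore the heap property. */
    for (;;) {
        size_t left = 2 * i + 1, right = left + 1, smallest = i;

        if (left < q->count &&
            q->items[left].next_run < q->items[smallest].next_run)
            smallest = left;
        if (right < q->count &&
            q->items[right].next_run < q->items[smallest].next_run)
            smallest = right;
        if (smallest == i)
            break;
        swap_checks(&q->items[i], &q->items[smallest]);
        i = smallest;
    }
    return 1;
}
```

A main loop would call queue_pop_due() until it returns 0, run each 
returned check, and push it back with an updated next_run. Both 
operations are O(log n) in the number of queued checks.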

Beyond that, we enter the land of money-for-speed with beefier hardware 
or clustered solutions.


Aside from multiplexing/threading enhancements, there is currently plenty 
of room for software improvements in Nagios. The strip() function 
implementation is particularly horrible, as are most of the functions 
handling macros and string-in-string substitution.
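
For illustration only, a whitespace trimmer can work in place with one 
scan from each end and a single memmove(), with no temporary buffer. This 
is a sketch under that assumption, not the current Nagios code; 
strip_in_place() is a hypothetical name.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: trims leading and trailing whitespace from `s`
 * in place and returns the resulting length. NULL is treated as empty. */
size_t strip_in_place(char *s)
{
    char *start, *end;
    size_t len;

    if (s == NULL)
        return 0;

    /* Skip leading whitespace. */
    start = s;
    while (*start != '\0' && isspace((unsigned char)*start))
        start++;

    /* Walk back from the terminator over trailing whitespace. */
    end = start + strlen(start);
    while (end > start && isspace((unsigned char)end[-1]))
        end--;

    /* Shift the kept region to the front only if something was skipped. */
    len = (size_t)(end - start);
    if (start != s)
        memmove(s, start, len);
    s[len] = '\0';

    return len;
}
```

Returning the new length lets callers avoid a second strlen() on the 
result.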

http://people.redhat.com/drepper/optimtut1.ps gives several detailed 
tips on optimizing, along with some very neat tricks for helping the 
compiler optimize at its best.


> That said, if my reference implementation of all sorts of useful 
> techniques adds a few more, Good.
> 
> </pretty useless remarks>
> 
> Yours sincerely.
> 

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer

