novel idea for performance optimization

Andreas Ericsson ae at op5.se
Sun May 8 23:34:09 CEST 2005


sean finney wrote:
> hey folks,
> 
> (cross-posting to nagiosplug mailing list too)
> 

I'm cutting nagiosplug-devel out, since there are some issues with this 
that need to be considered in-core first.

> one of the most notable performance hits on a nagios server is the huge
> overhead from having to fork/exec a separate binary for every check (and
> many of these binaries go on to fork/exec or popen yet another system
> call).  to the passing pessimist, this might be considered an unavoidable
> hit because of how nagios works (with the plugins being a bunch of
> separate executable binaries).  however, i think i see a rather novel
> way around the problem.
> 
> with a little ld linker voodoo, you could build a shared library version
> of each plugin, where main() was renamed something identifiable to the
> plugins.  
> 

Good idea, except that ld linker voodoo (symbol resolution et al) 
induces as much overhead or more on systems with copy-on-write fork 
(Linux, BSD, Solaris) and reasonably quick context switching (Linux, 
BSD). So the people who actually suffer are those running Nagios on HP 
and Cygwin. Not a great many, I presume.

> then in nagios, when it goes to execute a plugin, it first would check
> to see if such a shared object existed.  if it doesn't, it executes the
> binary just like it normally would, and if it does, it instead
> dlopen()'s the library, and calls the function.  repeated calls to
> the check would therefore have very little overhead, as not only would
> the fork/exec be avoided, but the library would already be loaded and
> resident in memory.
> 

This is moot. Any operating system worth its salt caches frequently 
accessed programs, so the code is already in memory anyway.
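
For reference, the dlopen() path under discussion would look roughly 
like the sketch below. It is only an illustration under assumptions: 
`plugin_main` is a made-up entry-point name (the proposal only says 
main() would be renamed to "something identifiable"), and 
`run_plugin_so` is a hypothetical helper, not anything in Nagios.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Assumed name for the renamed main(); not defined anywhere in the thread. */
#define PLUGIN_ENTRY "plugin_main"

typedef int (*plugin_main_fn)(int argc, char **argv);

/*
 * Try to run a plugin from a shared object.
 * Returns the plugin's return code, or -1 if the object or its entry
 * point is missing, in which case the caller falls back to fork/exec.
 */
int run_plugin_so(const char *so_path, int argc, char **argv)
{
    assert(so_path != NULL);

    void *handle = dlopen(so_path, RTLD_NOW | RTLD_LOCAL);
    if (handle == NULL)
        return -1;                  /* no shared object: use the binary */

    /* POSIX-recommended way to store dlsym()'s result in a function pointer */
    plugin_main_fn entry;
    *(void **)(&entry) = dlsym(handle, PLUGIN_ENTRY);
    if (entry == NULL) {
        dlclose(handle);
        return -1;
    }

    /*
     * The plugin now runs inside the caller's process: if it calls exit()
     * or crashes, it takes the whole daemon with it.
     */
    int rc = entry(argc, argv);

    /* The handle is left open on purpose so the library stays loaded. */
    return rc;
}
```

Older glibc versions need -ldl at link time for this to build.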

> the problems i see with this are that many of the plugins have less
> than stellar memory management, and some might need some work to be
> thread safe (and some of course are not executable binaries at all, but
> instead perl/shell/whatever scripts).
> 

They would also need some code that splits arguments the way they are 
supposed to be split, along with some other additional stuff.

> so, what do you think?
> 

I think a better approach would be to fire up a dozen or so 
command-running threads which pick checks off a queue, run them, and 
write the results back to an equally simple but efficient "parse-me" 
queue. The trouble here is that misbehaving plugins could actually hang 
Nagios (signal handling and threads just don't work in this scenario) 
if they are written poorly enough. This could be worked around fairly 
easily by calling fileno() on the FILE pointer returned by popen() and 
doing a select() on the resulting descriptor.

Come to think of it, it should actually be possible to multiplex a 
fairly large number of checks from a single thread using fileno(3) and 
select(2). This would get rid of the locking problems of a 
multi-threaded approach and would probably be highly efficient, since 
the fork()'ed children just sit around and wait all the time anyway.

I'll experiment a bit with this and see where it leads.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer





