RFC/RFP Nagios command workers

Andreas Ericsson ae at op5.se
Wed Jun 29 10:50:48 CEST 2011


On 06/28/2011 05:13 PM, Matthieu Kermagoret wrote:
> Hi list,
> 
> First of all, sorry for the delayed response, last month was pretty
> crazy at work :-p
> 
> On Mon, May 23, 2011 at 12:38 PM, Andreas Ericsson <ae at op5.se> wrote:
>> On 05/23/2011 11:37 AM, Matthieu Kermagoret wrote:
>> Because shipping an official module that does it would mean not only
>> supporting the old complexity, but also the new one. Having a single
>> default system for running checks would definitely be preferable to
>> supporting multiple ones.
>>
> 
> I agree with you when you say that a single system is better than two.
> However, I fear that the worker system would need much more code than a
> simpler system (and less code usually means fewer bugs) and that the
> worker system would destabilize Nagios.

Quite the opposite, really. The amount of backflips we're doing right
now to make sure the core is thread-safe is huge, so it's likely this
patch will even reduce the LoC count in Nagios.

> For years it's been Nagios'
> development team's policy not to include features that could be
> written as modules. I liked it that way.
> 

Everything can be written as modules. The worker process thing will have
the nice side effect that modules can register sockets that core Nagios
will listen to for events, with a special callback when there's data
available on the socket. This reduces the complexity of a lot of modules
by a fair bit. With worker processes instead of multiple threads it's
also trivial to keep modules thread-safe, and potential leaks in worker
modules (such as embedded Perl) can be ignored, since we can just kill
the worker process and spawn a new one once it's done some arbitrary
number of checks. This is how Apache handles leaky modules, and we could
do far worse than using the world's most popular webserver as an
example.
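
To make the recycling idea concrete, here's a rough and untested
sketch. MAX_CHECKS_PER_WORKER and run_one_check() are made-up names
for illustration, not anything that exists in Nagios today:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

#define MAX_CHECKS_PER_WORKER 1000  /* arbitrary recycle threshold */

/* Placeholder: a real worker would read a job from the master's
 * socket, execute the check and write the result back. */
static void run_one_check(void)
{
}

static void worker_loop(void)
{
    int checks_done;

    for (checks_done = 0; checks_done < MAX_CHECKS_PER_WORKER; checks_done++)
        run_one_check();

    /* exit voluntarily; anything leaked by embedded interpreters
     * or sloppy modules dies with the process */
    _exit(EXIT_SUCCESS);
}

int main(void)
{
    for (;;) {
        pid_t pid = fork();

        if (pid < 0) {
            perror("fork");
            return EXIT_FAILURE;
        }
        if (pid == 0)
            worker_loop();      /* child: never returns */

        waitpid(pid, NULL, 0);  /* master: reap, loop respawns */
    }
}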

There's also another thing. Mozilla Firefox has been accused of feature
stagnation in the core since they let addon writers handle adding new
features, and far from everybody uses modules. Google Chrome has taken
a fair share of users from Firefox lately, partly because it implements
some of the more popular modules directly in-core. Nagios has also been
accused of feature stagnation, even though broker module development
has flourished in recent years (Nagios with modules is nothing like the
old nagios without them), so it makes sense to add certain selected
module capabilities to the core.

>>> 1) Remove the multiple fork system to execute a command. The Nagios
>>> Core process forks directly the process that will exec the command
>>> (more or less sh's parsing of command line, don't really know if this
>>> could/should be integrated in the Core).
>>>
>>
>> This really can't be done without using multiple threads since the
>> core can't wait() for input and children while at the same time
>> issuing select() calls to multiplex the new output of currently
>> running checks.
>>
> 
> What about a signal handler on SIGCHLD that would wait() terminated
> processes and a select() on pipe FDs connected to child processes, with
> a timeout to kill non-responding checks?
> 

Highly impractical for short-lived children and with so many pipes to
listen to. It would mean we'd be iterating over the entire child list
several hundred times per second just to read new output. We're forced
to do that, since pipes can't hold an infinite amount of data: the
child's write() call will block when the pipe is full, and the children
won't exit while waiting to write. Doing so many select() calls means
the scheduler will suffer greatly, along with modules that wish to run
code in the main thread every now and then.

With sockets, we can let each worker handle a smaller number of checks
at a time, and since they have no scheduling responsibilities the
master process is free to just await new input.
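
Something along these lines (a sketch only; worker_fds[],
handle_worker_input() and NUM_WORKERS are illustrative placeholders,
not actual Nagios code):

#include <unistd.h>
#include <sys/select.h>

#define NUM_WORKERS 4

static int worker_fds[NUM_WORKERS]; /* connected worker sockets */

/* Placeholder: parse whatever results the worker sent and
 * update the scheduler's books. */
static void handle_worker_input(int fd)
{
    char buf[4096];

    if (read(fd, buf, sizeof(buf)) <= 0)
        return; /* worker died or sent nothing useful */
}

void master_event_loop(void)
{
    for (;;) {
        fd_set rfds;
        int i, maxfd = -1;

        FD_ZERO(&rfds);
        for (i = 0; i < NUM_WORKERS; i++) {
            FD_SET(worker_fds[i], &rfds);
            if (worker_fds[i] > maxfd)
                maxfd = worker_fds[i];
        }

        /* sleep until some worker has results for us; no busy
         * iteration over hundreds of per-check pipes */
        if (select(maxfd + 1, &rfds, NULL, NULL, NULL) < 0)
            continue; /* EINTR and friends */

        for (i = 0; i < NUM_WORKERS; i++)
            if (FD_ISSET(worker_fds[i], &rfds))
                handle_worker_input(worker_fds[i]);
    }
}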

>>> 2) The root process and the subprocess are connected with a pipe() so
>>> that the command output can be fetched by reading the pipe. Nagios
>>> will maintain a list of currently running commands.
>>>
>>
>> Pipes are limited in that they only guarantee 512 bytes of atomic
>> writes and reads. TCP sockets don't have this problem. There's also
> 
> It is my understanding of POSIX that the core standard defines a
> 512-byte minimal limit for atomic I/O operations but I cannot find any
> section enforcing atomicity on I/O operations on TCP sockets, so pipes
> would be better indeed. Were you referring to the XSI Streams or could
> you point me to the appropriate section?
> 

No. TCP sockets don't enforce atomicity beyond the 512 bytes already
specified, but they do enforce ordering, which pipes don't. This is
actually a real problem (although an unusual one) when several processes
try to write data to Nagios' command pipe and one of them writes
more than the atomic limit on whatever system it's being written on.
The fact that pipes use fixed-size kernel buffers (requiring a full
kernel recompile to change) while a program can change the size of its
socket buffers with a simple system call makes sockets an even bigger
winner.
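
For reference, growing a socket's buffers really is just one system
call per direction; the 256 KiB figure below is an arbitrary example,
and the kernel may clamp it to its configured limits:

#include <stdio.h>
#include <sys/socket.h>

int grow_socket_buffers(int sock)
{
    int size = 256 * 1024; /* example value; kernel may clamp it */

    if (setsockopt(sock, SOL_SOCKET, SO_SNDBUF, &size, sizeof(size)) < 0 ||
        setsockopt(sock, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) < 0) {
        perror("setsockopt");
        return -1;
    }
    return 0;
}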

>> the fact that a lot of modules already use sockets, so we can get
>> rid of a lot of code in those modules and let them re-use Nagios'
>> main select() loop and get inbound events on "their" sockets as a
>> broker callback event. Much neater that way.
>>
> 
> A pretty API would definitely be great, no doubt.
> 
>>> 3) The event loop will multiplex processes' I/O and process them as necessary.
>>>
>>
>> That's what the worker processes will do and then feed the results
>> back to the nagios core through the sequential socket, which will
>> guarantee read and write operations large enough to never truncate
>> any of the data necessary for the master process to do proper book-
>> keeping.
>>
> 
> I'm not very fond of the "large buffer" approach because I'm not
> really sure that I/O operations are atomic (see above).
> 

They needn't be atomic so long as they don't stall the writer or the
reader and maintain ordering. TCP ensures ordering, and select() or
ioctl() calls ensure non-stalling behaviour. UDP sockets would have
more or less the same problems as pipes, but TCP avoids them completely
at the expense of slightly more overhead. For local TCP operations the
overhead is comparable to one set operation and one if() statement
inside the kernel code, and doesn't even require a context switch.
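
To make that concrete, here's an untested sketch of how a reader can
reassemble arbitrarily large, length-prefixed messages from an ordered
stream; read_all() and read_message() are made-up names:

#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>

/* Read exactly len bytes. Ordering makes this safe even when the
 * sender's writes were chopped up along the way. */
static int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read one length-prefixed message into buf. Returns the message
 * length, or -1 on error or if the message won't fit. */
int read_message(int fd, char *buf, size_t bufsz)
{
    uint32_t netlen;

    if (read_all(fd, &netlen, sizeof(netlen)) < 0)
        return -1;
    netlen = ntohl(netlen);
    if (netlen > bufsz)
        return -1;
    if (read_all(fd, buf, netlen) < 0)
        return -1;
    return (int)netlen;
}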

>>> The worker system could still be implemented and used only by users
>>> who need it (but that's what DNX and mod_gearman do). I believe it is
>>> better to leave the default command execution system as simple as it
>>> is right now (but improve it) and leave distribution algorithms to
>>> modules. I can imagine multiple reasons for which one would want to
>>> distribute checks among workers:
>>>
>>
>> The only direction we can improve it is to remove it and rebuild it.
>> Removing one fork() call would mean implementing the multiplexer
>> anyway, and once that's done we're 95% done with the worker process
>> code anyways.
>>
> 
> I guess that you shouldn't be far from a complete worker system but
> (again :-) ) writing it as a module wouldn't be difficult either.
> 

Not very, no. I've got some proof-of-concept code that works pretty
well. One of the best things about the worker-process approach is that
one can run modules or even external programs to execute the checks
in place of in-core code. That will serve as an excellent testing
ground for new algorithms without requiring huge patches to Nagios.

I digress, but letting modules handle core functionality
would IMO be a very good idea indeed. That way we can let hackers
test new core functionality as modules, and the most efficient one
eventually gets implemented in the core to become the default. Organic
software development with all the benefits of evolution. Since each
subsystem should be more or less self-contained and dependent only
on a few core APIs anyway, there's no reason replacing one should
even be difficult.

>>> So in fact you plan on removing the old FIFO and doing everything
>>> through the socket? What about acknowledgements or downtimes? Could
>>> they be sent through the socket too or would there be another system?
>>>
>>
>> They could be sent through the socket, but for a while at least we'll
>> have to support both pipe and socket, so addon developers have some
>> time to adjust to the new world order.
>>
> 
> All right, thanks for the explanations. A status returned by Nagios
> after any external command execution would be a nice feature indeed.
> 

Yes, it's one of the most liked features of mk-livestatus.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
