Make sockets non-blocking

Andreas Ericsson ae at op5.se
Wed May 5 22:54:48 CEST 2010


On 05/05/2010 03:13 PM, Stephen Gran wrote:
> On Sat, May 01, 2010 at 05:03:59PM +0100, Stephen Gran said:
>> Hi there,
>>
>> We use NDO for network communication with a custom bit of perl to pass
>> status updates around.  Recently, we've seen that a network flap can
>> make ndo hang the entire nagios process, which is possibly imperfect :)
>>
>> I think I've tracked it down to the write() call in io.c when sending
>> the actual update to the remote server.  The attached patch is a
>> relatively naive attempt at making this write nonblocking for network
>> sockets.
>>
>> This is a patch against the CVS - if you prefer a git-am style patch,
>> that's fine.  I tried to clone the git off of sourceforge this morning,
>> and got an empty repo.  If there's a better place to clone from, let me
>> know and I'll fix it up for that.
> 
> So, it turned out my initial attempt to keep the patch small had some
> limitations.  Working patch attached.
> 
> To recap, the main problem is that I/O operations are blocking.  This is
> less important for local files or Unix sockets, but can block the main
> nagios process when the I/O operations are TCP-based.
> 

It will block on Unix sockets too, if the reader goes to lunch and the
socket buffer fills up.
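
To illustrate the point, here is a minimal sketch (not NDO code) that
fills a Unix socket pair whose reading end never reads. The helper names
set_nonblocking() and fill_until_full() are made up for this example.
With O_NONBLOCK set, write() eventually fails with EAGAIN; without it,
the same write() would simply hang.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: add O_NONBLOCK to fd. Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical demo: write into a Unix socket whose peer never reads.
 * Returns the number of bytes queued before write() reported EAGAIN,
 * or -1 on any other failure.
 */
static long fill_until_full(void)
{
    int sv[2];
    char chunk[4096];
    long total = 0;
    int saved_errno = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (set_nonblocking(sv[0]) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    memset(chunk, 'x', sizeof(chunk));

    for (;;) {
        ssize_t n = write(sv[0], chunk, sizeof(chunk));

        if (n > 0) {
            total += n;          /* only successful writes are counted */
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;            /* interrupted by a signal, retry */
        saved_errno = (n < 0) ? errno : 0;
        break;
    }

    close(sv[0]);
    close(sv[1]);

    /* A full socket buffer shows up as EAGAIN (or EWOULDBLOCK). */
    if (saved_errno == EAGAIN || saved_errno == EWOULDBLOCK)
        return total;
    return -1;
}
```

The byte count returned depends on the kernel's socket buffer settings,
so treat it as informational only.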

> My first attempt merely marked the socket as non-blocking, and added
> the optional return code to the list handled in the error path.  What I
> found during testing was that this had a few problems.
> 
> First, the error path adds the return of write() to tbytes.  If write
> returns -1, tbytes was being decremented, resulting in an infinite loop
> because the loop termination condition became unsatisfiable.  Even when
> correcting the return to 0 before addition, there was still no loop
> termination condition when write could not succeed.  I've hackishly
> corrected this with a hardcoded maximum number of loops.
> 
> To make things a little nicer, we don't even want to enter the write()
> loop if we know we can't write().  We do this with a zero second select()
> to check if we can write before entering the loop.  This is admittedly
> racy, but I'd frankly rather return early than block the nagios main
> process.
> 

Hmm. I've solved this exact problem in Merlin with a 100 millisecond
timeout: if the socket isn't writable within that time, I just fall back
to a binary backlog which stashes events for me until I need them.

The binlog api is ridiculously simple and very easy to work with, and
it's separated to its own source + header file. You might want to grab
that instead of hacking around with blocking calls and possibly partial
writes inside the module (which the reader then has to deal with).

> Back to socket creation.  We could mark the socket as non-blocking
> after connect() returns, so that we know that we have a valid fd before
> carrying on.  The problem with this is that the default connect() timeout
> on the Redhat 5 machine I tested this on is 3 minutes.  That is, in my
> opinion, again too long a time to block the main nagios process for.
> 

Definitely. You want to set it to non-blocking, fire off the connect()
and then check if it's writable to see if the connection succeeded. The
polling can be done later, in a scheduled event of its own: connect()
returns immediately on a non-blocking socket, so if you poll right away
the request most likely won't even have had time to reach the remote
end.

> What I've done instead is to mark the socket as non-blocking before
> calling connect().  In the connect() routine, if connect() sets errno
> to EINPROGRESS, we select() on the socket for 15 seconds to see if it
> succeeds.  If it does not, we enter the normal error path for connect()
> failures.
> 

15 seconds sounds like a lot imo.

> Arguably my choices of 10 loops for termination in write() and 15
> seconds for connect() are not right for everyone.  They could be moved
> to configuration options, or they could be taken from existing options,
> or something.  At this point, it Works For Me(TM), which is good enough
> for now.  If people have any specific objections, I would of course
> be happy to tidy it up for them.
> 

The connection stuff is totally bearable, but you could easily reduce it
to 2 seconds and still have it Work For You without the massive amounts
of latency it would mean for people who have forgotten to start the
remote end.

The write stuff seems iffier. Although it'll probably work just fine in
practice, I get the feeling that bugs in that area will be ridiculously
hard to track down.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
