Possible bug in NSCA

Chris Wilson chris at aidworld.org
Fri Sep 23 12:01:10 CEST 2005


Hi Andreas,

> > I have no idea what caused the first error (no child processes), but the
> > result seems inappropriate. It appears that nsca handles this error as
> > follows, in accept_connection():
> > 
> > 
> >>        /* wait for a connection request */
> >>        while(1){
> >>                new_sd=accept(sock,0,0);
> >>                ...
> >>        }
> >>
> >>        if(new_sd<0){
> >>                ...
> >>                syslog(LOG_ERR,"Network server accept failure (%d: %s)",errno,strerror(errno));
> >>
> >>                /* close socket prior to exiting */
> >>                close(sock);
> >>                return;
> >>                }
> > 
> > 
> > But nsca does not exit: accept_connection is called in an infinite loop,
> > and keeps trying to accept() on a socket that's now closed. 
> > 
> > This seems to be bad behaviour, but I'm not sure what the correct
> > behaviour would be. Any ideas?
> 
> After
> new_sd = accept(sock, 0, 0)
> you should add
> if(new_sd == -1 && errno == EBADF) {
> 	sock = setup_socket();
> }
> 
> Where setup_socket() is an imaginary function that calls socket(), 
> possibly setsockopt(), bind() and listen(), in that order.
> 
> A cleaner solution is to have nsca exit if it can't obtain the socket, 
> since there's no real reason to think it should be able to obtain one later.

Thanks for your comments. Your first solution looks like it might be the
best one, but also the most complex to implement, and I don't feel that
I know nsca well enough to do that.
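
For reference, here is a rough sketch of what such a setup_socket()
might look like in plain C, outside of nsca. The port parameter, the
backlog of 5 and the use of SO_REUSEADDR are my own assumptions; nsca's
real setup code may well differ.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of nsca: create a TCP socket listening
 * on the given port. Returns the descriptor, or -1 with errno set. */
int setup_socket(uint16_t port)
{
    struct sockaddr_in addr;
    int on = 1;
    int saved;

    int sd = socket(AF_INET, SOCK_STREAM, 0);
    if (sd < 0)
        return -1;

    /* assumption: allow the port to be rebound quickly after a restart */
    if (setsockopt(sd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto fail;
    if (listen(sd, 5) < 0)
        goto fail;
    return sd;

fail:
    /* close() may overwrite errno, so preserve the original cause */
    saved = errno;
    close(sd);
    errno = saved;
    return -1;
}
```

The caller would still have to decide what to do when this returns -1,
which is where the "exit or keep trying" question comes back.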

The second solution is not ideal for me. I think that exiting nsca is
perhaps an overreaction to a failure to accept a single network
connection. True, under some circumstances it might never be possible to
accept one, but judging from what happened to me, nsca would then be
prone to dying at random.

In my view, nsca did get a listening socket that should keep working
(unless errno is EBADF, which is nsca's own fault for closing the
socket). Replacing close(sock) with sleep(1) would be more robust, since
the next call to accept() would then retry on a socket that is still
open, but I'm not sure that it works for every code path through nsca.
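
To make the idea concrete, here is a minimal sketch of an accept loop
that leaves the listening socket open and sleeps before retrying. It is
not nsca's accept_connection(); the function name accept_retry, the
max_attempts parameter and the choice of which errno values count as
fatal are all my own assumptions.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: wait for one connection on listen_sd, leaving
 * listen_sd open whatever happens. Returns the connected descriptor,
 * or -1 with errno set to the last accept() error. */
int accept_retry(int listen_sd, int max_attempts)
{
    int saved = EAGAIN;   /* reported if max_attempts <= 0 */
    int attempt;

    for (attempt = 0; attempt < max_attempts; attempt++) {
        int new_sd = accept(listen_sd, NULL, NULL);
        if (new_sd >= 0)
            return new_sd;

        saved = errno;

        /* interrupted by a signal: just try again */
        if (saved == EINTR)
            continue;

        /* the descriptor itself is unusable: retrying cannot help */
        if (saved == EBADF || saved == ENOTSOCK || saved == EINVAL)
            break;

        /* assumption: anything else is transient, so back off and
         * retry instead of closing the listening socket */
        sleep(1);
    }

    errno = saved;
    return -1;
}
```

With something like this, only a genuinely broken descriptor ends the
loop early, and the caller can then choose between exiting and setting
up a new socket.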

Cheers, Chris.
-- 
(aidworld) chris wilson | chief engineer (chris at aidworld.org)


