Bad File Descriptors

Mohr James james.mohr at elaxy.com
Wed Jul 23 15:09:21 CEST 2008


> -----Original Message-----
> From: nagios-users-bounces at lists.sourceforge.net 
> [mailto:nagios-users-bounces at lists.sourceforge.net] On 
> Behalf Of Ryan Steele
> Sent: Wednesday, 23 July 2008 15:04
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Bad File Descriptors
> 
> Hey folks.
> 
> I recently had a co-worker present me with a problem 
> regarding the NSCA plugin.  It seems that under certain 
> circumstances (unfortunately, those circumstances are unknown 
> to him and thus to me as well), NSCA just hangs (an strace 
> shows it sitting essentially idle) and errors like this one 
> start flooding the daemon log:
> 
> 
> nsca[28640]: Network server accept failure (9: Bad file descriptor)
> 
> 
> The quick fix is to restart NSCA, and then everything hums 
> along until the next incident. 
> 
> 
> It's possible there's a bad block on the disk or something, and an
> fsck might yield some clues, but I haven't had the chance to schedule
> downtime to do that yet.  It's also possible it's hitting the fd
> limit, but in the time I've been monitoring it, I don't see any
> leaking of fd's that would point to that as a suspect (the limit is
> the default of 1024).  Additionally, according to ulimit, the pipe
> size is 4k, which could be an issue as the nsca clients write to a
> pipe on the server (nagios.cmd), but that's only an option
> configurable at kernel compile-time and I expect I'd see more
> widespread reports of problems from other folks in the community if
> overflowing the default pipe buffer was really the issue.
> 
> I've seen some sparse reports on Google of a similar problem, but 
> they're just that: sparse.  That kind of makes me think it's not 
> Nagios or NSCA, but a bad block on the hard drive.  Has anybody had a 
> similar experience, or have an opinion?
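
A note on the log line itself: errno 9 (EBADF) means the descriptor handed to accept() is not a valid open descriptor. Running into the per-process fd limit is reported differently, as EMFILE ("Too many open files"). The sketch below only reproduces that distinction in isolation; it says nothing about why NSCA's listening descriptor would go bad. The function name is made up for illustration, and closing the descriptor by hand is just a stand-in for whatever invalidates it in the real daemon.

```python
import errno
import os
import socket


def accept_on_closed_listener():
    """Call accept() on a listening socket whose descriptor was closed elsewhere.

    Returns the errno raised by accept(), or None if no error occurred.
    Hypothetical helper for illustration only; not part of NSCA or Nagios.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)

    # Assumption: simulate the descriptor being closed behind the server's back.
    os.close(listener.fileno())

    try:
        listener.accept()
    except OSError as exc:
        return exc.errno
    finally:
        # The descriptor is already closed; detach so the socket object
        # does not try to close the same fd number again later.
        listener.detach()
    return None


if __name__ == "__main__":
    code = accept_on_closed_listener()
    print(code, errno.errorcode.get(code), os.strerror(code))
```

If the daemon keeps calling accept() on a descriptor in this state, every call fails immediately with the same error, which would fit a log flooded with identical messages.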

We had similar problems when there were a lot of passive services, and it seemed that NSCA was simply getting overloaded. To be honest, I am not sure that we were getting "Bad file descriptor" errors, but NSCA would not accept any new connections and the solution was to restart it.
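
To check the fd-limit theory before the next hang, one option is to sample how many descriptors the daemon holds and compare that with its limit. This is a minimal sketch, assuming Linux with /proc mounted and a kernel that exposes /proc/<pid>/limits; the function names and the proc_root parameter are illustrative and not part of NSCA or Nagios.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Return the number of descriptors currently open in process `pid`."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def soft_fd_limit(pid, proc_root="/proc"):
    """Return the soft 'Max open files' limit of `pid`, or None if unlimited."""
    label = "Max open files"
    with open(os.path.join(proc_root, str(pid), "limits")) as limits:
        for line in limits:
            if line.startswith(label):
                # Columns after the label: soft limit, hard limit, units.
                soft = line[len(label):].split()[0]
                return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid, proc_root="/proc"):
    """Return (open_fds, soft_limit, ratio); ratio is None without a limit."""
    used = count_open_fds(pid, proc_root)
    limit = soft_fd_limit(pid, proc_root)
    ratio = used / limit if limit else None
    return used, limit, ratio


if __name__ == "__main__":
    # Inspect this script's own process as a harmless demonstration.
    print(fd_usage(os.getpid()))
```

Run periodically against the nsca pid (as root or as the user the daemon runs as), a ratio that creeps toward 1.0 before a hang would support the fd-limit theory, while a flat count would argue against it.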

Regards,

Jim Mohr 

