Nagios processes hanging around for a long time

Nelson, Ben bnelson at rightnow.com
Mon Dec 30 23:37:22 CET 2002


Actually it is all checks.  But I think I figured out the problem.  All of
these processes were waiting to write to the file descriptor that the master
Nagios process reads for status.  I think some buffer was filling up, and
the child processes had to wait for the master process to clean out the
FIFO before they could write to it.  By adjusting
'service_reaper_frequency', I was able to get rid of the problem.
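
[Editor's illustration] The blocking behaviour described above can be
reproduced with an ordinary pipe.  This is a minimal sketch, not Nagios code:
the helper names fill_pipe and drain_pipe are hypothetical, and the 504-byte
message size is taken from the strace output quoted further down.  It assumes
a POSIX system.  The writer is put in non-blocking mode only so that "would
hang" shows up as an exception that can be counted.

```python
import os

MESSAGE_SIZE = 504  # size of the write() seen in the strace output


def fill_pipe(write_fd, message):
    """Write messages until the pipe is full; return how many fit."""
    # Non-blocking mode turns "would block" into BlockingIOError.
    os.set_blocking(write_fd, False)
    count = 0
    while True:
        try:
            os.write(write_fd, message)
        except BlockingIOError:
            # A blocking writer would be stuck in write() at this point.
            return count
        count += 1


def drain_pipe(read_fd):
    """Read everything currently queued; return the number of bytes read."""
    os.set_blocking(read_fd, False)
    total = 0
    while True:
        try:
            chunk = os.read(read_fd, 65536)
        except BlockingIOError:
            return total  # pipe is empty
        if not chunk:
            return total  # all writers closed
        total += len(chunk)


read_fd, write_fd = os.pipe()
message = b"servername".ljust(MESSAGE_SIZE, b"\0")

queued = fill_pipe(write_fd, message)
print("messages that fit before the pipe filled:", queued)

drained = drain_pipe(read_fd)
print("bytes drained by the reader:", drained)

# Once the reader has emptied the pipe, the writer can proceed again.
written_after = os.write(write_fd, message)
print("bytes written after draining:", written_after)
```

The number of messages that fit depends on the kernel's pipe buffer size, so
no particular count should be expected.  The point is only that a writer
stalls once the reader falls behind, and resumes as soon as the reader drains
the pipe.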
 
Thanks for all of the help from everyone though.
 
--Ben

-----Original Message-----
From: Carroll, Jim P [Contractor] [mailto:jcarro10 at sprintspectrum.com]
Sent: Monday, December 30, 2002 3:11 PM
To: Nelson, Ben; 'nagios-users at lists.sourceforge.net'
Subject: RE: [Nagios-users] Nagios processes hanging around for a long time


The only thing which springs to mind is a misconfigured resolv.conf, causing
long delays in resolving hostnames.
 
Do you have this problem when you execute a plugin from the command line?
Does this happen with all plugins, or just network-related ones?
 
jc

-----Original Message-----
From: Nelson, Ben [mailto:bnelson at rightnow.com]
Sent: Monday, December 30, 2002 1:46 PM
To: 'nagios-users at lists.sourceforge.net'
Subject: [Nagios-users] Nagios processes hanging around for a long time



I am currently running Nagios 1.0 and I always seem to have a lot of nagios
processes hanging around after they have done their checks.  As a result,
many service checks take a long time to complete (4-5 minutes in some
cases).

To try and debug this I attached strace to several of them, and got very
similar output every time.  Here is a sample: 
write(7, "servername\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
504) = 504 
------------------ insert long pause here ---------------------------- 
close(7)                                = 0 
_exit(0)                                = ? 

All of the processes that take a long time to complete appear to be
waiting on I/O somewhere, since the write call takes quite a while to
complete.  I can't figure out which file they are writing to here.
Does anybody know?
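
[Editor's illustration] On Linux, one way to see what a descriptor such as
fd 7 refers to is to read the symlink under /proc/<pid>/fd.  This is a
minimal sketch that assumes a Linux system with /proc mounted and permission
to inspect the target process.  The helper name describe_fd is hypothetical,
and the demonstration inspects a pipe opened by the script itself rather than
a real Nagios process.

```python
import os


def describe_fd(pid, fd):
    """Return what /proc/<pid>/fd/<fd> points at (Linux only)."""
    return os.readlink(f"/proc/{pid}/fd/{fd}")


# Demonstration on our own process: open a pipe and look up its write end.
read_fd, write_fd = os.pipe()
target = describe_fd(os.getpid(), write_fd)
print(f"fd {write_fd} -> {target}")
```

For a regular file the link target is its path.  For an anonymous pipe it is
a "pipe:" label followed by an inode number rather than a filename.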

--Ben 

============================================== 
Ben Nelson 
Sr. Systems Administrator, Hosting 
RightNow Technologies (www.rightnow.com) 
A Better Way to Serve Your Customers 
