BIG problem: Nagios Processes and OCSP

Russell Scibetti russell at quadrix.com
Tue Nov 12 18:34:59 CET 2002


I am having a very unusual problem with Nagios.  I am running a few 
instances of Nagios 1.0b6 on a Red Hat Linux 7.2 box, one of which has 
more than 700 services, most of which are polled every 5 minutes.  

After a very short amount of time, there are hundreds of nagios processes 
hanging around for this instance.  I know that you are supposed to see 
some nagios processes because of the plugin forking, but this is well 
beyond what there should be: somewhere in the 200-300 range.  The box 
starts dipping into swap and nagios doesn't get all of the plugins done 
in time, even though the load on the box is still minimal.  Soon the 
plugins are almost half an hour behind schedule.
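
For anyone trying to reproduce this, a minimal sketch of how the process 
count could be sampled.  It assumes a Linux /proc filesystem and a 
Python 3 interpreter, neither of which is part of the setup described 
above.  The function names are made up for illustration, and it counts 
every process named "nagios" on the box, not just those of one instance.

```python
import os


def count_by_name(names, target="nagios"):
    """Count how many entries in `names` are exactly `target`."""
    return sum(1 for name in names if name == target)


def read_process_names(proc_dir="/proc"):
    """Yield the name of each running process (assumes Linux /proc)."""
    if not os.path.isdir(proc_dir):
        return  # not a Linux-style system; nothing to report
    for entry in os.listdir(proc_dir):
        if not entry.isdigit():
            continue  # only numeric directories are process IDs
        try:
            with open(os.path.join(proc_dir, entry, "status")) as fh:
                first_line = fh.readline()  # e.g. "Name:\tnagios\n"
        except OSError:
            continue  # process exited between listdir() and open()
        if first_line.startswith("Name:"):
            yield first_line.split(":", 1)[1].strip()


if __name__ == "__main__":
    # Print the current number of processes named "nagios".
    print(count_by_name(read_process_names()))
```

Run from cron or a shell loop, the printed number shows whether the 
count keeps climbing or levels off.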

Now here is the confusing part.  I turned on obsess_over_services for 
the instance and created a command that just logs the results of every 
check to a file.  I wanted to see in more detail what was 
happening.  I stopped and started the instance, and now the problem is 
gone.  All the checks are completed at the right time, there is plenty 
of free memory (no swapping) and the process count stays very low.
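
A logging command of the kind described could look roughly like the 
sketch below.  This is not the actual command used here, only an 
illustration under assumptions: the script, the log path and the 
argument order are all hypothetical.  It would be hooked in through the 
obsess_over_services and ocsp_command directives in the main config 
file, with the four arguments filled in from Nagios macros in the 
command definition (the macro names depend on the Nagios version and 
are not shown).

```python
import sys
import time

LOG_PATH = "/tmp/ocsp_results.log"  # hypothetical location, adjust as needed


def format_result(host, service, state, output, now=None):
    """Build one tab-separated log line for a single service check."""
    timestamp = int(time.time() if now is None else now)
    # Keep each result on one line so the log stays easy to grep.
    clean = output.strip().replace("\t", " ").replace("\n", " ")
    return "\t".join([str(timestamp), host, service, state, clean])


def append_result(path, line):
    """Append one line to the log file, creating the file if needed."""
    with open(path, "a") as fh:
        fh.write(line + "\n")


def main(argv):
    # Expected (assumed) arguments: host, service, state, plugin output.
    if len(argv) != 5:
        sys.stderr.write("usage: log_result HOST SERVICE STATE OUTPUT\n")
        return
    append_result(LOG_PATH, format_result(*argv[1:]))


if __name__ == "__main__":
    main(sys.argv)
```

The script does nothing except append a line and exit, so it should add 
very little work per check.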

Is this some form of bug?  What is special about the obsess_over_services 
setting that gets rid of the problem?  If anyone has any idea what could 
be going on, please respond.  Thanks.

-Russell Scibetti

-- 
Russell Scibetti
Quadrix Solutions, Inc.
http://www.quadrix.com
(732) 235-2335, ext. 7038



