Problems with many hanging Nagios processes (Nagios spawning rogue nagios processes eventually crashing Nagios server)

linux-system-technik at de.man-mn.com linux-system-technik at de.man-mn.com
Mon Nov 28 15:43:39 CET 2005


Hi everybody,

unfortunately nobody replied to Alex from viveconsulting.co.nz, who reported
a problem with "Nagios spawning rogue ..." on the Nagios mailing list some
months ago. We are now seeing what is very likely the same problem, which he
described in great detail. I have also tried many different things (from
configuration changes to tuning) to find the real cause, and my guess is
that the real bottleneck is the pipe used for communication between Nagios
processes. However, I found very few reports of this problem on the web or
in the mail archives.

So why am I writing to the list? Maybe someone can give me a hint on how to
solve or work around this problem. We have 677 services configured and use
350 RRDs. Our Nagios CMS is a PIII 866 MHz with SCSI RAID 5. The system load
is a little above 1.00. As long as we stay below 1.00 there is no problem,
but otherwise ... (see the detailed problem description in Alex's mail).

This is just our start with Nagios. We want to configure thousands of
services and more than a hundred hosts. We would also invest in faster
hardware (dual CPU, 2 GB memory and faster SCSI HDDs), but is faster
hardware really an option? Looking at it from the implementation side: if
the pipe is the bottleneck, it will stay a bottleneck on faster hardware
too. But maybe faster hardware will allow us to configure 3000 services,
which would be enough for this Nagios instance. And then we deploy another
Nagios instance ...
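
To illustrate why I suspect the pipe rather than raw CPU speed, here is a
minimal Python sketch. It is not Nagios code, and the function name
pipe_capacity is made up for this example. It only shows that a pipe holds a
fixed amount of unread data and that a writer cannot continue once it is
full, regardless of how fast the machine is. Whether this is really what
happens inside Nagios is my assumption.

```python
import os


def pipe_capacity(chunk_size=4096):
    """Return how many bytes fit into an unread pipe before a write would block.

    Hypothetical helper for illustration only; it does not touch Nagios.
    """
    read_fd, write_fd = os.pipe()
    # Non-blocking mode makes a full pipe raise an error instead of hanging.
    os.set_blocking(write_fd, False)
    written = 0
    chunk = b"x" * chunk_size
    try:
        while True:
            try:
                written += os.write(write_fd, chunk)
            except BlockingIOError:
                # Pipe is full: a blocking writer would be stuck here
                # until some reader drains the pipe.
                break
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return written


if __name__ == "__main__":
    print("bytes accepted before the writer would block:", pipe_capacity())
```

If the reading side cannot drain the pipe as fast as the child processes
write to it, those writers would pile up waiting, which would at least be
consistent with the many hanging processes we see.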

Any comment would be greatly appreciated. What were your experiences with
Nagios in such an environment and how do you use it today?

Thanks a lot.



Best regards

Tobias Mucke

MAN Nutzfahrzeuge AG
Informationssysteme und Organisation
DV-Technologie und RZ-Betrieb
Linux- System-Technik
linux-system-technik at de.man-mn.com









More information about the Developers mailing list