Reply: Re: Problems with many hanging Nagios processes (Nagios spawning rogue nagios processes eventually crashing Nagios server)

linux-system-technik at de.man-mn.com linux-system-technik at de.man-mn.com
Tue Nov 29 11:10:55 CET 2005


Hi Andi,

thanks for your answer.

Here is the link to Alex's mail:

https://sourceforge.net/mailarchive/forum.php?thread_id=8135931&forum_id=1872

I thought that in Nagios terms CMS means Central Monitoring System?

A kernel recompile is not a problem for me. But I couldn't find any setting
called "pipe size", or even just "pipe". Maybe you can give me a hint as to
which setting I have to change.
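
Note: a minimal sketch, not taken from Nagios or from this thread, for
checking how much data the kernel buffers in a pipe before a writer would
block. On Linux kernels of this period the pipe buffer size is generally a
constant in the kernel source rather than a configuration option, which may
be why no "pipe" setting shows up in the kernel configuration. The function
name measure_pipe_capacity and the 1024-byte chunk size are made up for
illustration.

```python
import os


def measure_pipe_capacity(chunk_size: int = 1024) -> int:
    """Return how many bytes a fresh pipe accepts before a write would block."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode makes a full pipe raise BlockingIOError
        # instead of hanging the writer.
        os.set_blocking(write_fd, False)
        chunk = b"x" * chunk_size
        total = 0
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # Nobody is reading, so the pipe is now full.
                return total
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print("pipe capacity in bytes:", measure_pipe_capacity())
```

The number printed depends on the operating system and kernel version, so
it is best read as a measurement of the local machine, not as a Nagios
tuning value.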

Hopefully Ethan will let your change into the 2.x release. That would be
great. I could also test it heavily and help with debugging, if you are
interested.

Thanks a lot, again.



Kind regards

Tobias Mucke

MAN Nutzfahrzeuge AG
Informationssysteme und Organisation
DV-Technologie und RZ-Betrieb
Linux- System-Technik
linux-system-technik at de.man-mn.com



                                                                           
From:    Andreas Ericsson <ae at op5.se>
Sent by: nagios-devel-admin at lists.sourceforge.net
Date:    28.11.2005 17:09
To:      linux-system-technik at de.man-mn.com
Cc:      nagios-devel at lists.sourceforge.net
Subject: Re: [Nagios-devel] Problems with many hanging Nagios processes
         (Nagios spawning rogue nagios processes eventually crashing
         Nagios server)




linux-system-technik at de.man-mn.com wrote:
> Hi everybody,
>
> unfortunately nobody answered Alex from viveconsulting.co.nz, who had a
> problem with "Nagios spawning rogue ..." and mailed the Nagios mailing
> list some months ago.


A link to the mail archives would be helpful.


> Right now we very likely have the same problem that he described in
> great detail. I have also tried a lot of different things (from
> configuration changes to tuning) to find the real problem, and I
> guess the real bottleneck is the pipe used for communication between
> Nagios processes.


Most likely. It's the only real bottleneck in nagios today, so...


> But I found few reports (e.g. emails) about this problem on the web
> and in the mail archives.
>
> So why am I writing to the list? Maybe someone can give me a hint on how
> to solve or work around the problem. We have 677 services configured and
> use 350 RRDs. Our Nagios CMS is a PIII 866 MHz with SCSI RAID 5. The
> system load is a little more than 1.00. As long as we stay below 1.00
> there is no problem, but otherwise ... (detailed problem description in
> Alex's mail)
>
>

CMS? Content Management System?
Anyways, 677 services shouldn't be a problem.


> This is just our start with Nagios. We want to configure thousands of
> services and more than 100 hundred hosts. We would also invest in faster
> hardware (dual CPU, 2 GB memory and faster SCSI HDDs), but is faster
> hardware an option?

It helps, but not very much I'm afraid. The bottleneck requires a kernel
recompile to be solved on most systems, and that's a very bad thing to
do just to fix this particular problem.

> Looking at this issue with a focus on the implementation: if the
> pipe is the bottleneck, it will stay a bottleneck on faster hardware too.
> But maybe faster hardware will allow us to configure 3000 services, which
> would be enough for the Nagios instance. And then we deploy another
> Nagios instance ...
>
>

This is definitely a solution. Otherwise you could keep your eyes open
in the somewhat near future for a mail with

[PATCH] checks: Multiplex running checks.

in the topic. I'm working on it right now, but perhaps Ethan won't let
it in for the 2.x branch since it's a fairly massive change.
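
Note: the patch itself is not part of this thread, so its design is not
known here. Purely as a generic illustration of what multiplexing check
output can mean, the sketch below starts several commands at once and
collects their output from a single select loop instead of one blocking
read per check. The function name run_checks_multiplexed and the sample
commands are hypothetical; this is not Nagios code and it assumes a Unix
system.

```python
import os
import selectors
import subprocess
import sys


def run_checks_multiplexed(commands):
    """Run all commands concurrently; return {name: (exit_code, stdout_text)}."""
    procs = {}
    output = {}
    with selectors.DefaultSelector() as sel:
        for name, argv in commands.items():
            proc = subprocess.Popen(argv, stdout=subprocess.PIPE)
            sel.register(proc.stdout, selectors.EVENT_READ, data=name)
            procs[name] = proc
            output[name] = b""
        # Keep looping while at least one child's stdout is still open.
        while sel.get_map():
            for key, _events in sel.select():
                chunk = os.read(key.fd, 4096)
                if chunk:
                    output[key.data] += chunk
                else:
                    # Empty read means the child closed its end of the pipe.
                    sel.unregister(key.fileobj)
                    key.fileobj.close()
    return {name: (procs[name].wait(), output[name].decode()) for name in commands}


if __name__ == "__main__":
    sample = {
        "ok_check": [sys.executable, "-c", "print('ok')"],
        "failing_check": [sys.executable, "-c", "import sys; print('warn'); sys.exit(1)"],
    }
    for name, (code, text) in run_checks_multiplexed(sample).items():
        print(name, code, text.strip())
```

Because every pipe is drained as soon as data arrives, no child has to
wait on a full pipe while the parent is busy reading from another one.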

--
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231


-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log
files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-devel









