Hi everybody, Thank you very much. You are helping me a lot. I've implemented rrdcached and it seems to have solved the PNP performance problem: the perfdata files are no longer queuing. I'm still testing it, though, and I'll let you know how it behaves over time. I forgot to mention that my polling interval (hosts and services) is 15 minutes. Andreas, you are right. It's a big project, and we intend to hire Nagios support in the near future. Do you recommend anyone? Until that is possible, does anyone have a suggestion for the status.dat problem? Just to recap, status.dat is too big (100 MB) and the Tactical Overview takes a very long time to load. Thank you very much, Rodney. <div class="gmail_quote">On Wed, Feb 3, 2010 at 6:06 AM, Andreas Ericsson <<a href="mailto:ae@op5.se">ae@op5.se</a>> wrote: <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="im">On 02/02/2010 05:47 PM, Rodney Ramos wrote: > Hi everybody, > > I'm using Nagios (3.2.0) to monitor and collect performance data from 25,000 > hosts, with 50,000 services. > </div>That's quite a large environment. I think the Brazilian government is monitoring a completely huge network as well. <div class="im"> > I have two central machines (one for backup) and 10 distributed servers to > collect status and send it to the central servers. > > It's working but I'm having serious performance problems. > > First, the Tactical Overview on the central machines is taking almost 1 > minute to refresh. I think that it's because the status.dat file is too big > (almost 100 MB). > </div>I'm not surprised. You'd probably want to get that data into a database to get some quick filtering on it. <div class="im"> > Second, the addon PNP 0.4.14 is taking a long time to process the > performance data files. 
These files are growing faster than the capacity > of the process_perfdata.pl script to process them. > </div>I wouldn't use PNP on the same system as such a huge Nagios installation, to be honest. A separate system with a flushing/caching daemon piping output directly to a single instance of process_perfdata.pl would be far better. I don't know if process_perfdata.pl has to be hacked to accept input on stdin, but I can't imagine that would be very difficult. Then it's just a matter of flushing the performance data files to that running instance. A really small daemon program could easily handle that if the performance data processor script just renames the perfdata file to something unique. <div class="im"> > > Can anyone help me to improve the performance of Nagios and PNP for this > environment? > </div>Yes, but it sounds like an awful lot of work that I'm not very interested in doing for free. You have some pointers now, so try that. If that doesn't work, come back here and we'll try something different. <div class="im"> > P.S.: All my Nagios servers are virtual machines with Red Hat. The central > servers have 2 CPUs and 2 GB of memory. The collectors have 1 CPU and 1 GB of > RAM. Do you think that changing the central servers to physical machines will > give a big performance improvement? How much? > </div>Virtual machines have notoriously poor disk performance. Moving it to a physical machine will almost certainly remove or widen your current bottleneck by quite a lot. <div class="im"> > I think that this is a good test for Nagios. I have a requirement to put 100,000 > hosts with 200,000 services in this environment. Is it possible? Does > anyone have a Nagios configuration this big? > </div>What matters is how much data per second you intend to process, and how many checks per minute you intend to run. 
With a check interval of 6 months, I expect Nagios will run just fine with several million service checks configured. With a check interval of 10 seconds, you'd probably run into problems around 10000 services. -- Andreas Ericsson <a href="mailto:andreas.ericsson@op5.se">andreas.ericsson@op5.se</a> OP5 AB <a href="http://www.op5.se" target="_blank">www.op5.se</a> Tel: +46 8-230225 Fax: +46 8-230231 Considering the successes of the wars on alcohol, poverty, drugs and terror, I think we should give some serious thought to declaring war on peace. <div><div></div><div class="h5"> _______________________________________________ Nagios-devel mailing list <a href="mailto:Nagios-devel@lists.sourceforge.net">Nagios-devel@lists.sourceforge.net</a> <a href="https://lists.sourceforge.net/lists/listinfo/nagios-devel" target="_blank">https://lists.sourceforge.net/lists/listinfo/nagios-devel</a> </div></div></blockquote></div>
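For anyone who wants to try the rename-and-flush idea Andreas describes in the quoted message, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not how PNP or process_perfdata.pl actually works: the function name rotate_and_flush and the unique-name scheme are made up for this example, the consumer is assumed to read perfdata lines from a text stream (the thread leaves open whether process_perfdata.pl can do that without changes), and the writer is assumed to have closed the file before the rename.

```python
import os
import time
from typing import TextIO


def rotate_and_flush(perfdata_path: str, sink: TextIO) -> int:
    """Rename the perfdata file to a unique name, then stream it to `sink`.

    Hypothetical helper for illustration only. Returns the number of lines
    forwarded, or 0 if there was no file to process.
    """
    if not os.path.exists(perfdata_path):
        return 0

    # Renaming within the same directory is atomic on POSIX filesystems.
    # Assumption: the writer reopens `perfdata_path` on its next write,
    # so new data lands in a fresh file while this batch is drained.
    unique_path = "%s.%d.%d" % (perfdata_path, os.getpid(), time.time_ns())
    os.rename(perfdata_path, unique_path)

    forwarded = 0
    with open(unique_path, "r") as batch:
        for line in batch:
            sink.write(line)  # one perfdata record per line
            forwarded += 1
    sink.flush()

    # The batch has been handed over in full, so the renamed file can go.
    os.remove(unique_path)
    return forwarded
```

In a real setup, `sink` would presumably be the stdin pipe of one long-running processor started once with `subprocess.Popen(..., stdin=subprocess.PIPE, text=True)`, and a small loop would call rotate_and_flush every few seconds. That keeps a single processor instance alive instead of starting a new one per batch, which is the point of the suggestion.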