Nagios and PNP Performance Issue

Rodney Ramos rodneyra at gmail.com
Wed Feb 3 16:35:23 CET 2010


Hi everybody,

Thank you very much. You are helping me a lot.

I've set up rrdcached and it seems the PNP performance problem is solved.
The perfdata files are no longer queuing up. However, I'm still testing it,
and I'll let you know how it behaves over time.
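For anyone wondering why a caching daemon helps here: rrdcached accumulates
updates for each RRD file and writes them out in batches instead of touching
the disk on every single update. The sketch below only illustrates that
principle. It is not rrdcached's code, and the class name, the write_batch
callback and the max_age parameter are all made up for the example.

```python
import time
from collections import defaultdict


class UpdateCache:
    """Toy write-coalescing cache: buffer updates per file, flush in batches."""

    def __init__(self, write_batch, max_age=300.0, clock=time.monotonic):
        self.write_batch = write_batch    # hypothetical callable(path, updates)
        self.max_age = max_age            # seconds an update may wait in memory
        self.clock = clock                # injectable so the cache can be tested
        self.pending = defaultdict(list)  # path -> buffered updates
        self.first_seen = {}              # path -> time of oldest buffered update

    def update(self, path, value):
        """Queue one update; nothing is written to disk here."""
        self.first_seen.setdefault(path, self.clock())
        self.pending[path].append(value)

    def flush_due(self):
        """Write every file whose oldest update has waited max_age or longer."""
        now = self.clock()
        due = [p for p, t in self.first_seen.items() if now - t >= self.max_age]
        for path in due:
            # One write call per file, however many updates were queued.
            self.write_batch(path, self.pending.pop(path))
            del self.first_seen[path]
        return len(due)
```

With many updates per file collapsed into one write, the number of disk
operations drops roughly by the number of updates buffered per flush, at the
cost of graphs lagging behind by up to max_age seconds.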

I forgot to mention: my polling interval (hosts and services) is 15 minutes.
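To put that interval in terms of load, here is a rough calculation using the
host and service counts from the quoted message below. It assumes every host
and every service is checked exactly once per interval and ignores retries
and on-demand checks, so treat it as an estimate only. The function name is
just for illustration.

```python
def checks_per_second(hosts, services, interval_minutes):
    """Average checks per second if each object is checked once per interval."""
    return (hosts + services) / (interval_minutes * 60)


current = checks_per_second(25_000, 50_000, 15)    # today's environment
planned = checks_per_second(100_000, 200_000, 15)  # the requested expansion
print(f"current: {current:.1f}/s, planned: {planned:.1f}/s")
# prints "current: 83.3/s, planned: 333.3/s"
```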

Andreas, you are right. It's a big project, and we intend to hire a Nagios
support provider in the near future. Can you recommend one?

Until that is possible, does anyone have a suggestion for solving the
status.dat problem?

As a reminder, status.dat is very large (about 100 MB) and the Tactical
Overview takes a very long time to load.
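One way to try the suggestion quoted below (getting the status data into a
database for quick filtering) is to stream status.dat into an indexed SQLite
table. This is a minimal sketch, not a tested solution at this scale. It
assumes the usual "blocktype {" / "key=value" / "}" layout of status.dat and
copies only three fields of the servicestatus blocks. The table name, column
names and function names are made up for the example.

```python
import sqlite3


def iter_blocks(lines):
    """Yield (block_type, fields) for each 'type {' ... '}' block."""
    block_type, fields = None, {}
    for raw in lines:
        line = raw.strip()
        if block_type is None:
            if line.endswith("{"):
                block_type, fields = line[:-1].strip(), {}
        elif line == "}":
            yield block_type, fields
            block_type = None
        elif "=" in line:
            # Split on the first '=' only; plugin output may contain more.
            key, _, value = line.partition("=")
            fields[key] = value


def load_services(lines, db):
    """Copy servicestatus blocks into an indexed SQLite table."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS service "
        "(host TEXT, description TEXT, state INTEGER)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS service_state ON service(state)")
    rows = (
        (f.get("host_name"), f.get("service_description"),
         int(f.get("current_state", -1)))
        for kind, f in iter_blocks(lines)
        if kind == "servicestatus"
    )
    db.executemany("INSERT INTO service VALUES (?, ?, ?)", rows)
    db.commit()
```

After loading, a query such as
"SELECT COUNT(*) FROM service WHERE state = 2" can use the index instead of
re-reading the whole file on every page refresh.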

Thank you very much,
Rodney.

On Wed, Feb 3, 2010 at 6:06 AM, Andreas Ericsson <ae at op5.se> wrote:

> On 02/02/2010 05:47 PM, Rodney Ramos wrote:
> > Hi everybody,
> >
> > I'm using Nagios (3.2.0) to monitor and collect performance data for
> > 25,000 hosts, with 50,000 services.
> >
>
> That's quite a large environment. I think the Brazilian government is
> monitoring a really huge network as well.
>
> > I have two central machines (one for backup) and 10 distributed servers
> > that collect status data and send it to the central servers.
> >
> > It's working, but I'm having serious performance problems.
> >
> > First, the Tactical Overview on the central machines is taking almost 1
> > minute to refresh. I think that it's because the status.dat file is too
> > big (almost 100 MB).
> >
>
> I'm not surprised. You'd probably want to get that data into a database to
> get some quick filtering on it.
>
> > Second, the PNP 0.4.14 addon is taking a long time to process the
> > performance data files. These files are growing faster than the
> > process_perfdata.pl script can process them.
> >
>
> I wouldn't use PNP on the same system as such a huge Nagios installation,
> to be honest. A separate system with a flushing/caching daemon piping
> output directly to a single instance of process_perfdata.pl would be far
> better. I don't know if process_perfdata.pl has to be hacked to accept
> input on stdin, but I can't imagine that would be very difficult. Then
> it's just a matter of flushing the performancedata files to that running
> instance. A really small daemon program could easily handle that if the
> performance data processor script just renames the perfdata file to
> something unique.
>
> >
> > Can anyone help me improve the performance of Nagios and PNP for this
> > environment?
> >
>
> Yes, but it sounds like an awful lot of work that I'm not very interested
> in doing for free. You have some pointers now, so try that. If that doesn't
> work, come back here and we'll try something different.
>
> > P.S.: All my Nagios servers are virtual machines running Red Hat. The
> > central servers have 2 CPUs and 2 GB of memory. The collectors have 1 CPU
> > and 1 GB of RAM. Do you think moving the central servers to physical
> > machines would give a big performance improvement? How much?
> >
>
> Virtual machines have notoriously poor disk performance. Moving it to a
> physical machine will almost certainly remove or widen your current
> bottleneck by quite a lot.
>
> > I think this is a good test for Nagios. I have a request to add
> > 100,000 hosts with 200,000 services to this environment! Is that possible?
> > Does anyone have a Nagios configuration that big?
> >
>
> What matters is how much data per second you intend to process, and how
> many checks per minute you intend to run. With a check interval of 6 months,
> I expect Nagios will run just fine with several million service checks
> configured. With a check interval of 10 seconds, you'd probably run into
> problems around 10000 services.
>
> --
> Andreas Ericsson                   andreas.ericsson at op5.se
> OP5 AB                             www.op5.se
> Tel: +46 8-230225                  Fax: +46 8-230231
>
> Considering the successes of the wars on alcohol, poverty, drugs and
> terror, I think we should give some serious thought to declaring war
> on peace.
>
>
> ------------------------------------------------------------------------------
> The Planet: dedicated and managed hosting, cloud storage, colocation
> Stay online with enterprise data centers and the best network in the
> business
> Choose flexible plans and management services without long-term contracts
> Personal 24x7 support from experience hosting pros just a phone call away.
> http://p.sf.net/sfu/theplanet-com
> _______________________________________________
> Nagios-devel mailing list
> Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel
>