Nagios memory leaks

Stanley.Hopcroft at Dest.gov.au
Wed Jan 24 01:07:00 CET 2007


Dear Sir,

I am writing to thank you for your valuable letter and to say:

 
> From: Tobias Klausmann <klausman at schwarzvogel.de>
> Subject: [Nagios-users] Memory leaks
> 
> Hi! 
> 
> (First off: if this should also go to nagios-devel, just yell at
>  me.)
>

I don't think so, because it deals with the aspects of the implementation
that are visible (and in fact, the letter doesn't propose detailed
solutions).
 
> Nagios 2.6 and 2.5 have memory leaks. They are not that big that
> within hours your machine will be swapping, but they degrade
> performance in other ways.
> 
> First off, their approximate extent.
> 
> 2.5 and 2.6 without perl cache have the smallest memory leaks. A
> fairly busy Nagios server (hardware quoted below) with about 3000
> services on about 330 hosts will degrade from 330M used (that's
> *not* Nagios alone) to 368M used in about 16 hours. Or about 2.4
> MB per hour. The very same machine behaves neutral if Nagios is
> not running, so it's definitely Nagios itself.
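
A quick check of the arithmetic in the quoted figures, as a minimal
sketch. The function name is made up for illustration, and it assumes
"M" means megabytes of used memory:

```python
def leak_rate_mb_per_hour(start_mb, end_mb, hours):
    """Average growth in used memory, in MB per hour."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return (end_mb - start_mb) / hours


# Figures quoted above: 330M used growing to 368M over about 16 hours.
rate = leak_rate_mb_per_hour(330, 368, 16)
print(f"{rate:.2f} MB/hour")
```

This agrees with the "about 2.4 MB per hour" quoted.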

Do you mean: 2.5 and 2.6 Nagios with embedded Perl but without the
Perl plugin cache option ?

If so, the fault is not Nagios, but the embedded Perl implementation
and/or Perl.

Your next paragraph suggests that this is plain vanilla Nagios without
any Perl options to configure.

Is that correct ?

> 
> Activating the embedded Perl interpreter and -cache will increase
> the amount of lost memory to about 5-6M per hour. In this case,
> however, sometimes the memory usage snaps back, i.e. some of the
> lost memory is collected. I've not yet found out what triggers
> the reclaim. Still, over the course of hours, more and more
> memory is lost. Still, it's roughly linear memory loss.
> 

I have never witnessed memory being reclaimed after ePN leaks it.

I can't conceive of the process memory size being reduced while the
process is running (free() and friends only return the memory to the
process heap).
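
One way to check this on a running daemon would be to sample its
resident set size over time. A minimal sketch, assuming a Linux host
where /proc/<pid>/status carries a VmRSS line; the function names are
hypothetical and nothing here is part of Nagios:

```python
import re


def parse_vm_rss_kb(status_text):
    """Return the VmRSS value in kB from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_rss_kb(pid):
    """Read the current resident set size of a process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_rss_kb(handle.read())


# Example with a made-up status excerpt, not real Nagios output.
sample = "Name:\tnagios\nVmSize:\t  52340 kB\nVmRSS:\t  18764 kB\n"
print(parse_vm_rss_kb(sample))
```

Calling read_rss_kb() on the Nagios pid at intervals would show whether
the resident size ever drops; VmSize could be read the same way for the
total virtual size.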

I think the leak is caused by the ePN implementation. I am hoping to try
some measurements with several pilot implementations to see which is the
most promising way of doing this.

... (snip)

Yep. I agree. The leak is bad.


> The question that remains is, if this can (and will) be tackled
> before 3.0 is released. A related question is if Nagios 3 will be
> prone to the same problem.
> 

Certainly it will if the current ePN implementation remains.

If (pretty big if) I can provide you with stuff to try, are you willing
to repeat your measurements on candidate implementations (wrt the 2.5 or
2.6 code base) ?

I am not sure of my willingness/energy quotient, but if they look OK,
I may not have anything to show until March this year.
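
For comparing candidate implementations, the measurements could be
reduced to a single growth figure by fitting a straight line to
(hours, MB used) samples. A minimal sketch using ordinary least squares;
the function name and the sample series are illustrative only, not
measured data:

```python
def fit_growth(samples):
    """Least-squares line through (hours, mb_used) pairs.

    Returns (slope_mb_per_hour, intercept_mb).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (m - mean_m) for t, m in samples)
    slope = cov / var_t
    return slope, mean_m - slope * mean_t


# Synthetic series: 330 MB at start, growing exactly 2.5 MB per hour.
series = [(hour, 330 + 2.5 * hour) for hour in range(0, 17, 4)]
slope, intercept = fit_growth(series)
print(f"{slope:.2f} MB/hour from a baseline of {intercept:.0f} MB")
```

A fit over many samples is less sensitive to a single odd reading than
a start/end difference, which matters if memory usage occasionally
snaps back.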

> Any thoughts, ideas etc. are appreciated.
> 
> Regards,
> Tobias

Yours sincerely.
