Nagios memory usage 1.2 vs 1.1

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Mon Jun 21 02:35:41 CEST 2004


Dear Ladies and Gentlemen,

I am writing to thank you for your letter and say,

> 
> Message: 5
> Date: Sat, 19 Jun 2004 15:25:55 -0600
> From: Bruce Elrick <bruce at elrick.ca>
> To:  nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Nagios memory usage 1.2 vs 1.1
> 
> Hello,
> 
> Any comments would be appreciated.  I searched the SourceForge archive 
> of this mailing list for "memory leak Perl" etc. and didn't find anything.
> 
> I've got a production Nagios server running 1.1 and a test/dev one 
> upgraded to 1.2 (both using RPM installs running on Redhat 9, host 
> servers are Compaq DL360, 1 CPU 1 GHz, 384 MB RAM, 256 MB swap, RAID 1 
> 36 GB SCSI drives).  Both instances run (nearly) identical check loads.
> 
> The 1.1 instance has a memory SIZE (as reported by 'top') of 780 kB and 
> an RSS of 764 kB.
> 
> The 1.2 instance has a memory SIZE of 134 MB and an RSS of 116 MB.
> 
> Judging by changes we had to make to our Perl-based check scripts (and 
> observing error messages), I can glean that 1.2 runs Perl-based check 
> scripts in a persistent Perl interpreter using ExtUtils::Embed.

It doesn't use ExtUtils::Embed for anything other than compiling 
nagios.c (i.e. the host code into which Perl is embedded).

> Anyway, I can understand how this can be much more CPU and I/O efficient 
> than forking a new Perl interpreter on each check.  However, I'm 
> wondering if anyone has a metric as to whether the 134 MB size is 
> reasonable?
>

Probably anything less than infinity is reasonable, depending on run
time.

You can expect that an embedded Perl Nagios (ePN) will, if allowed,
exhaust your swap and lead the scheduler/VM manager to take whatever
actions it deems necessary to recover.

The FreeBSD 4.x VM system starts killing processes under these 
conditions (usually the largest process gets killed last).

This Nag, with a lot of custom Perl checks (194 hosts/388 services), 
uses about 100 MB after a month's running.
 
> I don't know whether Embed caches the compiled Perl intermediate code 
> for the multiple check scripts that are called or how it manages memory, 
> but even with a half dozen or fewer unique Perl-based check scripts and 
> <100 services, I'm wondering whether 134 MB is reasonable or if there is 
> a memory leak.

The Perl op-code parse tree for each program is retained in memory plus 
all the global variables _and_ all the lexical variables.

There is no memory management by the ePN infrastructure (p1.pl). 
Considerably more sophisticated code (orders of magnitude so) is 
employed in mod_perl for a similar purpose, but that also leaks.

ePN uses the design described in perldoc perlembed ('Persistent').

> 
> The memory usage started out at 3-4 MB after a few minutes, grew to 32 
> MB overnight, then was at 134 MB after a day or so.  At this point, the 
> response time of the Apache-based Admin Console for Nagios (and other 
> Apache responses) is on the order of 5-10 seconds, presumably because of 
> swapping of memory pages.
> 

I don't see that rate of memory usage.

Tue Jun 15 20:02:22 Nagios 1.2 starting... (PID=39657)
Tue Jun 15 20:02:22 Finished daemonizing... (New PID=39658)

last pid: 32223;  load averages:  0.14,  0.27,  0.31    up 39+21:50:34  10:30:48
108 processes: 2 running, 106 sleeping
CPU states: 34.6% user,  0.0% nice, 30.4% system,  1.2% interrupt, 33.9% idle
Mem: 78M Active, 89M Inact, 56M Wired, 8704K Cache, 35M Buf, 17M Free
Swap: 256M Total, 20M Used, 236M Free, 7% Inuse

  PID USERNAME       PRI NICE  SIZE    RES STATE    TIME   WCPU    CPU COMMAND
39658 nagios          10   0 39048K 20496K nanslp  63:39  0.24%  0.24% nagios
32132 nagios          35   0 39048K 21480K RUN      0:00  0.27%  0.05% nagios
  861 root             2   0  5548K  3028K select  64:33  0.00%  0.00% perl

So this Nag is up to about 40 MB after 6 days.
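
For a rough comparison of the two growth rates in this thread, here is a 
back-of-envelope sketch in Python. It is only an illustration: it assumes 
growth is linear (which it need not be), it borrows the 4 MB starting size 
from Bruce's report for both installations, and it treats the 384 MB RAM 
plus 256 MB swap on his server as if Nagios had all of it. The function 
names are made up for this example.

```python
def growth_mb_per_day(start_mb, end_mb, days):
    """Average growth rate, assuming (simplistically) linear growth."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (end_mb - start_mb) / days


def days_until(limit_mb, current_mb, rate_mb_per_day):
    """Days until the process reaches limit_mb at the given rate."""
    if rate_mb_per_day <= 0:
        return float("inf")
    return max(0.0, (limit_mb - current_mb) / rate_mb_per_day)


# Figures quoted in this thread. The 4 MB starting size is an assumption
# taken from Bruce's "3-4 MB after a few minutes".
mine = growth_mb_per_day(4, 40, 6)      # about 40 MB after 6 days
bruce = growth_mb_per_day(4, 134, 1)    # 134 MB after "a day or so"

# 384 MB RAM + 256 MB swap = 640 MB, ignoring everything else on the box.
remaining = days_until(640, 134, bruce)

print(f"mine:  {mine:.1f} MB/day")
print(f"bruce: {bruce:.1f} MB/day, {remaining:.1f} more days to reach 640 MB")
```

Under those assumptions the 1.2 test server is growing more than twenty 
times faster than this one, which is why the plugin details asked for 
further down matter.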

> Anyone have similar observances or perhaps comments/advice?
>

1 There is some general advice at 
http://nagios.sourceforge.net/docs/1_0/embeddedperl.html

2 Nag version 2 deallocates and reallocates the Perl interpreter 
periodically, thus working around this problem.

3 With Nag 1.x, I work around the problem by manually restarting Nag at 
least once every four weeks.

This could be automated, but since I am using a glib/chained hash Nag 
that doesn't reread config, I restart it each time I change the config 
(probably once a week).

4 It may be worth your posting

- how many hosts/services

- how many Perl plugins

- origin of Perl plugins (i.e. how many standard [Nag plugins], how many 
in-house)

- Perl version
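
On point 3, the restart could be automated along the following lines. 
This is a minimal sketch in Python, not the procedure in use here: the 
thresholds (100 MB, 28 days) are only taken from the figures in this 
message, the function names are invented, and the restart command in the 
comment is an assumption about how an RPM install might be laid out. The 
only external command relied on is `ps -o rss= -p PID`, which prints the 
resident set size in kB.

```python
import subprocess


def rss_kb(pid):
    """Resident set size of a process in kB, as reported by ps."""
    out = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())


def should_restart(current_rss_kb, uptime_days,
                   max_rss_kb=100 * 1024, max_days=28):
    """True once either the memory limit or the age limit is reached.

    Both limits are illustrative defaults, not recommendations.
    """
    return current_rss_kb >= max_rss_kb or uptime_days >= max_days


# Hypothetical use from a daily cron job (the init script path is an
# assumption; adjust to however Nagios is started on your system):
#
#   if should_restart(rss_kb(nagios_pid), uptime_days):
#       subprocess.run(["/etc/init.d/nagios", "restart"], check=True)
```

Keeping the decision in a separate function from the `ps` call makes it 
easy to try the thresholds against recorded figures before letting cron 
act on them.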

Lastly, the ePN stuff is not subject to much change. There are some 
fiddling changes in 2.0 but the guts are the same with 1.0, 1.1, 1.2 and 
2.0.

Have you ever had the ePN stuff work OK?

> Thanks,
> Bruce Elrick
>

Yours sincerely.

 

-- 
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------

'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'

from Meditation 17, J Donne.

