Possible Memory Leak in Nagios?

Sean Elble selble at higherone.com
Mon Mar 28 15:09:49 CEST 2011


Hi all,

I recently built a new Nagios system on CentOS 5.5 (fully updated), and I'm seeing a strange memory usage pattern that didn't exist before the system went live.  No resident processes use much RAM (~200 MB in total, give or take, at any one time, and that's consistent from hour to hour and day to day), yet the memory usage excluding buffers and cache keeps growing by a significant amount every day:

[user@nagios ~]# ps -eo rss,size,pid,cmd | awk '{ SUM += $1 } END { print SUM/1024 }'; free -m
195.863
             total       used       free     shared    buffers     cached
Mem:          3939       3629        310          0        170        677
-/+ buffers/cache:       2781       1158
Swap:         5951          0       5951

(Yes, I know the result from ps won't match free, but the discrepancy between the two keeps increasing at a consistent rate.)

Usage has grown by a consistent 300-400 MB per day for no apparent reason. I bring it up here only because the system's memory usage pattern was quite normal before Nagios was running on it.
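
To watch that discrepancy over time rather than eyeballing it, something like the following could be run periodically (from cron, for instance). This is only a minimal sketch, assuming Python 3 and a Linux /proc/meminfo; the function names are made up for illustration, and the "used" figure is computed as total minus free, buffers and cached, which is how the "-/+ buffers/cache" line above appears to be derived.

```python
import subprocess


def parse_meminfo(text):
    """Return the numeric fields of /proc/meminfo as a dict (values in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    return fields


def used_excluding_cache_mb(meminfo):
    """Used memory minus buffers and cache, in MB (assumed to mirror free's '-/+' line)."""
    used_kb = (meminfo["MemTotal"] - meminfo["MemFree"]
               - meminfo["Buffers"] - meminfo["Cached"])
    return used_kb / 1024.0


def total_rss_mb(ps_output):
    """Sum the RSS values (kB) printed by 'ps -eo rss=', in MB."""
    return sum(int(tok) for tok in ps_output.split()) / 1024.0


def unaccounted_mb(meminfo_text, ps_output):
    """Memory counted as used that no process's RSS accounts for, in MB."""
    return (used_excluding_cache_mb(parse_meminfo(meminfo_text))
            - total_rss_mb(ps_output))


def sample_live():
    """Take one reading from the running system (hypothetical helper)."""
    with open("/proc/meminfo") as fh:
        meminfo_text = fh.read()
    # 'rss=' asks ps for the RSS column with the header suppressed.
    ps_output = subprocess.check_output(["ps", "-eo", "rss="]).decode()
    return unaccounted_mb(meminfo_text, ps_output)
```

Logging the value returned by sample_live() with a timestamp once an hour would show whether the gap really climbs at a steady rate. Summed RSS counts shared pages more than once, so the absolute number is rough; the trend is the useful part.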

I compiled Nagios (3.2.3) and the Nagios plugins (1.4.15) from source, with no special options for either:

configure:9208: result: *** Configuration summary for nagios 3.2.3 10-03-2010 ***:
configure:9215: result:         Nagios executable:  nagios
configure:9217: result:         Nagios user/group:  nagios,nagios
configure:9219: result:        Command user/group:  nagios,nagios
configure:9230: result:             Embedded Perl:  no
configure:9234: result:              Event Broker:  yes
configure:9240: result:         Install ${prefix}:  /usr/local/nagios
configure:9242: result:                 Lock file:  ${prefix}/var/nagios.lock
configure:9244: result:    Check result directory:  ${prefix}/var/spool/checkresults
configure:9246: result:            Init directory:  /etc/rc.d/init.d
configure:9248: result:   Apache conf.d directory:  /etc/httpd/conf.d
configure:9250: result:              Mail program:  /bin/mail
configure:9252: result:                   Host OS:  linux-gnu
configure:9259: result:                  HTML URL:  http://localhost/nagios/
configure:9261: result:                   CGI URL:  http://localhost/nagios/cgi-bin/
configure:9263: result:  Traceroute (used by WAP):  /bin/traceroute

This thread <http://www.mail-archive.com/nagios-users@lists.sourceforge.net/msg25842.html> inspired me to run check_snmp under Valgrind. It does appear to be leaking memory (~500 bytes per run), but I'm wondering whether anyone else has seen this behavior before, given that I'm not using the embedded Perl option that the OP in that thread was using (though comments there make me wonder if it's worth trying).
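
As a rough sanity check on those two figures, the arithmetic below works out how many plugin runs per day it would take for a ~500-byte leak per run to add up to 300-400 MB per day. It is a back-of-the-envelope sketch only, and it assumes the leaked bytes somehow outlive each run, which they normally would not, since the kernel reclaims a process's memory when it exits. The function name is made up for illustration.

```python
def runs_needed_per_day(growth_mb_per_day, leak_bytes_per_run):
    """Runs per day needed for a per-run leak to equal the daily growth."""
    return growth_mb_per_day * 1024 * 1024 / float(leak_bytes_per_run)


# Figures taken from this message: 300-400 MB/day growth, ~500 bytes per run.
for growth in (300, 400):
    runs = runs_needed_per_day(growth, 500)
    print("%d MB/day -> %.0f runs/day (%.1f per second)"
          % (growth, runs, runs / 86400.0))
```

That comes to roughly 630,000 to 840,000 runs per day, or about 7 to 10 runs every second around the clock, so the check_snmp leak on its own looks too small to explain the growth.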

If anyone has seen this before, or has any thoughts on what the issue might be, I'd certainly appreciate it.

Thanks,

--
Sean Elble
Linux Systems Administrator
Higher One, Inc.

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users