Memory leak

Andreas Ericsson ae at op5.se
Mon Apr 4 19:54:16 CEST 2005


Arno Lehmann wrote:
> Hi.
> 
> Andreas Ericsson wrote:
> 
>> The kernel uses memory, and most OSes implement copy-on-write for 
>> forked processes (Linux does this, and judging by the apps running 
>> that's what you're using). That means only modified pages are actually 
>> copied after a fork(), but the theoretical maximum consumption (as 
>> determined by allocated buffers in the master process) is displayed 
>> anyway.
> 
> 
> Errm - sure. Anyway, what I see is that the memory claimed by processes 
> is far less than what the kernel says is used.
> 

This is because free and friends show what's available to programs 
running on the system. Removed from that pool is memory hogged by 
graphics drivers that shadow RAM, and the kernel's own memory. Large 
routing tables, software RAID and stateful in-kernel firewalls are three 
of the most common causes of "disappearing" memory. If Nagios had a 
leak, its process size would grow abnormally and most likely fairly 
rapidly. In short, the memory wouldn't be "missing"; it would be assigned 
to a process that usually doesn't claim that much of it.
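
To put rough numbers on that, something like the following could compare what the kernel reports as used with what the processes themselves claim. It is only a sketch: it assumes a Linux /proc with the usual "Name:  N kB" lines in /proc/meminfo and /proc/PID/status, and the function names are made up for illustration.

```python
import glob
import os
import re


def parse_kb_fields(text):
    """Parse 'Name:   123 kB' lines into a {name: kilobytes} dict."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^(\w+):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def kernel_used_kb(meminfo):
    """Memory reported as used, not counting buffers and page cache."""
    return (meminfo["MemTotal"] - meminfo["MemFree"]
            - meminfo.get("Buffers", 0) - meminfo.get("Cached", 0))


def total_process_rss_kb():
    """Sum VmRSS over all processes (shared pages count once per process)."""
    total = 0
    for path in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(path) as handle:
                total += parse_kb_fields(handle.read()).get("VmRSS", 0)
        except OSError:
            continue  # the process exited while we were scanning
    return total


def main():
    if not os.path.exists("/proc/meminfo"):
        print("no /proc/meminfo here; this sketch assumes Linux")
        return
    with open("/proc/meminfo") as handle:
        meminfo = parse_kb_fields(handle.read())
    used = kernel_used_kb(meminfo)
    rss = total_process_rss_kb()
    print(f"used (excluding buffers/cache): {used} kB")
    print(f"sum of process RSS:             {rss} kB")
    print(f"difference:                     {used - rss} kB")


if __name__ == "__main__":
    main()
```

Because shared pages are counted once for every process that maps them, the RSS sum overestimates what processes really hold, so treat the difference as a rough indication only. A large positive difference points at the kernel or drivers rather than at any one process.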

>>> Any other ideas?
>>>
>>
>> Run it through valgrind and log everything. Post the logs on some 
>> public webpage so users with little or no interest don't have to 
>> cope with them on the list.
> 
> 
> Doing it just now... wait some time, and I'll post the URL.
> 

Excellent.

> One question, though:
> I get output like the following
> 
>> ==30154== Syscall param socketcall.sendto(msg) contains uninitialised 
>> or unaddressable byte(s)
>> ==30154==    at 0x1BA4A4E1: sendto (in /lib/tls/libc.so.6)
>> ==30154==    by 0x1BA33FB6: getaddrinfo (in /lib/tls/libc.so.6)
>> ==30154==    by 0x1BC00521: ldap_connect_to_host (in 
>> /usr/lib/libldap-2.2.so.7.0.8)
>> ==30154==    by 0x1BBEACDC: ldap_int_open_connection (in 
>> /usr/lib/libldap-2.2.so.7.0.8)
>> ==30154==  Address 0x52BFD07D is on thread 1's stack
>> Nagios 2.0b2 starting... (PID=30154)
>> ==30160==
>> ==30160== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 41 from 1)
>> ==30160== malloc/free: in use at exit: 1250742 bytes in 122 blocks.
>> ==30160== malloc/free: 17115 allocs, 16993 frees, 2215553 bytes 
>> allocated.
>> ==30160== For counts of detected errors, rerun with: -v
>> ==30160== searching for pointers to 122 not-freed blocks.
>> ==30158==
>> ==30158== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 41 from 1)
>> ==30158== malloc/free: in use at exit: 1250742 bytes in 122 blocks.
>> ==30158== malloc/free: 17146 allocs, 17024 frees, 2215788 bytes 
>> allocated.
>> ==30158== For counts of detected errors, rerun with: -v
>> ==30158== searching for pointers to 122 not-freed blocks.
>> ==30160== checked 2197636 bytes.
>> ==30160==
>> ==30160==
>> ==30160== 8 bytes in 2 blocks are definitely lost in loss record 1 of 18
>> ==30160==    at 0x1B903BAC: malloc (in 
>> /usr/lib/valgrind/vgpreload_memcheck.so)
>> ==30160==    by 0x1B8E9F5E: _dl_map_object_from_fd (in /lib/ld-2.3.3.so)
>> ==30160==    by 0x1B8EACC9: _dl_map_object (in /lib/ld-2.3.3.so)
>> ==30160==    by 0x1B8F09CD: openaux (in /lib/ld-2.3.3.so)
>> ==30160==
>> ==30160==
>> ==30160== 37 bytes in 2 blocks are definitely lost in loss record 4 of 18
>> ==30160==    at 0x1B903BAC: malloc (in 
>> /usr/lib/valgrind/vgpreload_memcheck.so)
>> ==30160==    by 0x1B9F7CAF: strdup (in /lib/tls/libc.so.6)
>> ==30160==    by 0x807753E: add_host_notification_command_to_contact 
>> (objects.c:2465)
>> ==30160==    by 0x8084C95: xodtemplate_register_contact 
>> (xodtemplate.c:7800)
>> ==30160==
>> ==30160==
>> ==30160== 41 bytes in 2 blocks are definitely lost in loss record 5 of 18
>> ==30160==    at 0x1B903BAC: malloc (in 
>> /usr/lib/valgrind/vgpreload_memcheck.so)
>> ==30160==    by 0x1BC1D900: ???
>> ==30160==    by 0x1BC1DA58: ???
>> ==30160==    by 0x1BC05149: ???
> 
> 
> The last block contains addresses, but not code lines. Is that normal?

Yes. It happens whenever the eip enters a library that hasn't got any 
debug symbols, or if the binary is stripped and you don't have a symbol 
table to load into valgrind (you need to save the symbol table *before* 
stripping for valgrind to be able to use it).

> I 
> assume that's kernel space, but I'm not sure about anything - valgrind's 
> output is quite cryptic to me. Above, I have the code lines and function 
> names.
> 

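Kernel space doesn't have debug symbols attached, of course, but 
valgrind only traces userspace code, so kernel space can't be it. Those 
addresses sit right next to the libldap frames in your first trace, so 
a shared library without debug symbols is the more likely culprit.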

> Arno
> 
>>> Arno
>>>
>>
> 

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer

