Memory leak in Nagios head
Andreas Ericsson
ae at op5.se
Tue Nov 30 13:45:23 CET 2004
The "repeated SIGHUP" crash occurs in
find_host(temp_hostextinfo->host_name), called from pre_flight_check().
The attached patch makes Nagios at least survive the HUPs (even though
the memory leak is still there, so it should crash eventually when it
hits the memory limit). I haven't tested whether this affects the GUI or not.
Note that it has only been tested together with Matthew's patch (which
didn't fix the problem), so I don't know if it will work solo or if both
of them have to be combined to do the trick.
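
For readers without the attachment at hand, here is a minimal sketch of the
kind of defensive lookup that keeps a pass like pre_flight_check() from
crashing on a stale or empty entry after a reload. It is not the attached
patch. The struct layouts and the names host_sketch, hostextinfo_sketch,
find_host_sketch() and count_orphan_extinfo() are hypothetical stand-ins,
not the real Nagios definitions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the real Nagios structs. */
typedef struct host_sketch {
    const char *name;
    struct host_sketch *next;
} host_sketch;

typedef struct hostextinfo_sketch {
    const char *host_name;
    struct hostextinfo_sketch *next;
} hostextinfo_sketch;

/* Look up a host by name. Tolerates an empty list and a NULL name
 * instead of passing them on to strcmp(). */
host_sketch *find_host_sketch(host_sketch *list, const char *name)
{
    if (name == NULL)
        return NULL;
    for (host_sketch *h = list; h != NULL; h = h->next) {
        if (h->name != NULL && strcmp(h->name, name) == 0)
            return h;
    }
    return NULL;
}

/* Count extended-info entries that do not point at a known host.
 * Entries without a host name are counted as orphans rather than
 * dereferenced. */
int count_orphan_extinfo(host_sketch *hosts, hostextinfo_sketch *ext)
{
    int orphans = 0;
    for (; ext != NULL; ext = ext->next) {
        if (ext->host_name == NULL ||
            find_host_sketch(hosts, ext->host_name) == NULL)
            orphans++;
    }
    return orphans;
}
```

A guard like this only hides the symptom; if the entry is stale because of
the double reset or the leak, those still need fixing separately.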
Andreas Ericsson wrote:
> Matthew Kent wrote:
>
>> On Mon, 2004-11-29 at 15:34, Andreas Ericsson wrote:
>>
>>> Matthew Kent wrote:
>>>
>>>> Forwarding this on in case anyone else has seen this behaviour and has
>>>> some suggestions. I'll give it a run through valgrind and see if I can
>>>> spot anything this evening.
>>>>
>>>
>>> Thanks, Matt.
>>>
>>> A small update:
>>>
>>> After having run the daemon for about 10 hours on a test system, memory
>>> consumption has escalated from roughly 1MB to around 24MB. Not very
>>> nice figures. It seems that sending a HUP makes memory consumption
>>> take a small jump (usually around 20K).
>>
>>
>>
>> Well I may have trapped the HUP problem after some passes through
>> valgrind. Seems reset_variables was getting called twice, right after
>> receiving a sighup and immediately after at the start of the main do()
>> loop in nagios.c
>
>
> I'll get to testing right away.
>
>> I've removed the call to it from cleanup() as it's only called when
>> erroring out anyway, and resetting the variables at this point is a bit
>> of a lost cause ;)
>>
>> I also fixed a couple of other minor items reported by valgrind, although I
>> couldn't figure out this last one:
>>
>> 64 bytes in 8 blocks are definitely lost in loss record 66 of 118
>>    at 0x1B904EDD: malloc (vg_replace_malloc.c:131)
>>    by 0x808F4D4: xodtemplate_add_host_to_hostlist (xodtemplate.c:10665)
>>    by 0x808F456: xodtemplate_add_hostgroup_members_to_hostlist (xodtemplate.c:10640)
>>    by 0x808EF0E: xodtemplate_expand_hostgroups (xodtemplate.c:10434)
>>
>
> This shouldn't be the longstanding problem though, since NSCORE doesn't
> use xodtemplate_expand_hostgroups() on a regular basis. I'm leaning
> towards a very small and subtle in-struct leak in base/checks.c or
> common/statusdata.c (and their underlying functions, naturally),
> particularly since the problem seems to present itself more rapidly when
> hosts and services change status a lot (or possibly just change their
> plugin output).
>
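
The valgrind record quoted above works out to 8 bytes per lost block (64
bytes in 8 blocks), which would be consistent with small list nodes being
allocated during hostgroup expansion and never released, though that is an
inference from the trace and not something verified in xodtemplate.c. As a
rough sketch of the usual remedy, assuming a hypothetical singly linked
hostlist_sketch type and hypothetical helpers hostlist_add() and
hostlist_free() (none of these are the real xodtemplate names), a temporary
list is released with a single walk once expansion is done:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical temporary list node; not the real xodtemplate type. */
typedef struct hostlist_sketch {
    char *host_name;
    struct hostlist_sketch *next;
} hostlist_sketch;

/* Prepend a copy of name to the list.
 * Returns 0 on success, -1 on failure (the list is left unchanged). */
int hostlist_add(hostlist_sketch **list, const char *name)
{
    if (list == NULL || name == NULL)
        return -1;

    hostlist_sketch *node = malloc(sizeof(*node));
    if (node == NULL)
        return -1;

    size_t len = strlen(name) + 1;
    node->host_name = malloc(len);
    if (node->host_name == NULL) {
        free(node); /* do not leak the node when the name copy fails */
        return -1;
    }
    memcpy(node->host_name, name, len);

    node->next = *list;
    *list = node;
    return 0;
}

/* Free every node and its name, then reset the head pointer.
 * Returns the number of nodes freed. */
int hostlist_free(hostlist_sketch **list)
{
    int freed = 0;
    if (list == NULL)
        return 0;

    hostlist_sketch *node = *list;
    while (node != NULL) {
        hostlist_sketch *next = node->next; /* save before freeing */
        free(node->host_name);
        free(node);
        node = next;
        freed++;
    }
    *list = NULL;
    return freed;
}
```

Resetting the head pointer to NULL makes a second call to hostlist_free()
harmless, which matters if cleanup can run more than once per reload.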
--
Andreas Ericsson andreas.ericsson at op5.se
OP5 AB www.op5.se
Lead Developer
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: nagios-cvs-HUP-crash.diff
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20041130/81cd76bf/attachment.ksh>