[naemon-dev] Naemon daemon memory utilization

Jason Cook jasonc at liquidgravity.com
Mon Jun 9 14:17:34 CEST 2014


On Jun 9, 2014, at 4:43 AM, Andreas Ericsson <ageric79 at gmail.com> wrote:

> Heya Jason. Long time no see. How's things?
> 
> On 2014-06-05 16:19, Jason Cook wrote:
>> 
>> On May 22, 2014, at 2:42 AM, Sven Nierlein <Sven.Nierlein at Consol.de> wrote:
>> 
>>> On 13/05/14 15:59, Jason Cook wrote:
>>>>>> On 07/05/14 16:37, Jason Cook wrote:
>>>>>>> Yep, the Naemon core process is definitely the one that grows, and it's 100% reproducible for me. It grows to the max available memory on the box, then gets OOM killed. It doesn't happen when mod_gearman isn't enabled. I've seen that it may also be happening with Nagios 4, but haven't tested that myself.
>>>>>>> 
>>>>>>> Test environment is RHEL 6u4.
>>>>>>> 
>>>>>>> The valgrind log wasn't mine, but we seem to have very similar setups.
>>>>>>> 
>>>>>> Could you try the latest version of mod-gearman? I fixed some potential memory leaks which may occur in case
>>>>>> of connection errors.
>>>>>> 
>>>> Looks like it’s still swelling after ~19 hours.
>>> 
>>> I found another memory leak. It seems the way check results are freed has changed, so mod-gearman has to do that by itself now.
>>> Could you try the latest git HEAD of mod-gearman? In my tests, memory usage was constant over the last 12 hours.
>>> 
>>> Sven
>> 
>> Just to follow up on this, it’s a lot better, though still happening (albeit much, much slower)…
>> 
>> nagios   22629  3.6 18.9 2208276 1523904 ?     Ssl  May30 317:59 /usr/bin/naemon -d /etc/naemon/naemon.cfg
>> 
>> After running for nearly a week, it’s at ~1.5GB memory usage… Here it is in a 60 second snapshot..
>> 
>> nagios   22629  3.6 18.9 2208276 1524700 ?     Ssl  May30 318:10 /usr/bin/naemon -d /etc/naemon/naemon.cfg
>> nagios   22629  3.6 18.9 2208276 1524916 ?     Ssl  May30 318:12 /usr/bin/naemon -d /etc/naemon/naemon.cfg
>> 
>> Growing very, very slowly, but still growing.
>> 
> 
> That looks like a small-ish string or a container for something is
> being leaked continuously. Are you using a lot of on-demand macros,
> or custom object variables?
> 
> I'm trying to think of things we may have overlooked when running
> valgrind tests here. Normally, naemon doesn't leak at all, but it
> seems we haven't tested every possible feature in a long-running
> system.
> 
> Worst case scenario, memory is lost due to fragmentation, but it eats
> RAM a little bit too fast for it to be that.
> 
> /Andreas

No on-demand macros or custom object variables. Our configs are really, really straightforward. This example is a small-ish config, with about 1,300 hosts and 11,000 service objects.
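
For what it's worth, here is a rough sketch of the arithmetic on the two ps samples quoted above. It assumes they really were taken 60 seconds apart and that the RSS column (6th field of `ps aux`) is in KiB, as it is on Linux. The helper names are made up for illustration; nothing here is part of Naemon or mod-gearman.

```python
# The two `ps aux` samples quoted above (assumed to be ~60 seconds apart).
SAMPLES = [
    "nagios   22629  3.6 18.9 2208276 1524700 ?     Ssl  May30 318:10 /usr/bin/naemon -d /etc/naemon/naemon.cfg",
    "nagios   22629  3.6 18.9 2208276 1524916 ?     Ssl  May30 318:12 /usr/bin/naemon -d /etc/naemon/naemon.cfg",
]


def rss_kib(ps_line):
    """Return the RSS column of a `ps aux` line (6th field, in KiB)."""
    return int(ps_line.split()[5])


def growth_kib_per_hour(first, second, interval_seconds):
    """Extrapolate the RSS difference between two samples to an hourly rate."""
    delta_kib = rss_kib(second) - rss_kib(first)
    return delta_kib * 3600.0 / interval_seconds


if __name__ == "__main__":
    rate = growth_kib_per_hour(SAMPLES[0], SAMPLES[1], 60)
    print("RSS growth: %.0f KiB/hour (~%.0f MiB/day)" % (rate, rate * 24 / 1024))
```

The difference between the two samples is 216 KiB, which extrapolates to roughly 300 MiB per day. That is in the same ballpark as the ~1.5GB seen after about a week, though two samples are obviously too few to say whether the growth is steady.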

