[PATCH] common/macros.c:2185:grab_standard_servicegroup_macro() speed up & Service check execution problem report

Stephane LAPIE stephane.lapie at darkbsd.org
Tue Jan 4 11:25:07 CET 2011


On 01/04/2011 06:38 PM, Andreas Ericsson wrote:
> http://www.op5.org/community/plugin-inventory/op5-projects/merlin
> http://git.op5.org/git/?p=nagios/merlin.git;a=blob;f=HOWTO;hb=master
> http://git.op5.org/git/?p=nagios/merlin.git;a=blob;f=README;hb=HEAD
> 
> Make especially sure you read the first paragraph of the README.

Oh, I see.

I have been running Nagios 3.2.0 for our monitoring setup since around
November 2009, so I went and implemented my own thing: ssh keys, a
control shell script with specific commands, configurations tuned to
limit duplication of files, and enhancements to the NSCA client/server
to make them workable.

I never tried to touch the DB side of things with a vanilla Nagios base,
because it wouldn't be proper to handle that as a side-hack, and would
most likely kill any measure of performance.

I'll have to give this a look :) Thanks a lot.

> Disable environment macros instead. If you're not using that macro on
> the command-line, your checks will continue to work. It's not a bug in
> Nagios, as such, it's just that environment variables and command line
> share memory space, and that space is limited. For your 300k+ list of
> servicegroup members, you exhaust that space very quickly, and check
> execution fails.

Oh, so THIS is why in most cases the script would not even be executed.
I would have expected the error to be more straightforward, or have a
hint pointing to it. :)

Anyhow, thanks for the explanation; it now makes perfect sense. I should
have realized environment space was not unlimited. I had never stumbled
upon a case where I used up all of the space provided for ENV before.
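
For reference, the limit in question can be queried at run time. The
snippet below is only a rough sketch, not anything Nagios does:
env_entry_fits() is a hypothetical helper name, and it only checks one
"NAME=value" entry against sysconf(_SC_ARG_MAX), the POSIX limit on the
combined size of arguments and environment passed to exec. The real
budget is smaller, since every other variable and argument counts too.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: would a single "NAME=value" entry of the given
 * lengths fit under the exec() argument+environment limit on its own?
 * Returns 1 if it fits, 0 if it does not, -1 if the limit is unknown.
 */
int env_entry_fits(size_t name_len, size_t value_len)
{
    long limit;
    size_t need;

    assert(name_len > 0); /* a variable name cannot be empty */

    limit = sysconf(_SC_ARG_MAX);
    if (limit < 0)
        return -1; /* indeterminate on this system */

    /* name + '=' + value + terminating NUL */
    need = name_len + 1 + value_len + 1;
    return need < (size_t)limit;
}
```

A check like this passing does not guarantee the exec succeeds, but a
300k+ member list failing it would at least explain why the script is
never started.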

>> 2) A performance problem : The MACRO_SERVICEGROUPMEMBERS code is
>> painfully slow and extremely costly in CPU performance. The attached
>> patch file is my attempt at fixing the most obvious issues :
>>   - Repetitive malloc/realloc (I initially caught on to this by ktrace-ing
>> the processes and realizing Nagios was mapping/unmapping a lot of memory).
>>   - Repetitive string duplications and length calculations
>>
>> The above code has been tested for a few hours on a busy Nagios setup
>> and performs much faster, as expected. (Several thousand malloc/realloc
>> calls are reduced to 1 by initially calculating the memory size to be
>> allocated, thus avoiding unneeded system calls and duplication of
>> memory areas.)
>>
> 
> Nice patch. I'll apply it tomorrow when it's my Nagios day. Any chance
> you could whip up something similar for HOSTGROUPMEMBERS until then?

Sure, please check out the attached file. It works on the same principle
as my previous patch, so apart from the sprintf() arguments it's nearly
a copy/paste. I ran it against my configuration for an hour or so as a
test, and it seems to be doing fine so far.
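
For anyone reading the thread without opening the attachment, the
principle can be sketched roughly as follows. This is not the patch
itself: struct member and join_members() are hypothetical names, the
real code walks Nagios's own member lists and uses sprintf(), and
memcpy() is swapped in here to keep the example short. It assumes the
comma-separated host,description layout of the servicegroup case; the
hostgroup case would simply drop the description.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical member record; Nagios's real structures differ. */
struct member {
    const char *host_name;
    const char *description;
};

/*
 * Build "host1,desc1,host2,desc2,..." with exactly one malloc():
 * pass 1 sums the lengths, pass 2 copies into the buffer.
 * Returns NULL on allocation failure; the caller frees the result.
 */
char *join_members(const struct member *m, size_t n)
{
    size_t total = 1; /* terminating NUL */
    size_t i;
    char *buf, *p;

    /* Pass 1: host + ',' + description, plus ',' between entries. */
    for (i = 0; i < n; i++) {
        total += strlen(m[i].host_name) + 1 + strlen(m[i].description);
        if (i + 1 < n)
            total += 1;
    }

    buf = malloc(total);
    if (buf == NULL)
        return NULL;

    /* Pass 2: append at a moving cursor, so the buffer is never rescanned. */
    p = buf;
    for (i = 0; i < n; i++) {
        size_t hl = strlen(m[i].host_name);
        size_t dl = strlen(m[i].description);

        memcpy(p, m[i].host_name, hl);
        p += hl;
        *p++ = ',';
        memcpy(p, m[i].description, dl);
        p += dl;
        if (i + 1 < n)
            *p++ = ',';
    }
    *p = '\0';

    /* The cursor must land exactly on the last byte that was sized. */
    assert((size_t)(p - buf) + 1 == total);
    return buf;
}
```

The trade-off is that each string is measured twice, once per pass, but
that is still far cheaper than growing the buffer with realloc() and
rescanning it for every member.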


Again, thanks a lot for your time.
-- 
Stephane LAPIE, EPITA SRS, Promo 2005
"Even when they have digital readouts, I can't understand them."
--MegaTokyo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch-common-macros.c
Type: text/x-csrc
Size: 4743 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20110104/75b65f9e/attachment.c>