Nagios 3 Performance Monitoring

Hendrik Bäcker andurin at process-zero.de
Mon Nov 12 14:08:44 CET 2007


Hi List,

I just want to say: the latest CVS code performs well.

After the latest macro fixes and memory leak fixes my system works well
with all instances running, ePN off, large_installation_tweaks on, and
the rescheduling window on.
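
For readers who want to compare their own setup, here is a minimal sketch
that checks a main configuration file for the three settings named above.
The mapping to directive names is my assumption: enable_embedded_perl and
use_large_installation_tweaks are standard Nagios 3 directives, while
auto_reschedule_checks is only my reading of "rescheduling window on". The
helper names and the sample text are hypothetical.

```python
# Hypothetical helper: the directive names below are assumed to match the
# settings named in the mail; check them against your own nagios.cfg.
EXPECTED = {
    "enable_embedded_perl": "0",           # "ePN off"
    "use_large_installation_tweaks": "1",  # "large_installation_tweaks on"
    "auto_reschedule_checks": "1",         # assumed meaning of "rescheduling window on"
}


def parse_main_config(text):
    """Parse key=value lines of a nagios.cfg-style file into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def mismatches(settings, expected=EXPECTED):
    """Return {directive: (found, wanted)} for every setting that differs."""
    return {
        key: (settings.get(key), wanted)
        for key, wanted in expected.items()
        if settings.get(key) != wanted
    }


# Made-up excerpt, only to demonstrate the check.
SAMPLE = """
# excerpt of a main configuration file
enable_embedded_perl=0
use_large_installation_tweaks=1
auto_reschedule_checks=0
"""

print(mismatches(parse_main_config(SAMPLE)))
```

With the made-up sample, only auto_reschedule_checks is reported as
differing. To check a real instance, pass the contents of its nagios.cfg to
parse_main_config() instead.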

Result:

Average service latency is at 1.5 seconds in the worst case, and has been
for 25 hours.

My performance graphs are well balanced, memory usage is OK, and system
load is between 3.0 and 7.0 (with Nagios 2.x the load is much higher, 12.0
or 15.0).

-
Hendrik

Hendrik Bäcker wrote:
> Hi List,
> 
> ### Now the complete mail ###
> 
> For a few days I have been testing the performance of Nagios 3
> (current CVS version).
> 
> For nicer graphing I've written a quick and dirty Perl script that parses
> some relevant data from the output of the nagiostats binary.
> 
> Output of the plugin is:
> 
> 1. STDOUT: OK - output | perfdata
> 2. (optional) Output + performance data printed directly to the external
> command pipe of Nagios.
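
The two output paths described above can be sketched roughly as follows.
This is not the Perl script from the mail, only a Python illustration under
stated assumptions: it relies on the MRTG mode of nagiostats (--mrtg with
--data, one value per line by default), and the binary path, config path,
host name and service description are placeholders I made up.

```python
import subprocess
import time

# nagiostats MRTG variables to graph (these are reported in milliseconds).
VARIABLES = ["AVGACTSVCLAT", "AVGACTSVCEXT", "AVGACTHSTLAT", "AVGACTHSTEXT"]


def read_nagiostats(binary="/usr/local/nagios/bin/nagiostats",
                    config="/usr/local/nagios/etc/nagios.cfg"):
    """Run nagiostats in MRTG mode; both paths are assumptions."""
    result = subprocess.run(
        [binary, "-c", config, "--mrtg", "--data=" + ",".join(VARIABLES)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def parse_mrtg(text, names=VARIABLES):
    """Pair each requested variable with the value printed on its own line."""
    values = [line.strip() for line in text.splitlines() if line.strip()]
    if len(values) != len(names):
        raise ValueError("expected %d values, got %d" % (len(names), len(values)))
    return {name: int(value) for name, value in zip(names, values)}


def plugin_output(stats):
    """Format 'OK - text | perfdata' as a plugin prints it on STDOUT."""
    perfdata = " ".join("%s=%dms" % (k.lower(), v) for k, v in stats.items())
    text = "avg service latency %d ms" % stats["AVGACTSVCLAT"]
    return "OK - %s | %s" % (text, perfdata)


def passive_result(host, service, output, now=None):
    """Build a PROCESS_SERVICE_CHECK_RESULT line for the external command pipe."""
    stamp = int(time.time() if now is None else now)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;0;%s\n" % (
        stamp, host, service, output)


# Made-up sample of what read_nagiostats() might return.
SAMPLE = "1500\n320\n40\n210\n"
line = plugin_output(parse_mrtg(SAMPLE))
print(line)
print(passive_result("nagios1", "Nagios Performance", line), end="")
```

In a real setup the string from passive_result() would be appended to the
file named by the command_file directive, and read_nagiostats() would
replace the sample text.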
> 
> I am running a relatively large installation with up to 5 instances (for
> load balancing) on a single hardware server (yes, that works).
> 
> Some background data:
> 
> Instance 1: 371 / 2156 (Hosts/Services)
> Instance 2: 206 / 1405 (Hosts/Services)
> Instance 3: 381 / 3147 (Hosts/Services)
> Instance 4:   3 /   54 (Hosts/Services)
> Instance 5: 299 / 3233 (Hosts/Services)
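
To put the table in perspective, a few lines of Python add up the load on
the single server. The figures are copied from the table; the variable
names are mine.

```python
# (hosts, services) per instance, copied from the table in the mail.
INSTANCES = {
    1: (371, 2156),
    2: (206, 1405),
    3: (381, 3147),
    4: (3, 54),
    5: (299, 3233),
}

total_hosts = sum(hosts for hosts, _ in INSTANCES.values())
total_services = sum(services for _, services in INSTANCES.values())
print("total: %d hosts / %d services" % (total_hosts, total_services))

# Services per host shows how unevenly the instances are loaded.
for number, (hosts, services) in INSTANCES.items():
    print("instance %d: %.1f services per host" % (number, services / hosts))
```

That comes to 1260 hosts and 9995 services across the five instances.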
> 
> I have enabled the "use_large_installation_tweaks" feature for all
> instances and was really happy to see that I had _no_ latency at all.
> 
> But after 7-9 hours of running time I see that the host/service check
> throughput goes down, the host/service check execution time goes up (x2.5)
> and latency comes up too.
> 
> Once latency sets in, the graph shows no sign of leveling off. It went
> up to 700 seconds for my fifth instance, and I guess it would have kept
> increasing if I had not restarted the Nagios process.
> 
> ############################################################
> If you are interested, you can see the graphs at:
> 
> http://www.process-zero.de/nagios3/nagiosperformance-20071026-1607.pdf
> 
> The "plugin" I've written for this is at:
> 
> http://www.my-plugin.de/wiki/doku.php/projects:check_nagios_performance
> 
> (It's not polished enough to be a 'real' plugin, so there is no reason to
> post it on nagiosexchange.org yet).
> ############################################################
> 
> Back to the problem.
> 
> I guess the 'performance trouble' is a problem that builds up at runtime.
> So I am looking for tasks in the code whose cost grows over time; my
> current guess is update_check_stats() in base/utils.c, which is
> executed on every service check and, I think, more than once for every
> host check.
> 
> My idea is that after a while the data structure for the stats grows to a
> size that takes too long to update, and therefore the execution time
> increases.
> A higher execution time leads to fewer host/service checks per unit of
> time, which leads to more latency, but this is just a guess.
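
The chain "higher execution time, fewer checks, more latency" can be
illustrated with a toy backlog model. It is not how the Nagios scheduler is
implemented, only a sketch: 3233 services and the factor 2.5 come from the
mail, while the 300 second check interval, the 10 parallel check slots and
the 0.5 second base execution time are assumptions I picked for
illustration.

```python
def simulate(services, interval_s, slots, exec_time_s, seconds):
    """Return the estimated latency (s) after `seconds` of steady load.

    Each second `services / interval_s` checks become due and at most
    `slots / exec_time_s` checks can finish; whatever is left over waits.
    """
    arrival = services / interval_s   # checks becoming due per second
    capacity = slots / exec_time_s    # checks that can complete per second
    backlog = 0.0
    for _ in range(seconds):
        backlog += arrival
        backlog -= min(backlog, capacity)
    # Time needed to work off the queue approximates the check latency.
    return backlog / capacity


# Assumed values: 300 s interval, 10 slots, 0.5 s base execution time.
healthy = simulate(3233, 300, 10, 0.5, 3600)
degraded = simulate(3233, 300, 10, 0.5 * 2.5, 3600)
print("healthy: %.1f s, degraded: %.1f s" % (healthy, degraded))
```

As long as capacity stays above the arrival rate the backlog is worked off
every second and latency stays at zero. Once the execution time pushes
capacity below the arrival rate, the backlog grows linearly and never
recovers on its own, which would match a latency graph that keeps climbing
until the process is restarted.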
> 
> I would like to know what other people think about this, and it would be
> nice if there are other people out there who are able to produce some
> nice graphs about the performance of Nagios 3.
> 
> Kind regards,
> 
> Hendrik
> 
> PS. Sorry, sorry, sorry for my fast fingers on my last attempt to send to
> this list ;)
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?  Stop.
> Now Search log events and configuration files using AJAX and a browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> Nagios-devel mailing list
> Nagios-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-devel
> 
