Nagios active check performance limits

Ethan Galstad nagios at nagios.org
Sun Nov 11 00:37:18 CET 2007


Gaspar, Carson wrote:
> After sending this I discovered the joy of the new 3.x parameter
> use_large_installation_tweaks...
> 
> Nagios is now eating all 4 of my processors, as requested. I'm not sure
> _why_, as without this setting my server was practically idle...

How many checks are you executing per second to keep 4 processors 
warm?  Can you provide us with any details on the hardware as well?  I'm 
interested to see how performance is on different systems.

BTW, you can also (as of 3.0b6) disable the new 
"enable_environment_macros" option if you don't reference env. vars to 
get macro values in your scripts.  It's a huge waste of time for most people.
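
To make "reference env. vars to get macro values" concrete, here is a minimal sketch of a check script that would be affected. It assumes the usual NAGIOS_-prefixed naming for exported macros (NAGIOS_HOSTADDRESS for $HOSTADDRESS$); the function names and the command-line fallback are hypothetical, for illustration only.

```python
import os
import sys

OK, UNKNOWN = 0, 3  # conventional plugin return codes


def resolve_host_address(environ, argv):
    """Return the host address from the environment macro, else from argv.

    Reading NAGIOS_HOSTADDRESS only works while environment macros are
    exported; passing $HOSTADDRESS$ as an argument works either way.
    """
    address = environ.get("NAGIOS_HOSTADDRESS")
    if address:
        return address
    if len(argv) > 1:
        return argv[1]
    return None


def run_check(environ, argv):
    """Return (exit_code, message) without touching the network."""
    address = resolve_host_address(environ, argv)
    if address is None:
        return UNKNOWN, "UNKNOWN - no host address in environment or arguments"
    return OK, f"OK - would check {address}"


code, message = run_check(os.environ, sys.argv)
print(message)
```

A script written like this keeps working with the option disabled only if the command definition passes the address as an argument; scripts that rely solely on the environment lookup are the ones that need the option left on.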

> 
> I'm now seeing about 13.5 checks/sec, much closer to what I expected.
> 
> Of course passive checks are processed at 1052/sec, at <25% CPU (which is
> why I use them!)
> 
>> -----Original Message-----
>> From: nagios-devel-bounces at lists.sourceforge.net 
>> [mailto:nagios-devel-bounces at lists.sourceforge.net] On Behalf 
>> Of Gaspar, Carson
>> Sent: Friday, November 09, 2007 11:47 AM
>> To: 'nagios-devel at lists.sourceforge.net'
>> Subject: [Nagios-devel] Nagios active check performance limits
>>
>> I sent this to -users earlier, but it's probably more 
>> appropriate for -devel.
>>
>> FYI I just re-ran my tests with 3.0b6 and am seeing about the 
>> same results. 
>>
>> (Disclosure - I'm giving an invited talk at LISA next week on Nagios)
>>
>> I'm trying to determine the maximum active check rate that 
>> Nagios can sustain on a given system. I'm rather surprised 
>> that I can't seem to get more than about 3 checks per second. 
>> Am I just being dense? (I haven't used anything but passive 
>> checks for years now, so my active check fu is stale...)
>>
>> Nagios config settings (testing with 2.9, which is what we're using in
>> production):
>>
>> service_inter_check_delay_method=0.01 (tried s as well...) 
>> service_interleave_factor=s
>> host_inter_check_delay_method=0.01 (tried s as well) 
>> max_concurrent_checks=0
>> service_reaper_frequency=1
>> sleep_time=0.01
>>
>> performance stats show svc execution at 0 / .8 / .459, svc latency at
>> 0 / 10467.99 / 4475.5599 (3 hours after launch, with only 32708 / 50000
>> services checked)
>>
>> I never see more than 9 nagios procs at once. The config is 
>> large (5000 hosts with 10 services each). The service checks 
>> are a simple c program that sleeps for half a second then returns OK.
>>
>> Is this really the best Nagios can do, or am I missing something?
>>
>> --
>> Carson
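
The figures quoted above can be sanity-checked with some quick arithmetic. This is only a sketch: the observed rate comes from the 32708 checks completed in 3 hours, while the ceiling assumes (the thread does not say this) that all 9 observed processes were running the 0.5-second check at the same time. The function names are illustrative.

```python
def observed_rate(checks_done, elapsed_seconds):
    """Average checks per second over the measurement window."""
    return checks_done / elapsed_seconds


def concurrency_ceiling(concurrent_checks, check_duration_seconds):
    """Best-case checks per second if every slot is always busy."""
    return concurrent_checks / check_duration_seconds


def full_pass_hours(total_services, rate):
    """Hours needed to check every service once at the given rate."""
    return total_services / rate / 3600


rate = observed_rate(32708, 3 * 3600)
ceiling = concurrency_ceiling(9, 0.5)
print(f"observed: {rate:.2f} checks/sec")
print(f"ceiling with 9 concurrent 0.5 s checks: {ceiling:.0f} checks/sec")
print(f"one pass over 50000 services: {full_pass_hours(50000, rate):.1f} hours")
```

The observed rate works out to roughly 3 checks per second, matching the figure reported, and it sits well below what 9 concurrent half-second checks could deliver under the stated assumption. That gap is consistent with the very large latency numbers: checks are waiting to be scheduled rather than taking long to run.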


Ethan Galstad
Nagios Developer
___
Email: nagios at nagios.org
Web:   www.nagios.org
