Nagios active check performance limits

Gaspar, Carson Carson.Gaspar at gs.com
Fri Nov 9 20:46:55 CET 2007


I sent this to -users earlier, but it's probably more appropriate for
-devel.

FYI I just re-ran my tests with 3.0b6 and am seeing about the same results. 

(Disclosure - I'm giving an invited talk at LISA next week on Nagios)

I'm trying to determine the maximum active check rate that Nagios can
sustain on a given system. I'm rather surprised that I can't seem to get
more than about 3 checks per second. Am I just being dense? (I haven't used
anything but passive checks for years now, so my active check foo is
stale...)

Nagios config settings (testing with 2.9, which is what we're using in
production):

service_inter_check_delay_method=0.01 (tried s as well...)
service_interleave_factor=s
host_inter_check_delay_method=0.01 (tried s as well)
max_concurrent_checks=0
service_reaper_frequency=1
sleep_time=0.01

Performance stats (min / max / avg, in seconds) show service execution time
at 0 / .8 / .459 and service latency at 0 / 10467.99 / 4475.5599. That is
3 hours after launch, with only 32708 of the 50000 services checked.
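
As a sanity check on the "about 3 checks per second" figure, the rate can be recomputed from the numbers quoted above. This is only a back-of-the-envelope sketch: it assumes the "3 hours" is exact, that .459 is the average execution time in seconds, and that the system is in steady state (so Little's law applies to the number of checks in flight). The variable names are illustrative.

```python
# Figures quoted in the message; everything else is derived from them.
services_total = 50_000      # 5000 hosts x 10 services each
services_checked = 32_708    # services checked so far
elapsed_s = 3 * 3600         # "3 hours after launch" (assumed exact)
avg_exec_s = 0.459           # average service execution time, seconds

rate = services_checked / elapsed_s          # sustained checks per second
full_pass_h = services_total / rate / 3600   # hours to check every service once
in_flight = rate * avg_exec_s                # Little's law: L = lambda * W

print(f"sustained rate: {rate:.2f} checks/s")   # 3.03
print(f"full pass:      {full_pass_h:.2f} h")   # 4.59
print(f"avg in flight:  {in_flight:.2f}")       # 1.39
```

Under these assumptions, fewer than two checks are executing at any moment on average, so the scheduler is mostly idle rather than saturated with running plugins.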

I never see more than 9 nagios procs at once. The config is large (5000
hosts with 10 services each). Each service check is a simple C program that
sleeps for half a second and then returns OK.
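
For anyone wanting to reproduce a test config of this shape, a generator along the following lines should do. It is a sketch, not the config actually used: the host and service names, the loopback address, the `check_sleep` command name, and the `generic-host` / `generic-service` templates are all assumptions, and the templates are expected to supply the remaining required directives.

```python
def host_block(name, address="127.0.0.1"):
    """Return one Nagios host definition (template name is an assumption)."""
    return (
        "define host{\n"
        "    use        generic-host\n"
        f"    host_name  {name}\n"
        f"    address    {address}\n"
        "}\n"
    )


def service_block(host, description, command="check_sleep"):
    """Return one service definition; check_sleep is a hypothetical command."""
    return (
        "define service{\n"
        "    use                  generic-service\n"
        f"    host_name            {host}\n"
        f"    service_description  {description}\n"
        f"    check_command        {command}\n"
        "}\n"
    )


def generate(hosts=5000, services_per_host=10):
    """Yield object definitions for `hosts` hosts with N services each."""
    for h in range(1, hosts + 1):
        name = f"host{h:04d}"
        yield host_block(name)
        for s in range(1, services_per_host + 1):
            yield service_block(name, f"svc{s:02d}")


def render(hosts=5000, services_per_host=10):
    """Join all definitions into a single config-file string."""
    return "\n".join(generate(hosts, services_per_host))
```

Writing `render()` to a file and including that file from nagios.cfg gives 5000 hosts and 50000 services, provided `check_sleep` is defined as a command that runs the half-second sleep program.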

Is this really the best Nagios can do, or am I missing something?

--
Carson
