Nagios performance query

Thomas Stocking tstocking at groundworkopensource.com
Wed Nov 21 18:35:31 CET 2007


Carson,
I don't see any other responses to this thread, but after seeing your
talk, I'll chime in.
My experience leads me to believe that if:
    - you define 3000 or fewer services,
    - you set service_inter_check_delay_method=s,
    - you set service and host check timeouts to 10 seconds, and
    - you keep the other settings as you have them here,
then you can expect to process about 500 active checks per minute, or
about 8.3 per second. With 3000 services, that means a 6 minute service
check interval is about the shortest you can expect. We use 10 minutes
by default.
Average service check latency should stay at 10 seconds or less in this
situation.
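
To make the arithmetic behind those figures explicit, here is a rough
sketch in Python. The function names are invented for illustration and
are not part of Nagios; the inputs are the numbers from this thread, and
the 500 checks per minute rate is an estimate from experience, not a
guaranteed limit.

```python
def checks_per_second(checks_per_minute: float) -> float:
    """Convert a per-minute check rate into a per-second rate."""
    return checks_per_minute / 60.0


def min_check_interval_minutes(num_services: int, checks_per_minute: float) -> float:
    """Shortest interval at which every service can be checked once,
    assuming the scheduler sustains checks_per_minute (an assumption)."""
    return num_services / checks_per_minute


def observed_rate_per_second(checks_done: int, elapsed_seconds: float) -> float:
    """Average check rate actually achieved over an elapsed period."""
    return checks_done / elapsed_seconds


# Figures from this message: 3000 services at roughly 500 checks per minute.
print(checks_per_second(500))
print(min_check_interval_minutes(3000, 500))

# Figures from the quoted message: 32708 checks completed in about 3 hours.
print(observed_rate_per_second(32708, 3 * 3600))
```

The last line reproduces the roughly 3 checks per second reported in the
quoted message, and the same helper shows that 50000 services at 500
checks per minute would need about 100 minutes per full cycle.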
Caveats, of course:
Are there active host checks going on, or is the system steady state?
Are you doing anything with performance data?
What else is running on the machine?
What is the hardware (not that this seems to matter as much as it should
with Nagios)?

All these can affect throughput and latency.

We are working on this problem and testing several enhancements to the
GroundWork packaging of Nagios that we plan to release in the next few
months. We are also reviewing the Nagios 3 code and contributing
everything we find that could help.

Your talk was insightful, and inspired our engineers to look at some of
your methods. Nice work.
    Thomas


Carson Gaspar wrote:
> (Disclosure - I'm giving an invited talk at LISA next week on Nagios)
>
> I'm trying to determine the maximum active check rate that Nagios can 
> sustain on a given system. I'm rather surprised that I can't seem to get 
> more than about 3 checks per second. Am I just being dense? (I haven't 
> used anything but passive checks for years now, so my active check foo 
> is stale...)
>
> Nagios config settings (testing with 2.9, which is what we're using in 
> production):
>
> service_inter_check_delay_method=0.01 (tried s as well...)
> service_interleave_factor=s
> host_inter_check_delay_method=0.01 (tried s as well)
> max_concurrent_checks=0
> service_reaper_frequency=1
> sleep_time=0.01
>
> performance stats show svc execution at 0 / .8 / .459, svc latency at 0 
> / 10467.99 / 4475.5599 (3 hours after launch, with only 32708 / 50000 
> services checked)
>
> I never see more than 9 nagios procs at once. The config is large (5000 
> hosts with 10 services each). The service checks are a simple c program 
> that sleeps for half a second then returns OK.
>
> Is this really the best Nagios can do, or am I missing something?
>
>   
