Concurrent Service Check Execution

David Knecht david.knecht at anyweb.ch
Sat Sep 8 21:20:34 CEST 2007


I'd like to force Nagios 2.9 to execute all service checks on a given 
monitored system *concurrently* (in hard OK states as well as in 
hard non-OK states). My goal is to see how my services behave on a 
particular monitored system *at one single point in time*.

Let me clarify this:

Service checks on monitored system A:
Service check cycle n:
Execution of service check A1 ("check process 1"): 00h:00m:00s
Execution of service check A2 ("check process 2"): 00h:00m:00s
Service check cycle n+1:
Execution of service check A1 ("check process 1"): 00h:05m:00s
Execution of service check A2 ("check process 2"): 00h:05m:00s
...

Service checks on monitored system B:
Service check cycle n:
Execution of service check B1 ("check process 1"): 00h:00m:20s
Execution of service check B2 ("check process 2"): 00h:00m:20s
Service check cycle n+1:
Execution of service check B1 ("check process 1"): 00h:05m:20s
Execution of service check B2 ("check process 2"): 00h:05m:20s
...

Service checks on monitored system C:
Service check cycle n:
Execution of service check C1 ("check process 1"): 00h:01m:49s
Execution of service check C2 ("check process 2"): 00h:01m:49s
Service check cycle n+1:
Execution of service check C1 ("check process 1"): 00h:06m:51s
Execution of service check C2 ("check process 2"): 00h:06m:51s
...

--> As can be seen here, service checks A1 and A2 are executed 
concurrently. The same applies to B1/B2 and C1/C2.
--> It doesn't matter very much when these service checks are executed as 
long as they are executed concurrently.
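
The timing pattern above can be sketched in a few lines of Python. This is 
only an illustration of the schedule I am after, not Nagios code: the 
function name desired_schedule, the per-host offsets and the fixed 
300-second interval are assumptions read off the example times (system C 
drifts by a couple of seconds in the listing, which the sketch ignores).

```python
from datetime import timedelta

def desired_schedule(host_offsets, services, interval=300, cycles=2):
    """Return {host: [(cycle, service, start_time), ...]}.

    Every service on a host shares that host's start time within a cycle;
    only the per-host offset differs between hosts.
    """
    schedule = {}
    for host, offset in host_offsets.items():
        rows = []
        for cycle in range(cycles):
            # One start time per host and cycle, reused for all its services.
            start = timedelta(seconds=offset + cycle * interval)
            for service in services:
                rows.append((cycle, service, start))
        schedule[host] = rows
    return schedule

if __name__ == "__main__":
    # Offsets in seconds, approximating the example (hypothetical values).
    offsets = {"A": 0, "B": 20, "C": 109}
    services = ["check process 1", "check process 2"]
    for host, rows in desired_schedule(offsets, services).items():
        for cycle, service, start in rows:
            print(f"{host} cycle n+{cycle} {service}: {start}")
```

The point of the sketch is that the start time depends only on the host 
and the cycle, never on the individual service.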

According to http://nagios.sourceforge.net/docs/2_0/checkscheduling.html 
and http://nagios.sourceforge.net/docs/2_0/images/noninterleaved1.png 
non-interleaved checks come closest to what I want. It seems, though, 
that service check execution gets a bit random the longer Nagios is running:

"Even though service checks are initially scheduled to balance the load 
on both the local and remote hosts, things will eventually give in to 
the ensuing chaos and be a bit random. Reasons for this include the fact 
that services are not all checked at the same interval, some services 
take longer to execute than others, host and/or service problems can 
alter the timing of one or more service checks, etc. At least we try to 
get things off to a good start. Hopefully the initial scheduling will 
keep the load on the local and remote hosts fairly balanced as time goes 
by..."

"Scheduling Delays: It should be noted that service check scheduling and 
execution is done on a best effort basis. Individual service checks are 
considered to be low priority events in Nagios, so they can get delayed 
if high priority events need to be executed. Examples of high priority 
events include log file rotations, external command checks, and service 
reaper events. Additionally, host checks will slow down the execution 
and processing of service checks."

Having mentioned all this, I assume that concurrent service checks as 
outlined above cannot be configured in either Nagios 2.9 or 3.0. Am I 
missing anything here? Is there a workaround?

--> As a workaround, it might be acceptable if service check A2 gets 
executed ~2-5 seconds after A1. Is it possible to enforce such behaviour?
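
One possible direction, sketched under assumptions and not verified on 
Nagios 2.9: a single wrapper starts all plugins for one host at (nearly) 
the same moment and formats each result as a PROCESS_SERVICE_CHECK_RESULT 
external command for passive submission. The helper names run_check and 
run_concurrently are hypothetical, and writing the lines to the Nagios 
external command file is left out because its path is site-specific.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

def run_check(command):
    """Run one plugin command; return (exit_code, first line of its output)."""
    proc = subprocess.run(command, capture_output=True, text=True)
    lines = proc.stdout.strip().splitlines()
    return proc.returncode, lines[0] if lines else ""

def run_concurrently(host, checks):
    """Start all checks for one host together.

    `checks` maps a service description to an argv list (assumed layout).
    Returns one passive-result line per service, all with the same timestamp.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        # Submit everything first so the plugins overlap in time.
        futures = {name: pool.submit(run_check, argv)
                   for name, argv in checks.items()}
        stamp = int(time.time())
        results = []
        for name, future in futures.items():
            code, output = future.result()
            results.append(
                f"[{stamp}] PROCESS_SERVICE_CHECK_RESULT;{host};{name};{code};{output}"
            )
        return results
```

Whether passive results submitted this way fit the rest of a given setup 
(freshness checking, notifications) is a separate question that the sketch 
does not address.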

Thanks, David

