serial execution of hosts and service checks problem

Scott Behrens behrens at mcs.anl.gov
Wed Mar 30 20:15:25 CEST 2005


I had a latency problem a while back and was unable to fix it.  I 
recently set up a test environment with 50 bogus hosts, each using ping 
as both the service check and the host check.  It seems that when a host 
is down, the checks do not run in parallel; they run serially, each one 
waiting until the check times out.  I had a similar problem with another 
setup, which was in production:

I am seeing roughly 3354 seconds of latency per check, and I am not sure 
why.
Total services:   1812
Total hosts:      175

Metric                   Min.       Max.       Average
Check Execution Time:    < 1 sec    6 sec      0.345 sec
Check Latency:           2967 sec   3859 sec   3748.046 sec
Percent State Change:    0.00%      0.00%      0.00%


I'm mainly concerned about multiple hosts going down on my network, 
because the time needed to complete the checks then becomes extremely 
long.  For example, in the monitoring setup above it would take roughly 
30 minutes to be notified of a service failure, due to the serialization 
of checks.  Does anyone have any suggestions?
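
For a rough sense of scale, here is a minimal sketch of how serialized 
checks add up compared with checks run in parallel.  The 10-second 
timeout and the worker count of 10 are assumptions made for illustration 
only, not values from my configuration, and the function names are 
hypothetical:

```python
import math


def serial_delay(num_checks, seconds_per_check):
    """Seconds until the last check finishes when checks run one at a time."""
    return num_checks * seconds_per_check


def parallel_delay(num_checks, seconds_per_check, workers):
    """Seconds until the last check finishes with `workers` checks in flight."""
    batches = math.ceil(num_checks / workers)
    return batches * seconds_per_check


# 50 down hosts as in the test environment; the timeout is an assumed value.
down_hosts = 50
assumed_timeout = 10  # seconds, hypothetical

print(serial_delay(down_hosts, assumed_timeout))        # 500
print(parallel_delay(down_hosts, assumed_timeout, 10))  # 50
```

Under these assumed numbers, 50 down hosts checked one after another 
would hold up everything behind them for 500 seconds, versus 50 seconds 
with 10 checks in flight at once.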

-- 
Scott Behrens  
Network and Systems Staff
Room B-244 1-630-252-4198
Mathematics and Computer Science Division
Argonne National Laboratory 
