latency problem

Olivier JAN ojan at gfi.fr
Thu Sep 25 10:06:24 CEST 2008


Hi list,

I am seeing latency problems that I can't explain. Here's the story.

Nagios 3.0.3 on Ubuntu 8.0.4. The hardware is an Intel quad core with
10 GB of RAM and fast disks. I have 24524 services on 1654 hosts to
check. Services are mostly active-passive with a check interval of
6 hours. check_interval for hosts is 0, so Nagios only checks them on
demand. No broker module is loaded. ocsp and ochp are enabled because
this server is part of a distributed system. Nagios debugging is
enabled. Here are some of my configuration options.

service_inter_check_delay_method=s
service_interleave_factor=s
host_inter_check_delay_method=s
max_concurrent_checks=0
max_service_check_spread=240
check_result_reaper_frequency=2
max_check_result_reaper_time=30

I tried to "play" with those options without success. Latency keeps
growing whatever I try. What is strange is that the performance
screen shows:
Metric                  Min.       Max.           Average
Check Execution Time:   0.00 sec   15.01 sec      0.888 sec
Check Latency:          0.00 sec   10191.24 sec   4924.060 sec
Percent State Change:   0.00%      18.36%         0.24%
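
To cross-check the numbers on the performance screen, the same kind of summary can be recomputed from the per-service values. The sketch below is only an illustration: it assumes the status file contains `servicestatus { ... }` blocks with a `check_latency=` line, and the sample text (including the second service name and all latency values) is made up rather than taken from my server.

```python
def latency_summary(status_text):
    """Return (min, max, average) of check_latency over servicestatus blocks.

    Assumption: blocks look like "servicestatus {" ... "}" with one
    key=value pair per line. Returns None if no latency value is found.
    """
    latencies = []
    in_service = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("servicestatus") and line.endswith("{"):
            in_service = True
        elif line == "}":
            in_service = False
        elif in_service and line.startswith("check_latency="):
            latencies.append(float(line.split("=", 1)[1]))
    if not latencies:
        return None
    return min(latencies), max(latencies), sum(latencies) / len(latencies)


# Hypothetical sample data, for illustration only.
sample = """
servicestatus {
    host_name=SERVER
    service_description=PRINT_ERROR
    check_latency=0.250
    }
servicestatus {
    host_name=SERVER
    service_description=OTHER
    check_latency=1.750
    }
hoststatus {
    host_name=SERVER
    check_latency=9.000
    }
"""

print(latency_summary(sample))  # host blocks are ignored on purpose
```

Running this against the real status file instead of `sample` would show whether the high average comes from a few services or from all of them.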

and in the scheduling queue, at 9:45, I see:

Host     Service       Last Check            Next Check            Type     Active Checks
SERVER   PRINT_ERROR   25-09-2008 03:45:32   25-09-2008 09:45:32   Normal   ENABLED
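
As a quick sanity check on that queue entry, the gap between the last and the next check can be computed from the two timestamps. This is a minimal sketch using only the values shown above; the variable names are mine.

```python
from datetime import datetime

# Day-month-year format, as displayed in the scheduling queue.
FMT = "%d-%m-%Y %H:%M:%S"

last_check = datetime.strptime("25-09-2008 03:45:32", FMT)
next_check = datetime.strptime("25-09-2008 09:45:32", FMT)

# Gap between the two checks, in hours.
gap_hours = (next_check - last_check).total_seconds() / 3600
print(f"gap between last and next check: {gap_hours:.1f} h")
```

The gap is exactly the 6 hour check interval, so for this service the queue gives no sign of the delay reported on the performance screen.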

It seems that services are correctly scheduled despite the latency I
see in the performance screen.
/usr/local/nagios/bin/nagios -s /usr/local/nagios/etc/nagios.cfg tells
me that everything is fine and has no suggestions for me.

HOST SCHEDULING INFORMATION
---------------------------
Total hosts:                     1654
Total scheduled hosts:           0
Host inter-check delay method:   SMART
Average host check interval:     0.00 sec
Host inter-check delay:          0.00 sec
Max host check spread:           360 min
First scheduled check:           N/A
Last scheduled check:            N/A


SERVICE SCHEDULING INFORMATION
-------------------------------
Total services:                     24524
Total scheduled services:           23739
Service inter-check delay method:   SMART
Average service check interval:     17331.08 sec
Inter-check delay:                  0.61 sec
Interleave factor method:           SMART
Average services per host:          14.83
Service interleave factor:          15
Max service check spread:           240 min
First scheduled check:              Thu Sep 25 10:07:13 2008
Last scheduled check:               Thu Sep 25 14:07:15 2008


CHECK PROCESSING INFORMATION
----------------------------
Check result reaper interval:       2 sec
Max concurrent service checks:      Unlimited


PERFORMANCE SUGGESTIONS
-----------------------
I have no suggestions - things look okay.
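
The 0.61 sec inter-check delay in this output can be reproduced by hand. The sketch below assumes that the "smart" method divides the average check interval by the number of scheduled services and then caps the result so that all initial checks fit inside the max service check spread. That rule is my assumption about how the value is derived, but it is consistent with the figures printed above.

```python
# Values copied from the scheduling output above.
total_scheduled_services = 23739
avg_check_interval_sec = 17331.08
max_spread_sec = 240 * 60  # max_service_check_spread, in seconds

# Assumed rule: average interval / service count, capped by the max spread.
uncapped = avg_check_interval_sec / total_scheduled_services
capped = max_spread_sec / total_scheduled_services
inter_check_delay = min(uncapped, capped)

spread_min = inter_check_delay * total_scheduled_services / 60

print(f"uncapped delay: {uncapped:.2f} sec")
print(f"capped delay:   {capped:.2f} sec")
print(f"initial spread: {spread_min:.0f} min")
```

Under that assumption the cap applies, which also matches the four hours between the first and last scheduled checks.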


So I'm a bit lost. Which screen is right? The performance one that
reports the 4924 sec latency, or the scheduling one that tells me the
checks are made on time? What do you think? How can the latency be so
high when Nagios only needs to run 1 or 2 checks per second? Is there
anything wrong in my setup?
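
For reference, this is how I arrive at the "1 or 2 checks per second" figure, together with a rough estimate of how many checks should be running at once. It is a back-of-the-envelope sketch: it assumes checks are spread evenly over the average interval and uses the average execution time from the performance screen.

```python
# Values copied from the scheduling output and the performance screen.
total_scheduled_services = 23739
avg_check_interval_sec = 17331.08
avg_execution_time_sec = 0.888

# Required check rate if checks are spread evenly over the interval.
checks_per_sec = total_scheduled_services / avg_check_interval_sec

# Rough average number of checks in flight (rate x duration).
avg_concurrent = checks_per_sec * avg_execution_time_sec

print(f"required rate:       {checks_per_sec:.2f} checks/sec")
print(f"average concurrency: {avg_concurrent:.2f} checks")
```

With so few checks in flight on average, the load itself does not look like an obvious explanation for a latency of more than an hour.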


Thanks in advance for any advice or info you could give.


Olivier Jan






