Nagios performance and delayed checks

kyle kyle at caosdigital.com
Thu Sep 6 11:43:50 CEST 2007


Hi folks,

I'm having some performance issues with a fairly large Nagios 2.4 deployment
(430 servers, 3000 service checks).

I'm now seeing an average check latency of 600-900 seconds. The checks are
distributed more or less like this:

 450 check_icmp          (scheduled every minute)
 500 check_ifoperstatus  (scheduled every 5 minutes)
 800 check_nt            (scheduled every 5-10 minutes)
 200 check_http          (scheduled every 5 minutes)
1000 check_by_ssh        (scheduled every 5-10 minutes)
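
As a rough sanity check, the list above can be turned into the check rate the scheduler would have to sustain to stay on time. This is only a sketch: the 7.5-minute midpoint used for the "5-10 minutes" entries is an assumption, and the names CHECKS and required_rate are made up for illustration.

```python
# Nominal check rate implied by the distribution above.
# Assumption: "every 5-10 minutes" is approximated by a 7.5-minute midpoint.
CHECKS = {
    # plugin: (number of checks, interval in seconds)
    "check_icmp": (450, 60),
    "check_ifoperstatus": (500, 300),
    "check_nt": (800, 450),          # assumed midpoint of 5-10 minutes
    "check_http": (200, 300),
    "check_by_ssh": (1000, 450),     # assumed midpoint of 5-10 minutes
}


def required_rate(checks):
    """Return the checks per second needed to run every check on schedule."""
    return sum(count / interval for count, interval in checks.values())


rate = required_rate(CHECKS)
print(f"required rate: {rate:.2f} checks/sec")
```

Under these assumptions the result is close to 14 checks per second, with check_icmp alone accounting for 7.5 of them.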

In addition, PNP generates 1730 performance graphs.
 
I've already applied some of the performance tips from the Nagios FAQ
(aggregated status updates, max_concurrent_checks=0, a check of the hardware
configuration, etc.).

Since the load average on this server is always below 1, is there any way to
force more concurrent checks per second? (By the way, I replicated the same
config on a similar server running Nagios 2.9 and got similar results.)

Thanks in advance :-)


---

Output of nagiostats:

Active Service Latency:               590.886 / 671.695 / 635.054 sec
Active Service Execution Time:        0.124 / 19.130 / 0.483 sec
Active Service State Change:          0.000 / 49.930 / 0.238 %
Active Services Last 1/5/15/60 min:   61 / 673 / 2150 / 2929
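
For reference, the "Active Services Last 1/5/15/60 min" line gives a rough idea of how many checks are actually being run per second. A minimal sketch, using only the figures quoted above (the names CHECKED and observed_rate are illustrative):

```python
# Counts copied from the "Active Services Last 1/5/15/60 min" line.
# Keys are the window lengths in seconds.
CHECKED = {60: 61, 300: 673, 900: 2150, 3600: 2929}


def observed_rate(window_seconds, count):
    """Average number of services checked per second within the window."""
    return count / window_seconds


for window, count in CHECKED.items():
    rate = observed_rate(window, count)
    print(f"last {window // 60:>2} min: {rate:.2f} services/sec")
```

The 5- and 15-minute windows both work out to a little over 2 services per second. The 60-minute figure is not useful here, because it has already reached the total of 2929 services.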



Output of nagios -s nagios.cfg:

HOST SCHEDULING INFORMATION
---------------------------
Total hosts:                     424
Total scheduled hosts:           0
Host inter-check delay method:   SMART
Average host check interval:     0.00 sec
Host inter-check delay:          0.00 sec
Max host check spread:           30 min
First scheduled check:           N/A
Last scheduled check:            N/A


SERVICE SCHEDULING INFORMATION
-------------------------------
Total services:                     2929
Total scheduled services:           2929
Service inter-check delay method:   SMART
Average service check interval:     433.36 sec
Inter-check delay:                  0.15 sec
Interleave factor method:           SMART
Average services per host:          6.91
Service interleave factor:          7
Max service check spread:           30 min
First scheduled check:              Thu Sep  6 11:18:31 2007
Last scheduled check:               Thu Sep  6 11:25:45 2007
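
The service figures above appear to be consistent with each other. As far as I understand the SMART method, the inter-check delay is the average check interval divided by the number of services, and the gap between the first and last scheduled check should be roughly the number of services times that delay. A small sketch under that assumption (function and variable names are illustrative):

```python
from datetime import datetime

# Values copied from the SERVICE SCHEDULING INFORMATION output.
AVG_CHECK_INTERVAL = 433.36   # seconds
TOTAL_SERVICES = 2929


def inter_check_delay(avg_interval, total_services):
    """Assumed SMART formula: average check interval / number of services."""
    return avg_interval / total_services


delay = inter_check_delay(AVG_CHECK_INTERVAL, TOTAL_SERVICES)

# First and last scheduled check times from the same output.
first = datetime(2007, 9, 6, 11, 18, 31)
last = datetime(2007, 9, 6, 11, 25, 45)
spread = (last - first).total_seconds()

print(f"inter-check delay: {delay:.2f} sec")
print(f"initial spread:    {spread:.0f} sec")
print(f"services * delay:  {TOTAL_SERVICES * delay:.0f} sec")
```

A delay of about 0.15 sec corresponds to starting between 6 and 7 service checks per second.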






