How to reduce a very high latency number

Marc Powell marc at ena.com
Wed May 17 22:27:13 CEST 2006



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
> admin at lists.sourceforge.net] On Behalf Of Trask
> Sent: Wednesday, May 17, 2006 1:09 PM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] How to reduce a very high latency number
> 
> I am still butting up against very high latency issues with my Nagios
> setup.  I feel like I must be missing something obvious because it
> doesn't seem like I have so many services that the servers cannot keep
> up.
> 
> As can be seen from the data below, the server with the most service
> checks has the highest latency (usually in the neighborhood of 700
> seconds! -- this is pre-production).  Is my problem really this
> simple?  I have a feeling that it isn't just the number of checks, but
> I cannot figure out why my latency values are so terrible.
> 
> Overview of my setup:
> 
> There are 4 servers.  3 distributed servers (nag1, nag2, nag3) at 3
> distinct geographical locations send all their check information via
> NSCA to a 4th, central server (nag4).  The connections between all of
> these servers are very high-bandwidth and are nowhere near saturated.
>  The only unclear spot to me is the effect that our hardware
> VPN/tunnels might have, however the worst performing server (nag2) is
> on the same LAN as the central server (nag4).
> 
> Nagios v2.2, latest plugins and NRPE/NSCA as of today.  I am running
> embedded perl with perlcache enabled.
> 
> 
> Number of hosts/services:
> nag1: 43/130
> nag2: 193/1743
> nag3: 78/780
> nag4: (central server - active host checks, passive srvc checks)
> 
> Performance Info:
> 
> nag1:
> Metric                  Min        Max           Average
> Check Execution Time:   0.00 sec   20.04 sec     0.024 sec
> Check Latency:          0.00 sec   1.01 sec      0.011 sec
> Percent State Change:   0.00%      17.17%        0.01%
> 
> nag2:
> Metric                  Min        Max           Average
> Check Execution Time:   0.00 sec   929.13 sec    1.246 sec
> Check Latency:          0.00 sec   1180.67 sec   560.462 sec
> Percent State Change:   0.00%      55.59%        0.07%
> 
> nag3:
> Metric                  Min        Max           Average
> Check Execution Time:   0.00 sec   101.70 sec    0.310 sec
> Check Latency:          0.00 sec   602.57 sec    46.023 sec
> Percent State Change:   0.00%      0.00%         0.00%

My first reaction is to question why some checks are taking >15 minutes
to complete (check execution time) and why you are allowing them to go
that long. I only allow a maximum of 60 seconds for any service check to
execute --

(from nagios.cfg)
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
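
If it helps to audit what a given box is actually configured with, here is a rough Python sketch that pulls every *_timeout directive out of nagios.cfg-style text and lists the ones above a chosen limit. The function names (read_timeouts, over_limit) and the 60-second default are my own for illustration and are not part of Nagios; the sample text is just the settings quoted above.

```python
def read_timeouts(cfg_text):
    """Return {directive: seconds} for each *_timeout line in key=value text."""
    timeouts = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.endswith("_timeout"):
            timeouts[key] = int(value.strip())
    return timeouts


def over_limit(timeouts, limit=60):
    """Return the directive names whose value exceeds limit seconds (hypothetical helper)."""
    return sorted(name for name, secs in timeouts.items() if secs > limit)


# The settings quoted above, used here as sample input.
SAMPLE = """\
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
"""

if __name__ == "__main__":
    found = read_timeouts(SAMPLE)
    for name, secs in sorted(found.items()):
        print(f"{name} = {secs} sec")
    print("over limit:", over_limit(found))
```

To run it against a real file, pass the file's contents to read_timeouts() in place of SAMPLE.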

Some comparable stats from my servers --

PIII 800/512MB 828 Service Checks -

Metric                  Min        Max         Average
Check Execution Time:   0.13 sec   11.59 sec   7.984 sec
Check Latency:          0.76 sec   15.54 sec   6.583 sec
Percent State Change:   0.00%      6.25%       0.03%

All active checks, load hangs out around 2.

Another box, newer hardware, running nagios + cricket --

2x Dual Core AMD Opteron Processor 275, 2GB RAM, 1260 service checks --

Metric                  Min        Max         Average
Check Execution Time:   0.04 sec   35.02 sec   6.675 sec
Check Latency:          0.01 sec   38.16 sec   6.692 sec
Percent State Change:   0.00%      9.47%       0.04%

All active checks, load hangs out between 1 and 2.

--
Marc 

