how to fix excessive latency

wwanghongrui wwanghongrui at cebbank.com
Tue Jun 29 03:57:47 CEST 2010


Thanks for your reply. We are writing to a MySQL database via NDOUtils. We don't use NSCA. We have not set external_command_buffer_slots.
status_update_interval=15

I used vmstat to capture system performance, as shown below. The bottleneck may not be at the system level.

procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0    160 239708 289248 6031924    0    0     1    29    0    0  2  3 94  1  0
 1  0    160 242168 289248 6031924    0    0     0     0  260 1023  0  6 94  0  0
 1  0    160 246912 289248 6031924    0    0     0   392  291 1044  0  6 93  1  0
 1  0    160 246696 289248 6031924    0    0     0   100  265 1056  0  6 93  0  0
 2  0    160 243604 289248 6035008    0    0  4668     0  598 1324  1  7 91  1  0
 1  0    160 245276 289248 6035008    0    0    32     0  265 1403  0  6 93  0  0
 1  0    160 245268 289248 6035008    0    0     0     0  253 1187  0  6 94  0  0
 1  1    160 245548 289248 6035008    0    0     0  4728  887 1759  0  6 88  5  0
 1  1    160 246288 289248 6036036    0    0     0  1740 1065 1103  1  6 87  6  0
 0  1    160 247368 289248 6036036    0    0     0  1720 1086 2252  1  3 90  6  0
 0  0    160 247492 289248 6036036    0    0     0   980  984  539  4  0 90  6  0
 0  0    160 247624 289248 6036036    0    0     0     0  254  330  0  0 100  0  0
 0  0    160 247624 289248 6036036    0    0     0  5420  622  342  0  0 97  3  0
 0  0    160 247844 289248 6036036    0    0     0     0  254  312  0  0 100  0  0
 0  0    160 247844 289248 6036036    0    0     0     0  254  317  0  0 100  0  0
 0  0    160 247984 289248 6036036    0    0     0     0  254  313  0  0 100  0  0
 0  0    160 247984 289248 6036036    0    0     0     0  254  315  0  0 100  0  0
 0  0    160 248260 289248 6036036    0    0     0   352  362  317  0  0 99  1  0
 0  0    160 248260 289248 6036036    0    0     0     0  306  303  0  0 100  0  0
 1  0    160 248876 289248 6036036    0    0     0   100  270  367  0  0 99  0  0
 5  0    160 233840 289248 6036036    0    0     0     0  341 1490  6  8 86  0  0
 5  0    160 187468 289248 6036036    0    0     0     4  866 2736  9 22 69  0  0
 4  1    160 171508 289248 6036036    0    0     0  5352  837 2205  3 20 76  1  0
procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 4  0    160 175172 289248 6036036    0    0     0   568  453 2091  1 15 83  0  0
 3  0    160 154108 289248 6036036    0    0     0     0  427 3456  1 20 79  0  0
 5  0    160 125684 289248 6036036    0    0     0     4  469 2620  1 19 80  0  0
 9  0    160 146712 289248 6036036    0    0     0     0  603 2272  4 26 70  0  0
 6  0    160 168804 289248 6036036    0    0     0     0  668 2784  9 27 64  0  0
 4  0    160 181032 289248 6036036    0    0     0  1164  736 2654  4 25 70  1  0
 1  0    160 210728 289248 6036036    0    0     0     0  465 2152  5 19 76  0  0
 1  0    160 211216 289248 6036036    0    0     0     0  294  837  0  6 94  0  0
 1  0    160 216644 289248 6036036    0    0     0     0  293  954  0  7 93  0  0
 1  0    160 227320 289248 6036036    0    0     0     0  285  943  0  8 92  0  0
 1  0    160 238864 289248 6036036    0    0     0   576  343 2308  1  8 91  1  0
 1  2    160 233660 289248 6039120    0    0  2252   100  393 1046  1  6 92  1  0
 1  0    160 239548 289248 6039120    0    0   984  3316  571 1055  1  6 92  1  0
 1  0    160 240084 289248 6039120    0    0     0     0  253  998  0  6 94  0  0
 1  0    160 239968 289248 6039120    0    0     0     0  253  990  0  6 93  0  0
 1  1    160 240388 289248 6039120    0    0     0  1956  781 1111  0  6 89  4  0
 1  1    160 240256 289248 6039120    0    0     0  1828 1088 1452  1  6 87  6  0
 1  2    160 239648 289248 6039120    0    0     0  1620 1038 1614  1  6 87  6  0
 1  1    160 240028 289248 6039120    0    0     0  1700 1065 1459  0  6 85  9  0
 1  1    160 239912 289248 6039120    0    0     0  2512 1211 1623  0  6 87  6  0
 1  1    160 240648 289248 6039120    0    0     4  2880 1380 1128  0  5 87  7  0
 1  0    160 241124 289248 6039120    0    0     0    84  499 1024  0  6 93  0  0
 1  0    160 241000 289248 6039120    0    0     0   296  287 1757  1  6 93  1  0
procs -----------memory---------- ---swap-- -----io---- -system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 3  0    160 241808 289248 6039120    0    0     0     0  253 1630  1  6 93  0  0
 1  0    160 241800 289248 6039120    0    0     0     0  253  977  0  6 94  0  0
 1  0    160 241880 289248 6039120    0    0     0     0  253  989  0  6 94  0  0
 3  0    160 218192 289248 6039120    0    0     0   100  350 1810  3 14 83  0  0
 4  0    160 181560 289248 6039120    0    0     0  5792  957 2948  6 21 72  1  0
 6  0    160 182036 289248 6040148    0    0     0     0  853 2947  7 22 70  0  0
 4  0    160 187860 289248 6040148    0    0     0     0  564 2748 12 25 64  0  0
 4  0    160 202880 289248 6040148    0    0     0     0  432 2336  5 22 73  0  0
 5  0    160 189956 289248 6040148    0    0     0   416  824 2762  7 24 69  1  0
 2  0    160 195912 289248 6041176    0    0    52  1224  789 2332  5 15 78  2  0
 1  0    160 205060 289248 6041176    0    0     0     8  343 1718  2  8 90  0  0
 1  0    160 205076 289248 6041176    0    0     0     0  320 1177  0  6 93  0  0
 1  0    160 213844 289248 6041176    0    0     0     0  315 1100  0  7 92  0  0
 1  0    160 226900 289248 6041176    0    0     0     0  305 1210  0  8 92  0  0
 2  0    160 227188 289248 6041176    0    0     0   956  556  901  0  4 92  3  0
 1  0    160 228924 289248 6041176    0    0     0     0  294 1034  1  6 93  0  0
 1  0    160 229740 289248 6041176    0    0     0     0  292 1235  1  6 93  0  0
 1  0    160 230228 289248 6041176    0    0     0     0  287 1696  1  6 93  0  0
 3  1    160 230456 289248 6041176    0    0     0   128  288 1307  1  6 93  0  0
 1  1    160 228756 289248 6042204    0    0  3052  4944  921 1673  5  7 84  4  0
 1  1    160 229004 289248 6042204    0    0     0  1676 1061 1122  1  6 87  6  0
 1  1    160 229004 289248 6042204    0    0     0  1672 1081 1093  0  6 87  6  0
 1  1    160 230788 289248 6042204    0    0     0  1856 1171 1198  1  6 87  6  0
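
To make a capture like the one above easier to judge, the CPU columns can be averaged with a small script. This is only a sketch: the function name summarize_vmstat and its skip_first parameter are made up for illustration, and it assumes the 17-column vmstat layout shown above. The first data row is dropped by default because vmstat reports averages since boot on its first line.

```python
def summarize_vmstat(text, skip_first=True):
    """Summarize run queue and CPU columns from captured vmstat output."""
    columns = "r b swpd free buff cache si so bi bo in cs us sy id wa st".split()
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows are all-numeric with one field per column; the
        # repeated header lines fail this test and are skipped.
        if len(fields) == len(columns) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(columns, map(int, fields))))
    if skip_first:
        # The first vmstat report is an average since boot, not a sample.
        rows = rows[1:]
    if not rows:
        raise ValueError("no vmstat data rows found")
    n = len(rows)
    return {
        "samples": n,
        "avg_us": sum(r["us"] for r in rows) / n,
        "avg_sy": sum(r["sy"] for r in rows) / n,
        "avg_id": sum(r["id"] for r in rows) / n,
        "avg_wa": sum(r["wa"] for r in rows) / n,
        "max_r": max(r["r"] for r in rows),    # longest run queue seen
        "min_id": min(r["id"] for r in rows),  # busiest single sample
    }
```

Saving the capture to a file and passing its contents to summarize_vmstat gives one line of numbers to compare between runs. A high average idle with low wa would support the guess that the machine itself is not the limit.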

Regards

HongRui Wang
Mail:wwanghongrui at cebbank.com
2010-06-29



From: shadih rahman
Sent: 2010-06-29 00:57:24
To: wwanghongrui; Nagios Users List
Cc: 
Subject: Re: [Nagios-users] how to fix excessive latency

Something is definitely not right here.  We have about 10000 checks and the performance is a lot better.  Anyhow, we are using the following values:

check_result_reaper_frequency=10
max_check_result_reaper_time=20


You should enable debug mode and check the debug logs.  Are you writing to any backend database?  Are you using NSCA to transfer service information to a remote location?  What is the value of your status_update_interval?  What is your external_command_buffer_slots?





2010/6/28 wwanghongrui <wwanghongrui at cebbank.com>

Hi,guys~

Our Nagios server environment: Nagios 3.2.0 + SUSE 10 SP2 x86_64 + 8 GB mem + 4 x ( Xeon(R) CPU  E7420  @ 2.13GHz )
We have 500+ actively checked hosts and 3k+ actively checked services.  I have adjusted some performance parameters in nagios.cfg, as below:
use_large_installation_tweaks=1
child_processes_fork_twice=0
enable_environment_macros=0
check_result_reaper_frequency=5
max_check_result_reaper_time=30

But the Nagios performance is still bad, as shown below:

Services Actively Checked:

Time Frame             Services Checked
<= 1 minute:           271 (9.4%)
<= 5 minutes:          1749 (60.4%)
<= 15 minutes:         2824 (97.4%)
<= 1 hour:             2898 (100.0%)
Since program start:   2869 (99.0%)

Metric                 Min.       Max.         Average
Check Execution Time:  0.09 sec   32.23 sec    1.113 sec
Check Latency:         1.12 sec   212.59 sec   116.329 sec
Percent State Change:  0.00%      23.88%       0.05%


Hosts Actively Checked:

Time Frame             Hosts Checked
<= 1 minute:           32 (5.5%)
<= 5 minutes:          419 (71.5%)
<= 15 minutes:         586 (100.0%)
<= 1 hour:             586 (100.0%)
Since program start:   586 (100.0%)

Metric                 Min.       Max.         Average
Check Execution Time:  0.08 sec   4.29 sec     3.035 sec
Check Latency:         0.00 sec   135.25 sec   116.420 sec
Percent State Change:  0.00%      11.32%       0.09%




How can I find which service checks or host checks are causing this serious check latency?
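
One possible way to approach this question is to rank the entries in the Nagios status file by check_latency or check_execution_time. The sketch below is an illustration, not an official tool: the function names parse_status_blocks and rank_checks are hypothetical, and it assumes the Nagios 3 status.dat layout of "servicestatus {" and "hoststatus {" blocks containing key=value lines.

```python
def parse_status_blocks(text, block_type):
    """Return one dict of key=value pairs per `block_type { ... }` block."""
    blocks, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if current is None:
            # Only start collecting when the wanted block type opens.
            if line == block_type + " {":
                current = {}
        elif line == "}":
            blocks.append(current)
            current = None
        elif "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return blocks


def rank_checks(text, block_type="servicestatus",
                field="check_latency", top=10):
    """Return the `top` checks with the largest value of `field`.

    Each result is (host_name, service_description, value); the
    description is empty for hoststatus blocks.
    """
    blocks = parse_status_blocks(text, block_type)

    def value(block):
        # Blocks without the field sort last.
        return float(block.get(field, 0.0))

    ranked = sorted(blocks, key=value, reverse=True)[:top]
    return [(b.get("host_name", ""),
             b.get("service_description", ""),
             value(b)) for b in ranked]
```

The text to pass in is the content of the file named by status_file in nagios.cfg. Ranking by check_execution_time shows which plugins run slowly. When latency is uniformly high across all checks, as the averages here suggest, the cause is more likely the scheduler falling behind as a whole than any single check.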


Regards 

HongRui Wang
mail: wwanghongrui at cebbank.com
2010-06-28



------------------------------------------------------------------------------
This SF.net email is sponsored by Sprint
What will you do first with EVO, the first 4G phone?
Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null




-- 
Cordially,
Shadhin Rahman