High latencies problem.

Alessandro Ren alessandro.ren at opservices.com.br
Tue Feb 17 17:41:45 CET 2009


On 2/17/2009 1:32 PM, D. Emmanuel Feinsmith wrote:

     Answers below,
> Alessandro,
>
> 1.  what is the breakdown between passive and active checks? For
> passive checks, there are many ways to increase the # of services
> through bypassing the command pipe (which nsca writes to). With
> passive checks done in this way I've gone to 50,000 services with
> under 10 second latency.
>    
     All active checks, no passive.

> 2.  how many of those services are check_icmp or check_ping? If there
> is a good number of those, you can use fping to reduce the # of fork/
> exec's that nagios has to perform, which is a major area of resource
> utilization within the nagios server.
>    
     Less than 5% are ping checks, and we use check_icmp for all of those. 
Most checks are check_nrpe.
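
     For reference, the fping idea suggested above could be sketched roughly as follows: a single fping process probes many hosts, instead of one fork/exec per host. This is only a sketch under assumptions: the names ping_batch and parse_alive are hypothetical, it relies on `fping -a` printing reachable targets one per line on stdout, and feeding the results back into nagios is left out entirely.

```python
import subprocess
from typing import Dict, List


def parse_alive(stdout: str, hosts: List[str]) -> Dict[str, bool]:
    """Map each requested host to True if it appears in fping's alive list."""
    alive = {line.strip() for line in stdout.splitlines() if line.strip()}
    return {host: host in alive for host in hosts}


def ping_batch(hosts: List[str], fping: str = "fping") -> Dict[str, bool]:
    """Probe all hosts with ONE fping process (hypothetical helper).

    fping exits non-zero when some targets are unreachable, so the exit
    status is deliberately not treated as an error here.
    """
    proc = subprocess.run(
        [fping, "-a", *hosts],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
        check=False,
    )
    return parse_alive(proc.stdout, hosts)
```

     With fewer than 5% ping checks, as in this setup, the saving from such batching would likely be small.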

> 3. Are you using a performance data handler or OCSP? If so, you might
> either find a way to get rid of these entirely, or be sure you are
> using file based performance handling at the very minimum.
>    
     I am using perfparse to write to MySQL. Disabling it has no effect 
on the latency.

> The key to nagios scalability and latency reduction is to reduce the #
> of fork/exec's to the smallest amount possible and keep away from the
> command pipe as much as you can if you are passive-check heavy. If you
> are using all active checks, then you can balance the load between
> active and passive checks and thereby gain some speed.
>    

     In my other nagios with just 2600 services, we see around 200 
nagios processes running on average; in the 11600-service system, the 
average is 30 processes. It seems that the event loop is lagging and is 
not starting enough processes, thus raising the latency.
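
     As a rough back-of-the-envelope check, the rate the scheduler has to sustain can be worked out from the figures in the original post (11160 services, each checked every 5 minutes), which comes to 37.2 checks per second. The sketch below also estimates how many checks would be in flight at once for a given average execution time, using Little's law (concurrency = rate x duration). The execution time is an assumed input, not a measurement from this system, and the function names are made up for illustration. The nagios process count is only a loose proxy for checks in flight, since each check may involve more than one process.

```python
def required_check_rate(services: int, interval_s: float) -> float:
    """Checks per second needed so every service runs once per interval."""
    return services / interval_s


def expected_concurrency(rate_per_s: float, avg_exec_s: float) -> float:
    """Little's law: average number of checks in flight at any moment."""
    return rate_per_s * avg_exec_s


def sustainable_rate(in_flight: float, avg_exec_s: float) -> float:
    """Checks per second that a given number of in-flight checks can deliver."""
    return in_flight / avg_exec_s


if __name__ == "__main__":
    rate = required_check_rate(11160, 300)  # figures from the original post
    assumed_exec_s = 2.0                    # ASSUMPTION: average plugin run time
    print(f"required rate: {rate:.1f} checks/s")
    print(f"in flight at {assumed_exec_s}s each: "
          f"{expected_concurrency(rate, assumed_exec_s):.1f}")
    print(f"rate sustained by 30 in flight: "
          f"{sustainable_rate(30, assumed_exec_s):.1f} checks/s")
```

     If the rate sustained by the checks actually in flight stays below the required rate, the backlog grows and shows up as latency, which would be consistent with what I am seeing.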

     Thank you Daniel.
> Daniel.
>
> On Feb 17, 2009, at 8:17 AM, Alessandro Ren wrote:
>
>    
>>     Hello,
>>
>> I have a nagios system running with 427 hosts and 11160 services and
>> since I reached 8000 services, I am having problems with the latency
>> being between 100s and 200s.
>>     use_large_installation_tweaks is enabled, max_concurrent_checks
>> has been tested with 0 and higher values, and I have tested this setup
>> on two different HWs: a 32-bit dual core with 4GB of RAM and a 64-bit
>> Dual Xeon dual core with 8GB of RAM. We are using Red Hat Enterprise 5.
>>     Also, the reaper is already at 2s, host checks with cache horizon are
>> enabled with a max retry of 3, and all services are checked every 5min.
>>     I have no service dependency set up.
>>     I've noticed that nagios is not spawning as many processes as
>> another nagios I have running, which has far fewer services, and it
>> seems from my debugs that the event loop is lagging behind.
>>     Any ideas what I could do to fix that? Have I reached a limit in
>> the nagios poller code?
>>
>>     Tks.
>>
>> -- 
>> Alessandro Ren
>> http://www.opservices.com.br
>> alessandro.ren at opservices.com.br
>>
>> ------------------------------------------------------------------------------
>> Open Source Business Conference (OSBC), March 24-25, 2009, San
>> Francisco, CA
>> -OSBC tackles the biggest issue in open source: Open Sourcing the
>> Enterprise
>> -Strategies to boost innovation and cut costs with open source
>> participation
>> -Receive a $600 discount off the registration fee with the source
>> code: SFAD
>> http://p.sf.net/sfu/XcvMzF8H
>> _______________________________________________
>> Nagios-devel mailing list
>> Nagios-devel at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nagios-devel
>>      
>
>
>    
