CPU leak on nagios-2.0b2

Andreas Ericsson ae at op5.se
Thu Feb 24 14:14:47 CET 2005


Dmitriy Kirhlarov wrote:
> Hi!
> 
> On Thu, Feb 24, 2005 at 12:29:17PM +0100, Andreas Ericsson wrote:
> 
>>>We are trying to use nagios-2.0b2 for active monitoring of ~500 services 
>>>every 2 min.
>>
>>Usually it isn't necessary to run checks every 2 mins, since this puts a 
>>considerable load on the system (500/120 = roughly 4.2 checks started 
>>every second). See if you can put disk checks and other stuff that 
>>doesn't usually change very rapidly on a less frequent schedule. 15-30 
>>minutes should be enough. Load and such (which uses average values 
>>anyway) can also be checked less frequently.
> 
> 
> Yes, I know. But that is specific to our setup.
> This is only part of a distributed scheme.
> And we need minimal latency in the monitoring system (now we have 5-min latency).
> 
> 
>>>It's PIII-730MHz with 512 Mb of memory.
>>>
>>>Total CPU load on this machine is ~30% (we are looking at the OIDs:
>>>enterprises.ucdavis.systemStats.ssCpuRawSystem.0
>>>enterprises.ucdavis.systemStats.ssCpuRawUser.0
>>>) with system CPU load prevailing.
>>>
>>>After 24 hours the total CPU load on this machine is ~85%.
>>>
>>
>>85% CPU load is equivalent (roughly) to 0.85 as reported by uptime. 
> 
> 
> No.
> load average: 2.70, 2.80, 2.80
> 

See man uptime. I think your snmp implementation is somewhat broken or 
doesn't report what you think it does. uptime shows the average number of 
processes waiting in the run queue over the last 1, 5 and 15 minutes, 
meaning 1.0 roughly equals 100% load on a single-CPU system.

Even so, 2.70 is not so bad.
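
To make the comparison between load average and CPU percentage a bit more concrete, here is a minimal Python sketch that divides the load average by the number of CPUs. The helper name load_per_cpu is made up for illustration, and the "1.0 per CPU is roughly fully busy" reading is only a rule of thumb, not an exact conversion to the percentages reported over snmp.

```python
import os


def load_per_cpu(load_average, cpu_count):
    """Return the load average divided by the number of CPUs.

    A result near 1.0 suggests the CPUs are roughly fully busy;
    this is a rule of thumb, not an exact utilisation figure.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


if __name__ == "__main__":
    # os.getloadavg() returns the 1, 5 and 15 minute load averages (Unix only).
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1  # fall back to 1 if the count is unknown
    for label, value in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
        print(f"{label}: load {value:.2f}, per CPU {load_per_cpu(value, cpus):.2f}")
```

With the figures quoted above, load_per_cpu(2.70, 1) is simply 2.70, i.e. on average there are more runnable processes than a single CPU can serve at once.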

> 
>>The simplest and cheapest way is to raise the default check interval and 
>>require fewer checks before a service enters hard state.
>>You can also make sure no unnecessary programs are running on the 
>>server (like X-windows or a heavily loaded web/database server).
>>
>>The most expensive and cumbersome way is to buy new hardware. Avoid this 
>>if it can be helped at all.
> 
> 
> If we had a stable load of ~80% -- no problem.
> But CPU load _increased_ from 30% when we started nagios to 80% over approx. 24 hours.
> 

The fact that the load increases when you run applications is pretty 
obvious after all. Nagios is a bit of a resource hog since it has to run 
a lot of programs to do the actual checks. You really should try to run 
nagios on a dedicated server that doesn't really have a load until you 
start nagios on it.
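
To put rough numbers on the check load from the figures in this thread, the sketch below works out how many checks have to be started per second and, given an average check duration, how many run at the same time on average (rate times duration). The function names and the 1-second average duration are assumptions for illustration only; real check durations have to be measured on the system in question.

```python
def check_rate(services, interval_seconds):
    """Checks that must be started per second to cover all services."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return services / interval_seconds


def average_concurrency(services, interval_seconds, avg_check_seconds):
    """Average number of checks running at once (start rate * duration)."""
    return check_rate(services, interval_seconds) * avg_check_seconds


if __name__ == "__main__":
    # 500 services every 2 minutes, as described in the thread.
    # The 1.0 second average check duration is an assumed value.
    print(f"rate at 2 min:  {check_rate(500, 120):.2f} checks/s")
    print(f"concurrent:     {average_concurrency(500, 120, 1.0):.2f}")
    # The same 500 services on a 15 minute schedule.
    print(f"rate at 15 min: {check_rate(500, 900):.2f} checks/s")
```

Moving slowly changing checks to a 15-30 minute schedule cuts the start rate accordingly, which is the point of raising the check interval.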

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer





