Check_load

Sandeep Narasimha Murthy sandeep-n-murthy at telecom.pt
Wed May 10 16:07:40 CEST 2006


 

Thanks Derek for your explanation.

 

I still have some doubts and would appreciate some clarification:

 

1. The server in question has 2 CPUs. When Nagios invokes check_load,
does it monitor both CPUs and report the combined load average?

 

2. What are the CRITICAL, WARNING and OK thresholds for load average? I
mean, at what point does it raise a CRITICAL state?

 

Thanks again,

 

Sg

 

________________________________

From: nagios-users-admin at lists.sourceforge.net
[mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Derek J.
Balling
Sent: Wednesday, 10 May 2006 11:52
To: nagios-users at lists.sourceforge.net
Subject: Re: [Nagios-users] Check_load

On May 10, 2006, at 6:42 AM, Sandeep Narasimha Murthy wrote:

Can anyone please provide a brief idea on how check_load works and on
the data it provides. The data provided is in the format load average:
1.62, 1.67, 1.74, which doesn't make a lot of sense. I googled on this
and ended up getting even more confused.

 

"load average" is the average number of processes in the wait state
over a given period of time. For most UNIX machines, the periods are
usually 1, 5 and 15 minutes (e.g., in your example, in the past minute
an average of 1.62 processes were waiting for a CPU, in the past five
minutes, 1.67, and in the past 15 minutes, 1.74).
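
To make the three figures concrete, here is a minimal Python sketch that
pulls the 1-, 5- and 15-minute averages out of a status line shaped like
the one quoted above. The function name parse_load_average and the
regular expression are illustrative assumptions, not part of check_load
itself.

```python
import re


def parse_load_average(text):
    """Extract the 1-, 5- and 15-minute averages from a status line.

    Hypothetical helper: it only assumes the text contains
    'load average: a, b, c' with three decimal numbers.
    """
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", text
    )
    if match is None:
        raise ValueError(f"no load average found in {text!r}")
    one, five, fifteen = (float(group) for group in match.groups())
    return {"1min": one, "5min": five, "15min": fifteen}


sample = "load average: 1.62, 1.67, 1.74"
averages = parse_load_average(sample)
```

The keys "1min", "5min" and "15min" are just labels chosen for this
sketch; the numbers themselves are the ones from the example.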

 

In the example you give, it would appear your machine is catching up, as
the short-term averages are lower than the long-term averages, but it
all depends on your applications that are running, what cycles they
demand CPU on, etc., etc.
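
The "catching up" reading above can be sketched as a simple comparison
of the three averages. This is only a rough heuristic under the
assumption that a strictly ordered triple indicates a trend; the
function name load_trend and its return labels are made up for
illustration.

```python
def load_trend(one, five, fifteen):
    """Describe the direction of the load from the three averages.

    Hypothetical heuristic: if the short-term average is below the
    longer-term ones, the load has been falling, and vice versa.
    """
    if one < five < fifteen:
        return "falling"
    if one > five > fifteen:
        return "rising"
    return "mixed"


# Values from the example in this thread.
trend = load_trend(1.62, 1.67, 1.74)
```

With the figures from the example, the short-term average is the lowest
of the three, so the sketch reports a falling load.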


The reason behind my question is that we are also using CACTI for
monitoring the systems and it shows a 100% CPU load since last night
while Nagios didn't raise any alarm at all. I have to find out whether
it was a false alarm or there is a problem with my Nagios setup.

It's confusing to refer to "100% CPU load" since load isn't measured in
percentage (at least not in this case, not usually). Referring to it as
100% CPU usage, or utilization, might be better.

 

For example, it's possible to have a low CPU load and usage of 100%.
How? If I've only got one process making use of CPU cycles, and it's
getting 100% of the CPU. If there's nothing else waiting for CPU, the
"load" will not be high, even though the usage (from the single running
process) will be.

 

In other words, CPU usage isn't necessarily as important as load is,
although it certainly can be a factor in determining the cause of load
(because processes which are waiting on disk I/O, or swap, also
contribute to load).
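
Since load is what gets alerted on, it may help to see how a set of
three averages could be compared against thresholds. check_load takes
its warning and critical levels as comma-separated triples (the -w and
-c options), one value per averaging period, so the levels are whatever
the administrator configures. The load average is a single system-wide
figure, not one per CPU, so on a multi-CPU box it can be useful to
divide by the CPU count before comparing. The sketch below is an
assumption-laden illustration of that idea, not the plugin's actual
source: the function name classify_load, the cpus parameter and the
threshold values are all hypothetical.

```python
def classify_load(averages, warn, crit, cpus=1):
    """Return 'OK', 'WARNING' or 'CRITICAL' for a (1min, 5min, 15min) triple.

    Hypothetical helper: each average is divided by `cpus` and then
    compared with the matching warning and critical threshold.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    per_cpu = [value / cpus for value in averages]
    # Any single period over its critical level is enough for CRITICAL.
    if any(v > c for v, c in zip(per_cpu, crit)):
        return "CRITICAL"
    if any(v > w for v, w in zip(per_cpu, warn)):
        return "WARNING"
    return "OK"


# Illustrative thresholds only; pick values that suit the host.
WARN = (5.0, 4.0, 3.0)
CRIT = (10.0, 6.0, 4.0)

# The averages from this thread, on a 2-CPU server.
state = classify_load((1.62, 1.67, 1.74), WARN, CRIT, cpus=2)
```

With thresholds like these, the averages from the example stay well
inside the OK range, which would be consistent with Nagios not raising
an alarm even while CPU usage was high. On a Unix host, Python's
os.getloadavg() returns the same three numbers and could feed this
function directly.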

 

Hope this helps.

 

Cheers,

D

 

 

--

 

Derek J. Balling

Systems Administrator

Vassar College

124 Raymond Ave

Box 13 - Computer Center 217

Poughkeepsie, NY 12604

(845) 437-7231





 
