Nagios heavy load check

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Wed Apr 30 07:22:11 CEST 2003


Dear Sir,

I am writing to thank you for your letter and say,

On Tue, Apr 29, 2003 at 12:56:41PM -0400, Patrick LeBoutillier wrote:
> Hi all,
> 
> Is it possible for Nagios to send notifications when it starts getting too
> far behind in its checks? We had a previous monitoring tool that had that
> problem. It would get overwhelmed with tests (especially when many timeouts
> occurred).
> 
> I guess I could use a cron job to check the scheduling queue CGI, but maybe
> there is a better way (maybe when max_concurrent_checks=20 is reached too
> often or something like that).

If someone hasn't suggested this already, perhaps use a cron-scheduled
check of the Nagios performance page, looking at the average (and perhaps
the maximum) check latency.

http://<Your_Nag>/nagios/cgi-bin/extinfo.cgi?&type=4

You will have to deal with the HTML tables (a Perl check could use
HTML::TableExtract or roll its own with HTML::Parser), but this could be
done simple-mindedly by searching the HTML for 'Check Latency:'.
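
As a rough illustration of the simple-minded approach, here is a minimal
Python sketch. It assumes the latency figures appear in the same table row
as the 'Check Latency:' label and are written as numbers followed by 'sec';
that layout, the function names and the thresholds are assumptions for
illustration, not something taken from the Nagios source, so check them
against the page your own installation produces.

```python
import re
import urllib.request

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def fetch(url):
    """Download the performance page (assumes no authentication is needed)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("latin-1")


def latency_values(html, label="Check Latency:"):
    """Return the numbers (seconds) found in the table row holding the label."""
    pos = html.find(label)
    if pos < 0:
        return []
    rest = html[pos + len(label):]
    # Stop at the end of the row so figures from later rows are not picked up.
    end = rest.lower().find("</tr>")
    row = rest[:end] if end >= 0 else rest[:300]
    text = re.sub(r"<[^>]+>", " ", row)  # crude tag stripping
    return [float(n) for n in re.findall(r"(\d+(?:\.\d+)?)\s*sec", text)]


def plugin_status(values, warn, crit):
    """Map the worst latency found onto a plugin exit code."""
    if not values:
        return UNKNOWN  # label missing or page layout not as assumed
    worst = max(values)
    if worst >= crit:
        return CRITICAL
    if worst >= warn:
        return WARNING
    return OK
```

A cron job (or a Nagios check) could call fetch() on the URL above, pass the
result through latency_values() and exit with plugin_status(values, 10, 60),
where 10 and 60 seconds are placeholder thresholds.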

> 
> Thanks,
>

However, when Nag is tuned and run on appropriately sized hardware, I have
found it performs well and reliably.

The only cases of check latency I have seen (or heard about) involve

 . lots of hosts down (e.g. power, LAN or shared storage failure)

 . many many services (>= 1000)

 . high check frequencies

 . lossy network connections or flapping services necessitating check
   retries.

   For my employer's site, Nag is checking 350 services at 5-minute
   intervals, a target rate of 70 checks/minute. If this were higher
   (say hundreds per minute, because the check interval is small, the
   checks take more than 5 minutes to complete, or there are more
   services to be checked), then this may be a cause of latency.
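
   The arithmetic behind that target rate can be written down as a small
   Python sketch. The second function is an added assumption rather than
   anything Nagios reports: it estimates how many checks are in flight on
   average by multiplying the rate by the average check duration, which
   only holds if checks are spread evenly over the interval. The 10-second
   duration in the usage note is a made-up figure.

```python
def target_rate(services, interval_minutes):
    """Checks per minute needed to visit every service once per interval."""
    return services / interval_minutes


def average_in_flight(rate_per_minute, avg_check_seconds):
    """Rough number of checks running at once, assuming an even spread."""
    return rate_per_minute * avg_check_seconds / 60.0
```

   For the figures above, target_rate(350, 5) gives 70 checks/minute, and
   with a hypothetical average check time of 10 seconds
   average_in_flight(70, 10) comes to a little under 12, which could be
   compared with a setting such as max_concurrent_checks.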


   You should be able to simulate at least part of the load of checking
   with a fake check that waits as long as your longest check and a
   simple driver that forks as many processes as your target number of
   checks and execs the fake check in each. (Nag does more than this of
   course, but this will give you a lower bound on your resources budget.)
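
   A minimal sketch of such a driver in Python follows. The fake check is
   just a child Python process that sleeps; the names run_batch and
   FAKE_CHECK are made up for this example, and a real test would use your
   own check count and longest check duration.

```python
import subprocess
import sys
import time

# Fake check: a separate process that sleeps for the given number of seconds.
FAKE_CHECK = "import sys, time; time.sleep(float(sys.argv[1]))"


def run_batch(count, seconds):
    """Start `count` fake checks at once and wait for all of them.

    Returns the elapsed wall-clock time and the list of exit codes.
    """
    start = time.monotonic()
    procs = [
        subprocess.Popen([sys.executable, "-c", FAKE_CHECK, str(seconds)])
        for _ in range(count)
    ]
    codes = [p.wait() for p in procs]
    return time.monotonic() - start, codes
```

   Calling run_batch(70, 30), for instance, would start 70 fake checks of
   30 seconds each; watching memory, process counts and the elapsed time
   while it runs gives a first idea of whether the box copes.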

Yours sincerely.
-- 
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------

'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'

from Meditation 17, J Donne.


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users