How to reduce Host Check rate?

Greg Vickers g.vickers at qut.edu.au
Fri Aug 5 01:57:16 CEST 2005


Martin,

The key here is how you have set up your checks. It looks like your
service and host checks are set up to alert you too quickly for a host
that has brief outages.

The behavior you observe is a central tenet of Nagios: as soon as a
service enters a non-OK state, Nagios checks the host to see whether it
is alive. (If the host is not alive, there's no point in alerting
for the service, is there?)

If you have frequent outages on your hosts then I suggest several methods:

1) Since you are using a ping service check, suppress host
notifications for those hosts and tune your service check so that no
service alerts are sent out for x minutes (e.g. three minutes if your
outages are two minutes or less).

2) Maintain your alerting regime and enable flap detection (RTFM on how
to set this up - trickiest of these methods IMHO).

3) Maintain your current setup and use escalations to suppress 
notifications.
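
As a rough sketch of methods 1 and 2 (an illustration only, not tested
against 2.0b3): the host name, address, template names and check_ping
thresholds below are made up, and the templates are assumed to supply
the other required directives such as contact groups and notification
periods. Adapt everything to your own object files.

```
# Method 1: switch off host notifications and let the PING service alert.
define host{
        use                     generic-host      ; hypothetical template
        host_name               router1           ; hypothetical name
        alias                   Router 1
        address                 192.0.2.1         ; example address
        check_command           check-host-alive  ; assumed to be defined already
        max_check_attempts      3
        notifications_enabled   0                 ; no host notifications
        }

define service{
        use                     generic-service   ; hypothetical template
        host_name               router1
        service_description     PING
        check_command           check_ping!100.0,20%!500.0,60%  ; assumed command and thresholds
        normal_check_interval   5                 ; in minutes if interval_length=60
        retry_check_interval    1                 ; re-check every minute while in a soft state
        max_check_attempts      4                 ; up to 4 tries before a hard state
        flap_detection_enabled  1                 ; method 2, see note below
        }
```

For method 2 you also need enable_flap_detection=1 in nagios.cfg, and
the flap thresholds will want tuning. Whether the soft-state retries
really buy you the full three minutes depends on how the host check
result feeds back into the service state, so try it against a host you
can take down on purpose before relying on it.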

HTH,
Greg

Martin Haller wrote:
> Hi,
> 
> Nagios produces some noisy host alerts we want to reduce. The
> situation in detail:
> 
> If a host/router goes down for 2 minutes, for example, the PING
> service check will notice this and the PING state will go to soft
> CRITICAL.
> 
> The host check is fired n times (max_check_attempts in the host
> profile) - but - with no delay such as the service checks have one
> (retry_check_interval).
> 
> The problem is that the host state changes to hard critical in a few
> seconds and email/sms is sent out. The result is a huge crowd of
> services appearing for just one or two minutes in state hard
> CRITICAL.
> 
> The nagios version we use is 2.0b3.
> 
> My question is: Did I just not find the delay parameter for tuning the
> host check retry rate? How did you tune your setup to avoid this?
> 

-- 
Greg Vickers
Project Manager, IT Security
Information Technology Services
Queensland University of Technology
L12, 126 Margaret St, Brisbane

Phone: (07) 3864 9536
Email: g.vickers at qut.edu.au
IT Security web site: http://www.its.qut.edu.au/itsecurity/

CRICOS No. 00213J

