Is it possible to confirm an alarm condition prior to generating an alert?

don mccallum dmccallum at nemsys.com
Mon Jan 10 19:41:35 CET 2005


Ah, I read those, but I completely misunderstood. I thought
max_check_attempts was how many times it would try once a service was
already down, not how many checks it makes before sending an alert. That
helps tremendously. Thanks, Eric.

 

-Don

 

________________________________

From: Eric Loyd [mailto:loyd at cyber.kodak.com] 
Sent: Monday, January 10, 2005 12:53 PM
To: don mccallum; Nagios Users List
Subject: Re: [Nagios-users] Is it possible to confirm an alarm condition
prior to generating an alert?
Importance: Low

 

You need to change max_check_attempts to something other than 1 for
these services. This lets Nagios check again (at the interval defined by
retry_check_interval) a few times before notifying. You may also want to
set notification_interval to something higher than 10 minutes; that
setting means "only notify once every ten minutes (or whatever you set it
to), regardless of how many times the check has failed in that time."

In other words, the answers to your questions on this topic are in the
nagios documentation on notifications and configuration.
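
As a rough sketch of the settings mentioned above, a service definition
might look something like the following. The host name, service
description, check command, contact group, and all numeric values are
placeholders chosen for illustration, not values from this thread.
Intervals are counted in units of interval_length, which defaults to 60
seconds, so a value of 1 means one minute unless that default has been
changed.

```
define service{
        host_name               host1           ; placeholder host
        service_description     HTTP            ; placeholder service
        check_command           check_http      ; placeholder command
        max_check_attempts      3               ; notify only after the 3rd failed check
        normal_check_interval   5               ; while OK, check every 5 intervals
        retry_check_interval    1               ; after a failure, recheck every 1 interval
        check_period            24x7
        notification_interval   30              ; re-notify at most once per 30 intervals
        notification_period     24x7
        notification_options    w,u,c,r
        contact_groups          admins          ; placeholder contact group
        }
```

With values like these, a failing check is retried one minute apart and
the first notification goes out only once the third attempt has failed,
so an outage that clears before then produces no alert. Rechecking after
5 or 10 seconds would presumably need a smaller interval_length, which
changes the meaning of every interval in the configuration.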

At 12:48 PM 1/10/2005, don mccallum wrote:



 
Hello everybody.
I am running Nagios 1.2b and 2.01 at two separate locations, both on
Red Hat 7.3.
The issue I am running into is a large number of false alerts, which are
due to the nature of what we are monitoring. (In reality they are real
alerts, but they immediately recover. Nagios is working too well!)
We monitor our customers' servers, which are widely distributed on the
net. Most are on DSL, cable or wireless connections. The customers we
have on non-dedicated lines generate real errors whenever they have to
reboot a cable modem etc., but they always recover within a few minutes.
Our alerts typically read "10:48 Host1 is down", then immediately "10:48
Service xxx is down" (repeated for all services), followed by "10:48
Host1 is up" and then "10:48 Service xxx has recovered" (repeated
again).
Therefore, my question is: for host/service checks, can I have Nagios
recheck a host/service 5 or 10 seconds after a failure occurs, before it
generates the first alert? Perhaps it could be configured to send an
email only after the 2nd error condition is seen. Simply double-checking
prior to generating the alert would be helpful to us.
 
Thanks all,
-Don McCallum
 mccallum at nemsys.com 
 419-243-3603x113
 Nemsys LLC 

 



--
Eric Loyd
Drive defensively.  Buy a tank. 