Notifications not reaching me

Andreas Ericsson ae at op5.se
Tue Dec 7 03:18:35 CET 2004


Gary Lawrence Murphy wrote:
>>>>>>"A" == Andreas Ericsson <ae at op5.se> writes:
> 
> 
>     A> You've got notification_interval 0, which means it will only
>     A> send one notification. I'm surprised you get any at all for
>     A> your escalations where first_notification is higher than 1.
> 
> Probably because that value was overridden. Thanks.
> 
> This use of notification_interval=0 was ambiguous in my reading of the
> docs: I'd thought it would limit /notification/ to once _per_ _rule_,
> such that there'd be one notice at the early-warning level, one at the
> alarm threshold (I then had hourly once it escalated to SMS) -- I
> didn't consider that it would also limit the retry-tests and
> escalation.
> 
> So does this also mean that if the retry check-interval is 5 min but
> the notification interval is 60, and the escalation is set to the 3rd
> notification, that the notice will not escalate until the third hour?
> (rather than 15 min)
> 
> Thanks -- this clears up a lot.  
> 
> I had understood notification_interval as only a throttle on the
> number of notification messages, as in "will not send email/SMS any
> more frequently than every notification_interval minutes, but will
> continue to check for status change every retry_check_interval
> minutes."  It just seemed logical to me that we'd want to continue to
> poll the service at the prescribed rate for retry tests but not send
> emails every 2 minutes.
> 
> So, then, is this correct? ...
> 
> 1) if the service check fails, nagios will re-check every
> retry_check_interval until it tests good or has failed
> max_check_attempts times
>

Right on so far.

> 2) after all max_check_attempts tests _fail_, nagios will send one
> notice, then pause notification_interval before it checks again
> 
> 3) if it still fails, the notification count is increased and it waits
> notification_interval between checks until the service comes back up.
> 

I think that when a service has been determined to be in a HARD error 
state, Nagios waits normal_check_interval before checking it again. 
Notifications are suppressed during that time, since it doesn't make 
much sense to just re-send that same notification. I believe this is 
done differently in Nagios 2 (if it isn't, it should be), where 
notifications are re-sent no matter when the service was last checked.

I might be wrong about that last one. I haven't looked at the 1.x code 
in a very long time.
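
To put rough numbers on the escalation question above, here is a small Python sketch of the timeline as I read it. It is only an illustration of the simplified model discussed in this thread, not how the Nagios scheduler works internally. The function name is made up, and max_check_attempts=3 is an assumed value (the 5-minute retry and 60-minute notification interval are from the question).

```python
def notification_times(retry_interval, max_attempts, notification_interval, count):
    """Minutes after the first failed check at which notifications go out.

    Simplified model (an assumption, not Nagios internals): the first failed
    check counts as attempt 1, retries follow every retry_interval minutes,
    and the first notification is sent when the last attempt fails (HARD).
    """
    hard_at = (max_attempts - 1) * retry_interval
    if notification_interval == 0:
        # notification_interval 0: only a single notification is ever sent
        return [hard_at]
    # Later notifications are spaced by notification_interval, not by retries
    return [hard_at + n * notification_interval for n in range(count)]


times = notification_times(retry_interval=5, max_attempts=3,
                           notification_interval=60, count=3)
print(times)  # [10, 70, 130]
```

Under those assumptions the third notification, which is the one an escalation with first_notification 3 would pick up, goes out roughly two hours after the first one, not after 15 minutes.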

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Lead Developer


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users