Notifications - best common practice

Jonathan Dill jonathan at nerds.net
Thu May 31 23:06:45 CEST 2007


Jim Avery wrote:
> During working hours, most notifications go to the service desk.  It's
> up to them to make sure they get routed to the right people.
>
> Outside normal hours, only issues which can't wait until the morning
> get notified to the on-call team.
>
> That's just my take on it :-)
>   
Also in case no one has mentioned it, when you first set up Nagios, you 
want it to be a little less aggressive about alerts until you get a 
handle on false alarms and most of your parent / child dependencies are 
ironed out.  Start with e-mail alerts, add text messaging later.  You 
don't want to be getting loads of text messages on your cell phone at 
3-4am because some huge backup job is hogging the network and 
temporarily causing some pings to time out, for example.
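
As a rough sketch of what "ironing out" parent / child relationships can 
look like in the object configuration, assuming a "generic-host" template 
that supplies the remaining required directives (check command, contacts 
and so on).  The host names and addresses below are placeholders, not 
taken from any real setup:

```cfg
# Hypothetical hosts: a router and a server that sits behind it.
define host{
    use                 generic-host      ; assumed template, as in the sample configs
    host_name           core-router
    alias               Core router
    address             192.0.2.1
    max_check_attempts  5                 ; re-check 5 times before a hard DOWN state
    }

define host{
    use                  generic-host
    host_name            remote-fileserver
    alias                Remote file server
    address              192.0.2.20
    parents              core-router      ; router down => this host is UNREACHABLE, not DOWN
    max_check_attempts   5                ; rides out a few timed-out pings
    notification_options d,r              ; notify on DOWN and recovery, not UNREACHABLE
    }
```

With the parent declared, an outage of the router should produce one 
alert for the router rather than one for every host behind it, and the 
higher max_check_attempts gives a busy network a few retries before 
anyone is notified.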

Personally, to start out, I set up a separate e-mail account and pointed 
alerts for everything that I could think of at that one account (if I 
had had an IP-aware toaster or refrigerator, I would have put that in 
there as well).  This also served to "exercise" the system.  After a 
week or so of not getting false positives from a given alert, I moved it 
over to notifying "real" accounts and added escalation, then a bit after 
that, added text messaging.  Some of the more frivolous tests I just 
took out or disabled.  Some tests had to be "relaxed" a bit, for example 
a longer ping timeout for a remote site on a slow DSL connection that 
gets flaky sometimes.  For the text messaging, I set up the time window 
that is appropriate for the need, i.e. only very critical stuff is going 
to wake me up in the middle of the night; a lot of other things I can 
find out about first thing in the morning when I check my e-mail.
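
For illustration only, here is roughly how that split could be written 
down.  Everything named here is an assumption: the contact names, the 
addresses, the "24x7" timeperiod, the "generic-service" template, the 
notification command names (they differ between Nagios versions, so use 
whatever your own commands are called) and a check_ping command that 
takes the warning and critical thresholds as $ARG1$ and $ARG2$, as in 
the sample configs:

```cfg
# Catch-all e-mail contact: hears about everything, read in the morning.
define contact{
    contact_name                   jdoe-email
    alias                          J. Doe (e-mail)
    email                          nagios-staging@example.com
    host_notification_period       24x7
    service_notification_period    24x7
    host_notification_options      d,u,r
    service_notification_options   w,u,c,r
    host_notification_commands     host-notify-by-email   ; assumed command name
    service_notification_commands  notify-by-email        ; assumed command name
    }

# Pager contact: only hard failures are allowed to wake someone up.
define contact{
    contact_name                   jdoe-pager
    alias                          J. Doe (text message)
    pager                          5555550100@sms.example.com
    host_notification_period       24x7
    service_notification_period    24x7
    host_notification_options      d                      ; DOWN only
    service_notification_options   c                      ; CRITICAL only
    host_notification_commands     host-notify-by-epager  ; assumed command name
    service_notification_commands  notify-by-epager       ; assumed command name
    }

# "Relaxed" ping check for a remote site on a slow link.
define service{
    use                  generic-service
    host_name            remote-dsl-site
    service_description  PING
    check_command        check_ping!500.0,40%!1000.0,80%  ; warn / crit: RTT ms, packet loss
    max_check_attempts   5
    }
```

While still shaking things out, only the first contact needs to be in 
the contact groups; the pager contact can be added to a group once an 
alert has proven itself.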

One guy I ended up removing from escalations later because he complained 
that it was "too much e-mail" :rolleyes:  No matter how much advice you 
get, the process is still going to be "ad hoc" to some extent.  You want 
Nagios to be aggressive enough to be useful, but not so aggressive that 
it becomes annoying and counter-productive, and where those limits lie 
is really subjective.  It's kind of an in-joke in our group: any time 
someone gets a text message, it's the knowing look and, "Thank you, 
Nagios..."  But it sure impresses the heck out of our customers when we 
call to tell them that there is a problem and we are working on it 
before they have even noticed it.  They just think we are psychic.

Jonathan
