Notifications - best common practice
Jonathan Dill
jonathan at nerds.net
Thu May 31 23:06:45 CEST 2007
Jim Avery wrote:
> During working hours, most notifications go to the service desk. It's
> up to them to make sure they get routed to the right people.
>
> Outside normal hours, only issues which can't wait until the morning
> get notified to the on-call team.
>
> That's just my take on it :-)
>
Also in case no one has mentioned it, when you first set up Nagios, you
want it to be a little less aggressive about alerts until you get a
handle on false alarms and most of your parent / child dependencies are
ironed out. Start with e-mail alerts, add text messaging later. You
don't want to be getting loads of text messages on your cell phone at
3-4am because some huge backup job is hogging the network and
temporarily caused some pings to time out, for example.
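
As a rough sketch of what "less aggressive" can look like in the object
configuration: declaring parents lets Nagios treat hosts behind a dead
router as unreachable rather than down, and a few retries keep one slow
ping during a backup from turning into an alert. The host names, the
addresses and the generic-host / generic-service templates are
assumptions (the templates are borrowed from the sample config that
ships with Nagios), and the numbers are only starting points. The
interval names are the Nagios 2.x spellings; later versions document
them as check_interval and retry_interval.

```
# Hypothetical router in front of a remote site.
define host {
    use         generic-host        ; template assumed from the sample config
    host_name   remote-router       ; hypothetical name
    alias       Remote site router
    address     192.0.2.1           ; documentation address, not a real host
}

# Hypothetical server behind that router. With "parents" set, Nagios
# reports it as UNREACHABLE instead of DOWN when the router is down.
define host {
    use         generic-host
    host_name   remote-web
    alias       Remote site web server
    address     192.0.2.10
    parents     remote-router
}

# A failed ping is rechecked before anyone is notified.
define service {
    use                     generic-service ; template assumed from the sample config
    host_name               remote-web
    service_description     PING
    check_command           check_ping!100.0,20%!500.0,60%
    max_check_attempts      4   ; notify only after 4 consecutive failures
    normal_check_interval   5   ; minutes between normal checks
    retry_check_interval    1   ; minutes between rechecks after a failure
}
```

With these example values a problem has to persist for roughly three
minutes of rechecks before a notification goes out, which is usually
enough to ride out a short burst of network congestion.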
Personally, to start out, I set up a separate e-mail account and set up
alerts for everything that I could think of to go to that one account
(if I had had an IP-aware toaster or refrigerator, I would have put that
in there as well). This also served to "exercise" the system. After a
week or so of not getting false positives from some alert, I moved it
over to notifying "real" accounts and added escalation, then a bit after
that, added text messaging. Some of the more frivolous tests I just
took out or disabled. Some tests had to be "relaxed" a bit, for example
a longer ping timeout for a remote site on a slow DSL connection that
gets flaky sometimes. For the text messaging, I set up the time window
that is appropriate for the need, i.e. only very critical stuff is going
to wake me up in the middle of the night; a lot of other things I can
find out about first thing in the morning when I check my e-mail.
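
The staged rollout described above might be sketched roughly like this.
Everything named here is an assumption for illustration: the contact
and contact group names, the example.com addresses, the pager
notification commands (you would define those yourself for whatever
e-mail-to-SMS gateway you use) and the host and service being
escalated. The 24x7 time period and the notify-*-by-email commands are
the ones from the sample config; older sample configs name the e-mail
commands differently.

```
# Stage 1: a catch-all mailbox that receives every alert while new
# checks are being shaken out.
define contact {
    contact_name                    nagios-staging      ; hypothetical
    alias                           Staging mailbox for new checks
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_options       d,u,r
    service_notification_options    w,u,c,r
    host_notification_commands      notify-host-by-email
    service_notification_commands   notify-service-by-email
    email                           nagios-staging@example.com
}

# Stage 3: text messages, limited to hard DOWN / CRITICAL states so
# that only the serious problems arrive at night.
define contact {
    contact_name                    oncall-pager        ; hypothetical
    alias                           On-call text messages
    host_notification_period        24x7
    service_notification_period     24x7
    host_notification_options       d
    service_notification_options    c
    host_notification_commands      notify-host-by-pager    ; hypothetical command
    service_notification_commands   notify-service-by-pager ; hypothetical command
    pager                           5551234567@sms.example.com
}

# Stage 2: escalate a problem that is still unresolved at the third
# notification to a wider group.
define serviceescalation {
    host_name               remote-web      ; hypothetical host
    service_description     PING
    first_notification      3
    last_notification       0               ; 0 = keep escalating until recovery
    notification_interval   30              ; minutes between escalated notices
    contact_groups          oncall          ; hypothetical contact group
}

# A "relaxed" ping check for a slow link: looser round-trip and
# packet-loss thresholds than the defaults used elsewhere.
define service {
    use                     generic-service ; template assumed from the sample config
    host_name               remote-web
    service_description     PING-DSL
    check_command           check_ping!500.0,40%!1500.0,80%
}
```

Moving a check from the staging mailbox to "real" accounts is then just
a matter of changing which contact groups the service notifies. The
relaxed check loosens the warning and critical thresholds rather than
the plugin timeout itself, since the sample check_ping command only
passes those two arguments.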
One guy I ended up removing from escalations later because he complained
that it was "too much e-mail" :rolleyes: No matter how much advice you
get, the process is still going to be "ad hoc" to some extent--you want
Nagios to be aggressive enough to be useful, but not so aggressive that
it is so annoying as to be counter-productive, but what those limits are
is really subjective. It's kind of an in-joke in our group: any time
someone gets a text message, it's the knowing look and, "Thank you,
Nagios..." But it sure impresses the heck out of our customers when we
call them to tell them that there is a problem and we are working on it
and they haven't even noticed the problem yet; they just think we are
psychic.
Jonathan