Acknowledgement Escalations

RijilV rijilv at riji.lv
Thu Jan 22 06:08:19 CET 2009


2009/1/21 Mathieu Gagné <mgagne at iweb.com>

> Hi,
>
> Here is the situation:
> Somebody acknowledges a problem and forgets about it.
>
> How would you implement an acknowledgement escalation?
>
> Or how would you detect a situation where a host/service is
> down/critical for too long while being acknowledged?
>
> --
> Mathieu
>
>
Mmmm, there are a couple of technical things you could do for this, but the
root of this problem is people, not computers.  You need to work out a
process where people aren't ack'ing things just so they can fall back
asleep.  I personally suggest having Nagios create a ticket in whatever
ticketing system you use (you use one, right?!) so you can track that issue.
That and having a 24x7 NOC helps :)

Otherwise, write something that looks at the status file and finds
services that are in a non-okay state but acknowledged, and have been for
however long.  I wrote a simple Nagios CFG parser that would be able to
handle it; it's under the GPL and was written at my former company for their
Oracle monitoring:  https://code.bluegecko.net/wiki/Monocle.  There is
another one somewhere on CPAN that will probably work.
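
As a rough sketch of the status-file check described above (this is not the
Monocle parser): it assumes the Nagios 3 status.dat layout, i.e.
"hoststatus {" / "servicestatus {" blocks of key=value lines, and it uses
last_state_change as a stand-in for how long the acknowledged problem has
lasted.  The names parse_status and stale_acks are made up for illustration.

```python
import re
import time

# Matches a block opener such as "servicestatus {" (assumed status.dat layout).
BLOCK_RE = re.compile(r"^\s*(\w+)\s*\{\s*$")


def parse_status(text):
    """Yield (block_type, fields) for each 'name { key=value ... }' block."""
    block_type, fields = None, None
    for line in text.splitlines():
        stripped = line.strip()
        if block_type is None:
            match = BLOCK_RE.match(stripped)
            if match:
                block_type, fields = match.group(1), {}
        elif stripped == "}":
            yield block_type, fields
            block_type, fields = None, None
        elif "=" in stripped:
            key, _, value = stripped.partition("=")
            fields[key] = value


def stale_acks(text, max_age, now=None):
    """Return (host, service, age_seconds) for acknowledged problems
    whose last state change is at least max_age seconds old.
    service is None for host problems."""
    now = time.time() if now is None else now
    stale = []
    for block_type, f in parse_status(text):
        if block_type not in ("hoststatus", "servicestatus"):
            continue
        if f.get("problem_has_been_acknowledged") != "1":
            continue
        if f.get("current_state", "0") == "0":  # 0 is UP / OK
            continue
        changed = int(f.get("last_state_change", "0"))
        if changed == 0:  # no state change recorded, age unknown
            continue
        age = int(now - changed)
        if age >= max_age:
            stale.append((f.get("host_name"), f.get("service_description"), age))
    return stale
```

You would feed it the contents of whatever file status_file points to in
nagios.cfg, e.g. stale_acks(open(path).read(), max_age=4 * 3600), and run it
from cron.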

I would probably write that program to un-acknowledge things as well as
alarm.  If it just alarmed, someone might acknowledge it again and do nothing
about it (since that's the problem you're having).  You can do the
un-acknowledging through the Nagios command file:

http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=116
http://www.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=117

You can get the location of the command file from the macro $COMMANDFILE$.
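
A minimal sketch of writing those commands, assuming the two links above
refer to REMOVE_HOST_ACKNOWLEDGEMENT and REMOVE_SVC_ACKNOWLEDGEMENT and that
the command file accepts the usual "[timestamp] COMMAND;args" line format.
The function names are made up for illustration, and the command file path is
something you pass in yourself.

```python
import time


def remove_ack_command(host, service=None, now=None):
    """Build one external-command line that removes an acknowledgement.

    With service=None the host acknowledgement is removed, otherwise the
    acknowledgement on that host's service.
    """
    ts = int(time.time() if now is None else now)
    if service is None:
        return f"[{ts}] REMOVE_HOST_ACKNOWLEDGEMENT;{host}\n"
    return f"[{ts}] REMOVE_SVC_ACKNOWLEDGEMENT;{host};{service}\n"


def send_command(command_file, line):
    """Write one command line to the Nagios command file (a named pipe).

    Opening the pipe blocks until Nagios has it open for reading, so this
    should only be run while Nagios is up.
    """
    with open(command_file, "w") as pipe:
        pipe.write(line)
```

Something like send_command(path, remove_ack_command("db1", "DISK")) for
each stale acknowledgement would then make Nagios start notifying again on
its normal schedule.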

Cheers,

.r'