Wish: Multiple instances of alerts on the same service/host

Marlo Bell BELLMR at ldschurch.org
Mon Mar 19 16:43:39 CET 2007


I didn't quite understand the disk example. But a simple and popular answer to the trap issue is a program called SNMPTT.
 
Have SNMPTT catch all the traps you send and log them to a MySQL database. Then have an "SNMP Traps" service actively check the database for all new traps for the associated host.
 
I modified SNMPTT's DB just slightly--adding an "acknowledge" column--and then wrote a simple PHP page that allows a user to acknowledge one or more traps for that particular host.
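
As a rough illustration of the check described above (this is not the attached check_traps.pl), here is a minimal Python sketch. It assumes a table named `snmptt` with `id`, `hostname`, and `formatline` columns plus the added `acknowledge` flag; those names are assumptions and may not match your SNMPTT version. An in-memory SQLite database stands in for MySQL so the sketch runs on its own.

```python
import sqlite3

# Nagios plugin exit codes.
OK, CRITICAL = 0, 2


def check_traps(conn, host):
    """Return (exit_code, output) for unacknowledged traps logged for `host`.

    Assumed schema (hypothetical, adjust to the real one): table `snmptt`
    with columns `id`, `hostname`, `formatline`, and the added `acknowledge`
    flag, where 0 means "not yet acknowledged".
    """
    rows = conn.execute(
        "SELECT formatline FROM snmptt "
        "WHERE hostname = ? AND acknowledge = 0 ORDER BY id",
        (host,),
    ).fetchall()
    if not rows:
        return OK, "OK - no unacknowledged traps"
    messages = "; ".join(row[0] for row in rows)
    return CRITICAL, f"CRITICAL - {len(rows)} unacknowledged trap(s): {messages}"


def demo_db():
    """Build an in-memory SQLite stand-in for the trap database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE snmptt (id INTEGER PRIMARY KEY, hostname TEXT, "
        "formatline TEXT, acknowledge INTEGER DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO snmptt (hostname, formatline) VALUES (?, ?)",
        [("web1", "Fan 2 not OK"), ("web1", "battery needs replacement")],
    )
    return conn
```

A real plugin would print the returned text and exit with the returned code. Because every unacknowledged trap is listed in the output, a later trap does not hide an earlier one.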
 
I'll leave SNMPTT for you to get set up, but I'll include my simple plugin and PHP page. Feel free to modify them to meet your needs and your version of SNMPTT. I recommend using the PHP page as the "action_url" for the trap service, passing in the $HOSTNAME$ macro as a GET parameter. These are both pretty quick and dirty, so don't be too critical.
 
Good luck.
 
Marlo

>>> Ståle Askerød Johansen <s.a.johansen at usit.uio.no> 3/19/2007 5:51 AM >>>

(This may appear twice. I fumbled with my subscription confirmation)


Here at the University of Oslo we are currently running Nagios
alongside our current monitoring system in order to check if
Nagios suits our needs.

So far, we are very happy with most of what we see. However, we
also consider using Nagios (with some suitable www-interface) as
our primary alarm console. This means that we will want to feed lots
of passive checks into Nagios from several other systems.

Let me give you an example:

- we want to forward SNMP traps to Nagios from the management cards of 
our Dell and HP servers.
- we set up our trap receivers to submit these through NSCA.
- on the Nagios server, we define the service "snmp trap" on all the 
relevant hosts. The service is volatile and not active.
- we test.
- the hardware sends, for instance, "Fan 2 not OK". Nagios receives this 
as a critical event. Let's pretend the operator takes some time to fix this.
- in the meantime, the hardware on the same host sends, for instance, 
"battery needs replacement". Nagios receives this as a critical event, 
but the previous event is NO LONGER visible in the interface.
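
The sequence above can be sketched in Python. The first function builds one passive result in the tab-separated form that send_nsca reads on standard input (host, service description, return code, plugin output). The second is only a toy model of "one current state per host/service pair", not Nagios internals; the host name `srv1` and both function names are hypothetical.

```python
# Nagios service states as used in passive check results.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def format_passive_result(host, service, state, output):
    """Build one tab-separated line in the form send_nsca reads on stdin."""
    # Tabs or newlines inside the output would break the record, so
    # collapse all whitespace runs to single spaces.
    clean = " ".join(output.split())
    return f"{host}\t{service}\t{state}\t{clean}\n"


def record(status, line):
    """Toy model: keep only the latest result per (host, service) pair."""
    host, service, state, output = line.rstrip("\n").split("\t")
    status[(host, service)] = (int(state), output)


status = {}
record(status, format_passive_result("srv1", "snmp trap", CRITICAL, "Fan 2 not OK"))
record(status, format_passive_result("srv1", "snmp trap", CRITICAL,
                                     "battery needs replacement"))
# Only the battery event remains; the fan event has been replaced.
print(status[("srv1", "snmp trap")])
```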

Some may argue that we need to make separate services for each type of 
trap we want to receive, but sheer numbers make this not very elegant.


We need a way to tell Nagios that "this service is of a special kind 
whose events should not replace each other as they are received". This 
will make it easier to use Nagios and a suitable web-gui as a central 
alarm receiver without adding thousands of new services.

The same problem also makes it difficult to make, for instance, a plugin 
that monitors all userdisks on a host and reports to a service 
"userdisks", since the events will overwrite each other.

Has anyone else thought of this? Is it difficult to implement? Are we 
wrong in assuming that this is impossible with the present Nagios? Have 
we misunderstood completely? Is it a stupid and childish idea? :-)

-- 
Ståle Johansen, sysadmin, University of Oslo, Norway.

_______________________________________________
Nagios-devel mailing list
Nagios-devel at lists.sourceforge.net 
https://lists.sourceforge.net/lists/listinfo/nagios-devel

-------------- next part --------------
A non-text attachment was scrubbed...
Name: check_traps.pl
Type: application/octet-stream
Size: 1307 bytes
Desc: not available
URL: <https://www.monitoring-lists.org/archive/developers/attachments/20070319/839df3b2/attachment.obj>

