Subservice Definitions

Sam Stickland sam_ml at spacething.org
Tue Dec 9 11:09:13 CET 2003


I had considered this in the past, but I've also noticed that when equipment
starts to fail you get weird and exotic SNMP traps that are barely mentioned
in any documentation. Those are exactly the ones I need to be looking out
for. If I handpicked which traps to listen for, I'd undoubtedly miss a few.

----- Original Message -----
From: "David Parrish" <david at dparrish.com>
To: "Sam Stickland" <sam_ml at spacething.org>
Cc: <nagios-devel at lists.sourceforge.net>
Sent: Monday, December 08, 2003 11:52 PM
Subject: Re: [Nagios-devel] Subservice Definitions

On Fri, Dec 05, 2003 at 03:12:06PM -0000, Sam Stickland wrote:
> I'm after a slightly better solution for SNMP trap monitoring.
>
> SNMP traps are submitted as passive service checks. All the examples show
> a single service definition for the traps. This has the unfortunate
> effect of the next SNMP trap overwriting the previous one. I'd like to have
> different service definitions for different SNMP traps on our networking
> equipment, but that would result in a LOT of service definitions.
>
> It would be nice if there were a sub-service id that could be submitted with
> passive checks, that effectively divides a single service into multiple
> services on demand, without the need for thousands of service definitions.
>
> Is there a solution to this that I've missed, or is it going to require
> changes to Nagios? I'm a fair programmer so I'm happy to make these changes
> myself, but others might not consider this a good solution to this problem.
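
To make the sub-service idea above concrete, here is a minimal sketch of how
it could be approximated outside Nagios. This is not an existing Nagios
feature. The class name, the sub_id keys and the "highest return code wins"
rule are all assumptions made for illustration. The table remembers the
latest state per sub-service and collapses them into one result for the
single defined service, so a later trap does not hide an earlier one.

```python
class SubserviceTable:
    """Hypothetical helper: latest state per (host, service, sub_id)."""

    def __init__(self):
        # (host, service) -> {sub_id: (return_code, output)}
        self._states = {}

    def update(self, host, service, sub_id, code, output):
        """Record one trap result and return the combined service result."""
        self._states.setdefault((host, service), {})[sub_id] = (code, output)
        return self.summary(host, service)

    def summary(self, host, service):
        """Combine all sub-services: highest return code plus non-OK outputs."""
        entries = self._states.get((host, service), {})
        # Assumption: the numerically highest code (0..3) is treated as worst.
        worst = max((code for code, _ in entries.values()), default=0)
        problems = [
            f"{sub_id}: {text}"
            for sub_id, (code, text) in sorted(entries.items())
            if code != 0
        ]
        return worst, ", ".join(problems) or "no outstanding traps"
```

The combined pair would then be submitted as one ordinary passive check
result, so only one service definition per host is needed.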

I have a similar problem here, with roughly 1000 hosts and almost 10000
services.

I have a Perl daemon running which receives traps and inserts them as
passive check results.
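
For readers unfamiliar with passive check results, the following is a rough
sketch of what "inserting" one involves. It is not the Perl daemon described
above. It formats a PROCESS_SERVICE_CHECK_RESULT line for the Nagios external
command file. The function names are illustrative, and the command file path
is site-specific and must be supplied by the caller.

```python
import os
import time

# Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def format_passive_result(host, service, code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if code not in (OK, WARNING, CRITICAL, UNKNOWN):
        raise ValueError(f"invalid return code: {code}")
    timestamp = int(time.time() if now is None else now)
    # Commands are one per line, so a newline in the output would end it early.
    clean = output.replace("\n", " ")
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{code};{clean}\n"
    )


def submit(line, command_file):
    """Write the line to the external command file (a named pipe).

    Opening the pipe blocks until Nagios has it open for reading.
    """
    fd = os.open(command_file, os.O_WRONLY | os.O_APPEND)
    try:
        os.write(fd, line.encode())
    finally:
        os.close(fd)
```

The host and service names must match an existing service definition,
which is why the number of definitions matters in this thread.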

For those reading at home, I didn't want to have the net-snmp trap daemon
(snmptrapd) call a Perl process, because doing a fork() & exec() for every
single trap is SLOW when I receive about 5 every second. I get lots because
my systems send status updates as well as faults.
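
The difference can be sketched as follows: one long-lived process reads
datagrams in a loop and handles each with a plain function call, so no new
process is started per trap. This is only an outline under assumptions. The
function names are made up, decoding the SNMP payload is left to the handler,
and binding to the standard trap port 162 normally requires root privileges.

```python
import socket


def open_trap_socket(host="0.0.0.0", port=162):
    """Create and bind a UDP socket for incoming traps."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def serve(sock, handle, max_packets=None):
    """Receive datagrams and pass each one to handle(data, sender_ip).

    max_packets is a hypothetical stop condition for testing. With the
    default of None the loop runs forever, as a daemon would.
    """
    seen = 0
    while max_packets is None or seen < max_packets:
        data, addr = sock.recvfrom(65535)
        handle(data, addr[0])  # in-process call, no fork()/exec()
        seen += 1
```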

The way I deal with it is by creating service definitions for critical
stuff, and ignoring almost everything else.
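
In outline, that filtering might look like the sketch below. The table
contents are assumptions for illustration: the two OIDs are the standard
linkDown and linkUp notifications, but the service description and the choice
of which traps count as critical are invented, not taken from the setup
described here.

```python
# Hypothetical table: trap OID -> (service description, Nagios return code).
WATCHED_TRAPS = {
    "1.3.6.1.6.3.1.1.5.3": ("Link Status", 2),  # linkDown -> CRITICAL
    "1.3.6.1.6.3.1.1.5.4": ("Link Status", 0),  # linkUp   -> OK
}


def classify(trap_oid):
    """Return (service, code) for a watched trap, or None to ignore it."""
    # Accept OIDs written with or without a leading dot.
    return WATCHED_TRAPS.get(trap_oid.lstrip("."))
```

Only traps for which classify() returns a value need a matching service
definition. Everything else is dropped, which is also where a rare,
undocumented trap would be lost.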

It's amazing how many traps Cisco devices, for example, send that we really
don't need to deal with. Ignoring them has let me receive alerts for the
stuff I need while averaging only 10 services per host.

It's probably not ideal, but it does what we need.

--
Regards,
David Parrish
0410 586 121



