a question on hierarchical recovery notifications

Eric Young ericryoung at yahoo.com
Mon Oct 21 13:44:12 CEST 2002


I haven't been able to figure this one out yet, so
hopefully you can help.  I'm considering putting
Nagios on our network but ran into a problem on
Friday.  I 'faked' a major network failure by
turning on an ipchains rule that blocks all ICMP from
my Nagios host.  I was testing about 150 hosts with 1
router as the parent of all of them.  I set the
notification options on the 'children' to only 'd,r'
(after getting many notifications for 'unreachable' on
my first attempt).  When I turned on my new ipchains
rule, the network went down as expected and I only
received a notification for the router.  Woohoo!
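
A minimal sketch of the parent/child logic at work in this test,
assuming the usual rule that a host which fails its check counts as
'unreachable' only when none of its parents are up. The function name
`classify` and the host names are hypothetical, chosen for
illustration; this is not Nagios code.

```python
def classify(name, reachable, parents):
    """Return 'UP', 'DOWN' or 'UNREACHABLE' for the host `name`.

    reachable: dict mapping host name -> True if its check succeeds
    parents:   dict mapping host name -> list of parent host names
    """
    if reachable[name]:
        return "UP"
    host_parents = parents.get(name, [])
    # A host with no parents has nothing to hide behind: it is DOWN.
    if not host_parents:
        return "DOWN"
    # If any parent is still UP, the fault lies with the host itself.
    if any(classify(p, reachable, parents) == "UP" for p in host_parents):
        return "DOWN"
    # Every parent is down or unreachable, so the real state is unknown.
    return "UNREACHABLE"


# Hypothetical two-host version of the test: one router, one child.
parents = {"web1": ["router"]}
blocked = {"router": False, "web1": False}
print(classify("router", blocked, parents))
print(classify("web1", blocked, parents))
```

With 'd,r' set on the children, only the router's state in this
sketch would trigger a problem page, which matches what happened.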

Now, here's the problem.  I left it like that for a
while, and when I brought it back up (i.e. simulating
the router's return to life), I got not only an up
notification for the router but also quite a few
'recovery' pages for the child nodes.

So, in the case of a major failure like this, I'd
rather not get pages for things that were never really
down (i.e. they were just unreachable).  I don't know
of a way to set it up so that a node which was only
unreachable, because its parent was down, doesn't
generate a recovery page (unless, I guess, it had been
checked again the correct number of times).
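
The behaviour being asked for can be sketched as follows: send a
recovery page only if a problem page was sent for that host earlier.
This is a hypothetical model, not how Nagios decides; the names
`HostState`, `problem_notified` and `on_state_change` are made up for
illustration.

```python
from dataclasses import dataclass

UP, DOWN, UNREACHABLE = "UP", "DOWN", "UNREACHABLE"


@dataclass
class HostState:
    name: str
    state: str = UP
    problem_notified: bool = False  # was a problem page actually sent?


def on_state_change(host, new_state, notify_on=("d", "r")):
    """Return the page to send for this transition, or None."""
    old_state = host.state
    host.state = new_state
    if new_state == DOWN and "d" in notify_on:
        host.problem_notified = True
        return f"{host.name} DOWN"
    if new_state == UNREACHABLE and "u" in notify_on:
        host.problem_notified = True
        return f"{host.name} UNREACHABLE"
    if new_state == UP and old_state != UP:
        # Recovery is only worth a page if the problem was paged too.
        send = host.problem_notified and "r" in notify_on
        host.problem_notified = False
        return f"{host.name} RECOVERY" if send else None
    return None


router, child = HostState("router"), HostState("web1")
for host, state in [(router, DOWN), (child, UNREACHABLE),
                    (router, UP), (child, UP)]:
    print(on_state_change(host, state))
```

In this sketch the child never produces a page, because with 'd,r' its
unreachable state was never notified, while the router still gets both
its down and its recovery page.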

Any suggestions?  Am I missing something?
