R: acknowledge

jeff vier boinger at tradingtechnologies.com
Thu May 11 20:56:31 CEST 2006


On Thu, 2006-05-11 at 18:21 +0200, Hugo van der Kooij wrote:
> On Thu, 11 May 2006, jeff vier wrote:
> 
> > On Thu, 2006-05-11 at 17:01 +0200, Marco Borsani wrote:
> > > Well...
> > >
> > > I tried the link you suggested, but I do not understand how it can help me....
> >
> > You said "My target is to reduce the number of CRITICAL flags on the
> > Nagios Web Page."
> 
> Come to think of it. I find this a rather peculiar target. I would say
> solving the problem is your target and getting all green across the board
> with Nagios is the way to show it.
> 
> Just getting everything green without solving the problem sounds rather
> shortsighted to me. Why bother to perform the checks? Just show some
> greenish static pictures and everyone is happy. I bet you can even reduce
> staff.

For us, the problem is that with over 6,000 services and 5 different
teams (Software, Networking, Infrastructure, Unix Administration, and
Internal Development) all working with the same monitoring system,
having the screens in the NOC showing problems that are already being
actively worked on would be horribly annoying.

For instance, the Network Team sees an alert about a router being down -
before the router goes to Hard Down, though, 4 services get to a Soft
Critical.  They are on the problem and dealing with it, so they Ack it.
The Software Team doesn't need to keep staring at the problem on the
projectors at the front of the NOC, so such filtering allows a much more
useful front screen.

Not to mention, over here, that router is very likely at a client site,
and they powered it off (either by accident or just without telling us)
and we CAN'T fix it ourselves.

And what about Downtime?  We have Windows servers that need updates and
whatnot - when they're in Downtime and go all-red, the NOC doesn't need
to see that crap.

And what about hardware failures on a primary server that gets
temporarily swapped with an on-site spare?  We don't need to see all-red
spare servers for a week before they're replaced.
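
The filtering described above (hide anything that is already Acked or in
Downtime, show everything else that is not OK) can be sketched roughly as
below. This is only an illustration of the rule, not how Nagios implements
it: the ServiceStatus record, the noc_view() helper, and the host and
service names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServiceStatus:
    # Hypothetical record; field names are assumptions, not Nagios's own.
    host: str
    service: str
    state: str                  # "OK", "WARNING", "CRITICAL" or "UNKNOWN"
    acknowledged: bool = False  # someone has Acked the problem
    in_downtime: bool = False   # host/service is in scheduled Downtime


def noc_view(statuses):
    """Return only the problems that nobody has claimed yet."""
    return [
        s for s in statuses
        if s.state != "OK" and not s.acknowledged and not s.in_downtime
    ]


# Made-up sample data covering the cases from this thread.
statuses = [
    ServiceStatus("router1", "PING", "CRITICAL", acknowledged=True),
    ServiceStatus("win01", "Updates", "CRITICAL", in_downtime=True),
    ServiceStatus("web01", "HTTP", "CRITICAL"),
    ServiceStatus("db01", "Disk", "OK"),
]

for s in noc_view(statuses):
    print(f"{s.host}/{s.service}: {s.state}")
```

With that sample data, only the web01 HTTP problem would reach the front
screen; the Acked router and the server in Downtime stay out of the way but
are still being checked.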

Not to mention, our NOC is a "show piece" when the execs are touring
clients around to show them how we watch the services we provide them.
Having a bunch of Red/Yellow/Orange up there is no good, if we can help
it.

When we were <10 people (3 years ago) and worked in a room with no
windows, this wasn't necessary (only about 600 or 800 services then,
too).  Now we're over thirty (closer to forty) people in a
partially-glass-walled, tiered NOC, and we have the aforementioned 6,000+
services (and another thousand or so going in over the next month with
more clients coming on board).

So, it's all an issue of scale and of what the individual team (or
teams) need and want.

Of course, it would be easier if everything just stayed green,
though :)