Re: Recovery not getting sent during downtime?

srunschke at abit.de srunschke at abit.de
Thu Aug 3 01:07:33 CEST 2006


nagios-users-bounces at lists.sourceforge.net wrote on 02.08.2006 18:36:18:

> I think you might want to rethink your process so that it matches the
> paradigm of the tool you are using. It already can do what you want if
> you just work with the way it was designed, rather than forcing a
> process that breaks the paradigm.

I think you might want to re-read my mail; it seems you did
not fully understand what I meant ;)

> >Service goes critical
> This is not a scheduled event. It is unscheduled.

I never said anything else.
I never said I scheduled a downtime for the SERVICE.
I said I scheduled a downtime for the HOST, because I was
forced to reboot the HOST to fix the SERVICE problem.
I didn't have the option of acknowledging the service,
fixing it and being happy. Sometimes certain services of some
broken OS tend to castrate themselves and only a reboot
can fix it. If I do not schedule a HOST downtime, then
SMSs get dispatched for the HOST going down and coming up
and then for the service recovering. Not exactly the
behaviour I'd like to see.

> Since the unscheduled event has already been acknowledged and everyone
> who might want to jump in to help already knows it is being handled,
> there is no need to schedule a downtime for an unscheduled event. Just
> acknowledge it. 
> 
> >Reboot
> >Host/Service OK
> Recovery Note goes out to everyone, including CTO. Problem Solved.
> Everyone knows what is happening. Your performance evaluation gets a
> boost for being the one to solve the problem.

Uhm, and what about the 2 SMSs going out to everyone stating that the
host went down and came up again? That is neither expected (by the rest
of the admins, for example) nor wanted behaviour.

I DO know what the documentation says regarding scheduled downtimes.
I DO know it says it suppresses all alerts.
Yet I still say it should not suppress recoveries for critical/warning/
unknown states which occurred before a scheduled downtime. Somewhere else in the
documentation (or was it said by Ethan somewhere else?) it states that for
every alert sent, there will be a recovery if it goes OK again.
And alas, Ethan seems to agree with me, so I can't be that wrong, eh?
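
To make the behaviour I am arguing for concrete, here is a minimal sketch
in Python. It is not Nagios code: the function name should_notify and its
parameters are made up for illustration, and it assumes we can tell whether
a problem notification already went out before the downtime started.

```python
def should_notify(notification_type, in_downtime, problem_notified_before_downtime):
    """Decide whether a notification should be sent (hypothetical rule).

    Scheduled downtime suppresses everything except the recovery for a
    problem that was already announced before the downtime began.
    """
    if not in_downtime:
        # Outside of downtime nothing is suppressed by this rule.
        return True
    if notification_type == "RECOVERY" and problem_notified_before_downtime:
        # People were told about the problem, so tell them it is fixed.
        return True
    # Everything else during downtime (host down/up, new problems) stays quiet.
    return False
```

In my scenario the service went critical before the host downtime, so its
recovery would still go out, while the host going down and up during the
reboot would stay silent.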

regards
        Sascha

--
Sascha Runschke
Netzwerk Administration
IT-Services

ABIT AG
Robert-Bosch-Str. 1
40668 Meerbusch

Tel.:+49 (0) 2150.9153.226
Mobil:+49 (0) 173.5419665
mailto:SRunschke at abit.de

http://www.abit.net
http://www.abit-epos.net
---------------------------------
Sicherheitshinweis zur E-Mail Kommunikation /
  Security note regarding email communication:
http://www.abit.net/sicherheitshinweis.html

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null