[PATCH] notifications: Fix first_notification_delay

Andreas Ericsson ae at op5.se
Thu Dec 13 10:36:24 CET 2012


On 12/12/2012 08:10 PM, Jochen Bern wrote:
> On 11.12.2012 22:56, Jim Avery wrote:
>> We want to send an SMS notification if the UPS goes on to battery, but only
>> if it has been on battery for more than, say, five minutes.  I had hoped
>> that first_notification_delay would give me that possibility.  Obviously as
>> this is a passive check [...]
>>
>> Please forgive me that I don't understand the programmatical issues well
>> enough to see if any of the proposed solutions so far will fit this use
>> case.
> 
> I'd say that our discussion is perpendicular to the issue you raise -
> or, in other words, that first_notification_delay is unlikely to
> suddenly work the way you want afterwards.
> 

It's actually not all that far-fetched. Since the event queue is now so
fast, it would make sense to use it to schedule notifications as well.
Not because we'd gain a lot of CPU time from it (quite the reverse,
though we'd lose only a small amount), but because it would give us a
single flow of tasks going to the workers.

Right now, notifications always happen while we're reaping checks from
our workers, which means we're always working in duplex mode with the
sockets. It's also possible that one reaping operation will reset, or
at least negate, the need to notify for a particular service (if, for
example, the check of the host where the service resides is pending to
be reaped from a different worker, but we happen to see the service
check first).

Scheduling the notification to happen (instead of firing it directly)
means we can ensure we always reap all pending checks before hitting
the notification event. That would be a good thing. It's still possible
that a hostcheck or a "resetting" check is in-flight when we notify,
but we reduce the chance a bit. That, together with the
one-flow-to-workers thing, might make the move to scheduled
notifications worthwhile (although most of them would be scheduled to
run "now", there would still be a microscopic delay). If we do make
that move, it would be absolutely
trivial to make the first_notification_delay work exactly like Jim
wants, and like most people who haven't read about it think it should
work, based on its name.
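
To make the idea concrete, here is a rough sketch in Python of what
scheduled notifications could look like. It is not Nagios code: the
EventQueue and Service classes, the handle_result and run_until names,
and the use of plain seconds for first_notification_delay are all
assumptions made for illustration. Each new problem puts a notification
event on the queue, and the event re-checks the service state when it
fires, so a recovery reaped in the meantime cancels it.

```python
import heapq
import itertools


class EventQueue:
    """Toy time-ordered event queue (hypothetical, not the Nagios one)."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker: equal times run in insertion order

    def schedule(self, when, callback):
        heapq.heappush(self._heap, (when, next(self._seq), callback))

    def run_until(self, now):
        # Run every event that is due at or before `now`.
        while self._heap and self._heap[0][0] <= now:
            when, _, callback = heapq.heappop(self._heap)
            callback(when)


class Service:
    """Hypothetical service object; the delay is in seconds here (an assumption)."""

    def __init__(self, name, first_notification_delay=0):
        self.name = name
        self.first_notification_delay = first_notification_delay
        self.state = "OK"
        self.generation = 0  # incremented each time a new problem starts
        self.sent = []       # (time, state) of notifications actually sent

    def handle_result(self, queue, now, state):
        """Called when a check result (active or passive) is reaped."""
        was_ok = self.state == "OK"
        self.state = state
        if state != "OK" and was_ok:
            self.generation += 1
            gen = self.generation
            # Schedule the notification instead of firing it directly.
            queue.schedule(now + self.first_notification_delay,
                           lambda when: self._notify(when, gen))

    def _notify(self, when, gen):
        # Re-check at fire time: skip if the service recovered, or if this
        # event belongs to an earlier problem that has since ended.
        if self.state == "OK" or gen != self.generation:
            return
        self.sent.append((when, self.state))


queue = EventQueue()
ups = Service("ups-on-battery", first_notification_delay=300)

ups.handle_result(queue, 0, "CRITICAL")    # UPS goes on battery
ups.handle_result(queue, 120, "OK")        # mains power back after two minutes
queue.run_until(300)                       # event fires, sees OK, sends nothing
short_outage_sent = list(ups.sent)

ups.handle_result(queue, 600, "CRITICAL")  # second outage, this one lasts
queue.run_until(900)                       # still on battery after five minutes

# With no delay, the event is due "now" but still runs only after reaping.
flap = Service("flapping-service")
flap.handle_result(queue, 1000, "CRITICAL")
flap.handle_result(queue, 1000, "OK")      # "resetting" result reaped in the same pass
queue.run_until(1000)
```

In this sketch the two-minute outage produces no notification, the
second outage produces one at t=900, and the zero-delay service sends
nothing because its resetting result was reaped before the event ran.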

I think that's one of those changes that'll have to go into 4.1 though.
The 4.0 beta is just around the corner, and introducing any major
architectural changes right before it hits users would be a very bad
thing indeed.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
