[PATCH] Re: alternative scheduler

Fredrik Thulin ft at it.su.se
Fri Dec 3 10:06:22 CET 2010


On Wed, 2010-12-01 at 15:40 +0100, Fredrik Thulin wrote:
> On Wed, 2010-12-01 at 15:14 +0100, Andreas Ericsson wrote:
> ...
> > > Host checks were still being scheduled, and every time a host check was
> > > found at the front of event_list_low, Nagios would log "We're not
> > > executing host checks right now, so we'll skip this event." and then
> > > sleep for sleep_time seconds (0.25 was my setting, based on (Ubuntu)
> > > defaults) (!!!).
> >  
> > 
> > This should only happen if you've set a check_interval for hosts but
> > have disabled them globally, either via nagios.cfg or via an external
> > command. It seems weird that we run usleep() instead of just issuing
> > a sched_yield() or something though, which would be a virtual noop
> > unless other processes are waiting to run.
> 
> Guilty of setting a check_interval for hosts, even on slave servers,
> yes.

Mea culpa. This sounded so plausible that I confessed right away, but
upon actually looking at my host template (all hosts use this), I don't
see what makes Nagios schedule host checks. I've since tried to tune the
reaping pass by disabling flap detection, perf_data, event_handler and
notifications on the check slave servers, without any dramatic
improvement. This is what I was running at the time:

define host {
 name                          SU-generic-host
 notifications_enabled         1
 event_handler_enabled         1
 flap_detection_enabled        1
 process_perf_data             1
 retain_status_information     1
 retain_nonstatus_information  1

 max_check_attempts            10
 notification_interval         1
 notification_period           24x7
 notification_options          d,u,r

 register                      0
}

> > > I made the attached minimalistic patch to not sleep if the next event in
> > > the event list is already due.
> > > 
> > 
> > Seems sensible, but I think it can be improved, such as issuing either
> > a sched_yield() or, if sched_yield() is not available, running usleep(10)
> > every 100 skipped items or so. That would avoid pinning the cpu but would
> > still be a lot faster than what we have today.
> 
> What is sched_yield? I can't find that function anywhere in the source
> code. Feel free to improve the patch - as I've previously said C isn't
> my game.

Since you haven't responded or elaborated on your enhancement
suggestion, how about applying the patch I sent until someone works up
the incentive to improve it further?
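
For anyone else wondering about sched_yield(): it is a POSIX call
declared in <sched.h> that gives up the CPU to other runnable processes
and returns. Below is a rough sketch of one reading of the suggestion
above: yield on every skipped event where sched_yield() is available,
otherwise take a very short sleep on every 100th skipped event. This is
not the attached patch and not Nagios code. The names event_is_due(),
relax_after_skip() and SKIPS_PER_PAUSE are made up for illustration, and
nanosleep() is used here in place of usleep(10).

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>   /* sched_yield() */
#include <time.h>    /* time_t, struct timespec, nanosleep() */
#include <unistd.h>  /* _POSIX_PRIORITY_SCHEDULING */

/* Hypothetical tuning constant: skipped events between short sleeps. */
#define SKIPS_PER_PAUSE 100

/* Hypothetical helper: an event is due once its run time has been reached.
 * The scheduler would only sleep when the next event is NOT due. */
static int event_is_due(time_t run_time, time_t now)
{
    return run_time <= now;
}

/* Hypothetical helper, called after each skipped event.
 * Counts the skip, then either yields the CPU (where sched_yield() is
 * supported) or sleeps 10 microseconds on every SKIPS_PER_PAUSE-th skip.
 * Returns 1 if it yielded or slept, 0 if it returned immediately. */
static int relax_after_skip(unsigned long *skipped)
{
    ++*skipped;
#if defined(_POSIX_PRIORITY_SCHEDULING) && _POSIX_PRIORITY_SCHEDULING > 0
    sched_yield();  /* close to a no-op unless another process is waiting */
    return 1;
#else
    if (*skipped % SKIPS_PER_PAUSE != 0)
        return 0;
    struct timespec ts = { 0, 10L * 1000L };  /* 10 microseconds */
    nanosleep(&ts, NULL);
    return 1;
#endif
}
```

The point of both branches is the same: avoid pinning the CPU when many
events are skipped in a row, without paying a full sleep_time delay for
each one.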

> I'll try changing reaping interval to every 2 seconds as per your
> advice, but I guess it will still take 30-40% of the total time. 

Tried this. When reaping every 2 seconds, each pass takes ~0.7 seconds
and no real improvement in check latency can be observed.

/Fredrik


