[PATCH] Re: alternative scheduler

Andreas Ericsson ae at op5.se
Fri Dec 3 11:40:16 CET 2010


Sorry for the long delay. It seems I was half asleep when I scrolled by
this mail earlier.

On 12/01/2010 03:40 PM, Fredrik Thulin wrote:
> On Wed, 2010-12-01 at 15:14 +0100, Andreas Ericsson wrote:
> ...
>>> Host checks were still being scheduled, and every time a host check was
>>> found at the front of event_list_low, Nagios would log "We're not
>>> executing host checks right now, so we'll skip this event." and then
>>> sleep for sleep_time seconds (0.25 was my setting, based on (Ubuntu)
>>> defaults) (!!!).
>>
>>
>> This should only happen if you've set a check_interval for hosts but
>> have disabled them globally, either via nagios.cfg or via an external
>> command. It seems weird that we run usleep() instead of just issuing
>> a sched_yield() or something though, which would be a virtual noop
>> unless other processes are waiting to run.
> 
> Guilty of setting a check_interval for hosts, even on slave servers,
> yes.
> 
> IMNSHO, if that is an unsupported configuration in combination with
> execute_host_checks=0, Nagios should refuse to load the configuration.
> 

It isn't. It just uncovers the issue you experienced.

>>> I made the attached minimalistic patch to not sleep if the next event in
>>> the event list is already due.
>>>
>>
>> Seems sensible, but I think it can be improved, such as issuing either
>> a sched_yield() or, if sched_yield() is not available, running usleep(10)
>> every 100 skipped items or so. That would avoid pinning the cpu but would
>> still be a lot faster than what we have today.
> 
> What is sched_yield? I can't find that function anywhere in the source
> code. Feel free to improve the patch - as I've previously said C isn't
> my game.
> 

sched_yield() causes the kernel to check through its scheduling queue and
see if there are other processes waiting to run. If there are, those other
processes will run. If not, the current process will continue running.
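
A minimal sketch of the "yield every 100 skipped items" idea from above,
not taken from the Nagios source. The helper name yield_after_skips() and
the SKIPS_PER_YIELD constant are made up for illustration, and it assumes
the "every 100 items" throttle applies to sched_yield() as well as to the
usleep(10) fallback:

```c
#include <sched.h>
#include <unistd.h>

/* Hypothetical threshold: how many skipped events to tolerate
 * before giving other processes a chance to run. */
#define SKIPS_PER_YIELD 100

static unsigned int skipped_events = 0;

/* Call once for every event the scheduler skips.
 * Returns 1 if this call yielded (or slept), 0 otherwise. */
int yield_after_skips(void)
{
	if (++skipped_events < SKIPS_PER_YIELD)
		return 0;

	skipped_events = 0;
#ifdef _POSIX_PRIORITY_SCHEDULING
	/* Virtually a no-op unless something else is waiting to run. */
	sched_yield();
#else
	/* Fallback where sched_yield() is not available. */
	usleep(10);
#endif
	return 1;
}
```

The event loop would call this in place of the sleep_time sleep whenever
it skips an event that is already due.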

>>> This removed the total lack of performance in my installation, but
>>> service reaping is still killing me slowly on my virtual development
>>> server.
>>
>> How come?
> 
> I currently reap every 10 seconds, and crude empirical observations made
> by tailing the log file says that reaping takes 3-4 seconds on my
> virtual machine (< 1 second on the production server). This is *with*
> the following things on RAM disk :
> 
>    object_cache_file
>    precached_object_file
>    status_file
>    temp_file
>    temp_path
>    check_result_path
>    state_retention_file
>    debug_file
> 

A RAM disk doesn't mean much on virtual servers, because it's quite
likely that the host OS is still using a swap file to back that content.
In general, performance-testing anything in a virtual server is a bad
idea, since the IO performance is so utterly crap and one can never be
really sure that what appears to be stored in memory isn't stored on
disk by the host OS.

> and with the tiniest C program that appends results to a file as
> ocsp_command.
> 

Use Nagios' own native perfdata writing instead and use a same-partition
"mv" command to move the perfdata file to the reaper spool directory.
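
The reason a same-partition "mv" is cheap is that it boils down to a
rename(2): only the directory entry changes, no file data is copied, and
the reader never sees a half-written file. A rough sketch of the same
operation in C, where move_to_spool() and both path arguments are
hypothetical names chosen for illustration:

```c
#include <stdio.h>

/* Move a finished perfdata file into the spool directory.
 * Returns 0 on success and non-zero on failure. On POSIX systems
 * rename() fails with errno set to EXDEV when the two paths are on
 * different filesystems, which is why "same partition" matters. */
int move_to_spool(const char *perfdata_path, const char *spool_path)
{
	return rename(perfdata_path, spool_path);
}
```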

> I'll try changing reaping interval to every 2 seconds as per your
> advice, but I guess it will still take 30-40% of the total time.
> 

On virtual machines, yes. On your physical server it's less than 10%.
How much less, one can only guess, but it should be very little if
you're using ramdisks.

>> ... Still though, reaping more frequently means the cache
>> would more often be hot and reaping will run a lot faster.
> 
> Which cache would be hotter by reaping more frequently do you mean? The
> files are on RAM disk already.
> 

I didn't know that. In that case, it won't matter more than a minuscule
amount.

>>> The scheduler really needs much more work (like sub-second precision for
>>> when to start checks - that gave me roughly 25% additional performance
>>> in my Erlang based scheduler),
>>
>> That's not possible. With subsecond precision the program has to do
>> more work, not less. You're looking at the wrong bottleneck here and
>> you most certainly botched the implementation the first time around if
>> adding subsecond precision made such a large improvement for you.
> 
> We should have a beer and talk about scheduling sometime, since we're
> both in Stockholm (?).
> 

I'm in Gothenburg. We frequently do developer beer things at our office
here though, so if you happen to come by, we'll crack open a few :)

> My first scheduler ticked once per second and *BAM* started 30+ checks.
> 
> A lot of the times, a significant number of these checks were exactly
> the same check (but different target hosts), so my theory is they all
> requested the very same resources around the same millisecond. When I
> changed the scheduler to start one check every 50 ms instead, I saw that
> I could start around 25% more checks every second. Other theories are
> welcome, but that was my observation.
> 

The problem is the tick-time. I'm guessing you fired the checks and then
did sleep(1) (or whatever the erlang equivalent is), but that means you
lose a couple of milliseconds each second (the time it takes to fire up
the checks), which will inevitably cause you to drift in the scheduler.
All such sleep()-alike calls are implemented in the kernel with a tick
precision that varies from system to system. Tick intervals are typically
in the millisecond range (10 msec with HZ=100 on Linux, for example), so
the tick itself is not the big problem. The drift is: if you start
sleeping at 1.94 seconds and sleep for one second you'll end up at 2.94
instead of, as a scheduler would wish, at 2.0 when checks are actually
scheduled.

>> Try removing check_interval and retry_interval from your hosts instead,
>> and set should_be_scheduled=0 in your retention file before restarting.
>> execute_host_checks is about actually running the checks, whereas you
>> want to skip even scheduling them.
> 
> I'll think about doing that, or just throwing hardware at the problem
> now that my Nagios check servers perform reasonably well.
> 

I'll see about adding something similar to your patch to the scheduler.
It's a good one in spirit, but the implementation left a little to be
desired.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231

Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
