Nagios and Gearman - huge environment performance problem

Sven Nierlein Sven.Nierlein at consol.de
Mon Aug 22 17:51:54 CEST 2011


Hi Paul,

On 22.08.2011 17:02, Paul M. Dubuc wrote:
> We use check_periods with a time period that reflects our regularly scheduled
> downtimes.  As the downtime approaches, Nagios schedules all checks for the
> same time that the downtime ends.


> notifications are inhibited.)  Many of our checks fail in this case by timing
> out and they use relatively scarce (shared) and resource intensive processes
> (web browser sessions run under SeleniumRC).

Both issues can be addressed with Mod-Gearman. Because Mod-Gearman uses workers to process
the checks, you can configure exactly how many checks run at a time. When
there are more jobs than workers, the jobs are queued and delayed (instead of skipped).
Mod-Gearman starts new workers up to a configurable maximum, which smooths out
the number of concurrent checks somewhat.
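
To illustrate the queue-and-delay behaviour, here is a minimal sketch in Python. It is
only an analogy, not Mod-Gearman code: a thread pool stands in for the workers, and the
names run_checks, num_checks and check_duration are made up for this example. Jobs beyond
the worker limit wait in the pool's queue rather than being dropped.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_checks(num_checks, max_workers, check_duration=0.01):
    """Run num_checks fake checks with at most max_workers in flight.

    Returns (results, peak), where peak is the highest number of checks
    observed running at the same time.
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def check(i):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(check_duration)  # stand-in for the real plugin run
        with lock:
            running -= 1
        return i

    # The pool starts threads on demand, up to max_workers; any further
    # jobs are queued until a worker becomes free.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(check, range(num_checks)))
    return results, peak


if __name__ == "__main__":
    results, peak = run_checks(num_checks=20, max_workers=3)
    print(len(results), "checks completed, peak concurrency", peak)
```

All 20 checks complete, but never more than three run at once; the rest are simply
delayed.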

You can even fully serialize checks for specific servicegroups. That's what we do with
our Selenium checks. We have two Mod-Gearman workers on two hosts, each with max-worker=1, so
only one Selenium check runs per host at a time.
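
A worker configuration for such a dedicated Selenium host might look roughly like the
fragment below. Treat it as a sketch under assumptions: the servicegroup name "selenium"
and the server address are placeholders, and the option names should be checked against
the worker.conf shipped with your Mod-Gearman version. The NEB module side also has to
list the same servicegroup so that those checks are put into their own queue.

```ini
# worker.conf on each Selenium host (illustrative values)
server=gearmand.example.org:4730   # assumed address of the gearmand server

# Only take jobs from the dedicated servicegroup queue
hosts=no
services=no
eventhandler=no
servicegroups=selenium             # hypothetical servicegroup name

# Exactly one worker, so checks on this host run one after another
min-worker=1
max-worker=1
```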

  Sven
