Passive checks greatly delaying active checks

Fred f1216 at yahoo.com
Wed Sep 28 16:12:55 CEST 2005


Andreas,

That is a good suggestion. I have previously found that it helps a bit, but
it doesn't solve the problem.  When this problem occurs, all scheduling of
service checks appears to stop.  Even if you use the web interface to schedule
a check immediately, it doesn't execute.  It is almost as if the queue of what
to execute next is either way in the future or is corrupt (this is just
speculation).

Thanks.
-FredC

--- Andreas Ericsson <ae at op5.se> wrote:

> Ludwig Pummer wrote:
> > Hello folks,
> > 
> > I'm experimenting with a distributed monitoring + failover configuration
> > between 2 nagios servers, each actively monitoring its own group of
> > hosts unless the other nagios server fails.
> > 
> > Nagios server #1 is a dual Xeon 2.4GHz (hyperthreading off) w/ 1.5GB RAM
> > running RHES 3. Nagios server #2 is a dual Xeon 3.2GHz (hyperthreading
> > on) w/ 3.0GB RAM running RHES 3 in 64-bit mode.
> > 
> > Both are running Nagios 1.2. They are running identical Nagios
> > configurations with the exception of active/passive services. My nagios
> > init script sends DISABLE_HOST_SVC_CHECKS,
> > DISABLE_HOST_SVC_NOTIFICATIONS, and DISABLE_HOST_NOTIFICATIONS commands
> > at nagios startup for those hosts which that particular nagios server is
> > not supposed to actively monitor. I've got 472 hosts and 1487 services
> > total. Server #1 has 686 active and 801 passive service checks. Server
> > #2 has 805 active and 682 passive service checks. Both machines have an
> > ocsp_command set up which will send_nsca to the other nagios server the
> > results of any active checks.
> > 
> > The issue I'm having is that when I have nsca running to receive passive
> > checks from the other host, active checks are delayed a lot (from under
> > 30 seconds without nsca to 15-25 minutes with nsca running). My
> > command_check_interval is set to -1. I have log_passive_service_checks
> > set to 1 for testing, so I can see the nsca results coming in. I don't
> > see why receiving passive checks is causing such large delays in my
> > active checks.
> > 
> 
> It's because the FIFO becomes a bottleneck if you're doing more than 
> just a few passive service checks. Try lowering the 
> (something)_reaper_frequency in nagios.cfg. It might fix it, or at least 
> help the situation a bit.
> 
> -- 
> Andreas Ericsson                   andreas.ericsson at op5.se
> OP5 AB                             www.op5.se
> Lead Developer
> 
> 
> -------------------------------------------------------
> This SF.Net email is sponsored by:
> Power Architecture Resource Center: Free content, downloads, discussions,
> and more. http://solutions.newsforge.com/ibmarch.tmpl
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when reporting
> any issue. 
> ::: Messages without supporting info will risk being sent to /dev/null
> 
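
For readers following the thread, here is a rough sketch of the two kinds of
lines the quoted setup produces: the external commands an init script would
write at startup, and the passive result that an ocsp_command would hand to
send_nsca. It is only an illustration, not Ludwig's actual script. The function
names and the host names "web01" and "db01" are made up for this example. The
line formats are the standard "[timestamp] COMMAND;host" external command form
and the tab-delimited send_nsca input form.

```python
import time

# Per-host commands named in the quoted message; each takes only a host name.
STANDBY_COMMANDS = (
    "DISABLE_HOST_SVC_CHECKS",
    "DISABLE_HOST_SVC_NOTIFICATIONS",
    "DISABLE_HOST_NOTIFICATIONS",
)


def standby_command_lines(hosts, now=None):
    """Build external-command lines of the form "[timestamp] COMMAND;host".

    Hypothetical helper: one line per (host, command) pair for the hosts
    this server should not actively monitor.
    """
    stamp = int(time.time() if now is None else now)
    return [f"[{stamp}] {cmd};{host}" for host in hosts for cmd in STANDBY_COMMANDS]


def nsca_result_line(host, service, return_code, output):
    """Build one passive service result as send_nsca reads it on stdin.

    Fields are tab-delimited and the line ends with a newline.
    """
    return f"{host}\t{service}\t{return_code}\t{output}\n"


if __name__ == "__main__":
    # "web01" and "db01" are placeholder host names for illustration.
    for line in standby_command_lines(["web01", "db01"]):
        print(line)
    print(nsca_result_line("web01", "PING", 0, "PING OK"), end="")
```

An init script would append the first kind of line to the command file (the
named pipe set by command_file in nagios.cfg). That is the same FIFO the nsca
daemon writes passive results into, which is why it can become the bottleneck
Andreas describes.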






