Problem with OCP_daemon in distributed environment

Craig Stewart Craig.Stewart at corp.xplornet.com
Tue Aug 16 21:47:13 CEST 2011


Michel,

I just did the same thing for my setup and I didn't see this issue.
That being said, I don't *want* the central master to execute service
checks at all unless a result has gone stale.

What may be happening is that the remote passive check result is getting
inserted while the central is waiting to execute its next active check.
This is probably resetting the clock, as it were, and the countdown
starts over.

For example:

- NOW is an arbitrary point in time.
- Nagios schedules the check to be executed at NOW + 5 min. (recheck
interval)
- The passive check comes in at NOW + 3 min.  Nagios resets the clock to
NOW + 3 min + check interval.

If the remote is submitting results at an interval shorter than the
central's recheck interval, I can see this happening.  The clock never
runs out unless the remote system fails to submit a result.
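
To make the timeline above concrete, here is a rough sketch of the
timer-reset idea in Python.  It only models my guess (every passive
result pushes the next active check back by one check interval); it is
not taken from the Nagios scheduler, and the function name and
parameters are made up for illustration.

```python
import math


def simulate(check_interval, passive_interval, duration):
    """Count active checks under the assumed 'passive result resets the timer' model.

    All times are in minutes. Use math.inf as passive_interval to model
    a probe that has been shut down and submits nothing.
    """
    next_active = check_interval      # first active check due at NOW + interval
    next_passive = passive_interval   # first passive result from the remote
    active_runs = 0

    while min(next_active, next_passive) <= duration:
        if next_passive <= next_active:
            # Passive result arrives first: ASSUMED to push the active check back.
            next_active = next_passive + check_interval
            next_passive += passive_interval
        else:
            # Timer ran out with no passive result: the central runs the check.
            active_runs += 1
            next_active += check_interval

    return active_runs


if __name__ == "__main__":
    print(simulate(5, 3, 60))         # remote submits every 3 min, central interval 5 min
    print(simulate(5, math.inf, 60))  # probe shut down
```

With a 5 minute interval on the central and a result arriving every
3 minutes, this model never lets the central run an active check in the
hour (it returns 0); with the probe shut down it runs 12.  If your real
system behaves differently, the reset assumption is wrong.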

A couple of things to check are the check intervals on both the central
and the probe, and, if you can tolerate the hit, shut down the probe and
see if the central server starts executing checks on its own.

I may be out in left field as well.

Cheers!

Craig
--
Craig Stewart
Systems Integration Analyst
Craig.Stewart at corp.xplornet.com
Xplornet - Broadband, Everywhere

On 08/16/2011 04:22 PM, michel.vdv at wxs.nl wrote:
> Dear readers,
>  
> I have a strange problem related to the use of OCP_daemon.
> I've implemented this today on a "remote" nagios machine responsible for
> monitoring our LAN hosts.
> Until now all messages and performance data were sent from that machine
> to our Central Nagios machine via obsess_over_hosts and
> obsess_over_services.
> But because the large number of services on the remote host, combined
> with relatively short check_interval periods, caused high service and
> host check latencies, I started looking for an alternative and read
> about OCP_daemon.
> I followed the install instructions and sending data via OCP_daemon
> works fine and very fast, also the remote nagios machine's latencies
> stay low.
> However, the central server keeps processing all passive service and
> host check results (also from other send_nsca based hosts) but no longer
> executes its own ACTIVE checks.
> As soon as I stop nagios on the remote monitor and restart nagios on the
> central server it starts executing ACTIVE checks again.
> The load on both servers has remained about the same since switching to
> OCP_daemon, and the only thing I noticed is that the number of
> buffers/slots used for the external command file (nagios.cmd) on the
> central server reaches higher values than before, but no more than
> 30 - 40% of the available 4096 slots.
>  
> Please advise.
>  
> Michel
