Distributed monitoring: central collector doesn't seem to be able to run active checks

C. Bensend benny at bennyvision.com
Wed Aug 28 13:48:09 CEST 2013


>    I'm continuing to iron out the wrinkles with Nagios 3.5.1 and distributed
> monitoring.  I'm using mod_gearman to submit and receive events from
> two distributed pollers.
>
>    Every now and again, I'll get something like this in the log on the
> centralized collecting machine:
>
> CRITICAL: Return code of 127 is out of bounds. Make sure the plugin
> youre trying to run actually exists. (worker: collector.domain.org)
>
>    To me, that suggests that the collector system didn't get a result
> for a host or service in a timely manner from one of the polling
> systems, and so it attempted to run an active check itself.  However,
> it doesn't seem to be able to, and I don't know why.
>
>    The collector has the same value for $USER1$, and it has the same
> set of plugins installed on it:
>
> On the collector:
>
> grep USER1 etc/resource.cfg
> $USER1$=/usr/local/nagios/libexec
>
> On the two pollers:
>
> $USER1$=/usr/local/nagios/libexec
> $USER1$=/usr/local/nagios/libexec
>
>    The plugins are installed in identical locations on all three systems,
> that's enforced via Puppet.  The 'nagios' user can find and run them on
> the collector:
>
> /usr/local/nagios/libexec/check_nrpe -H 127.0.0.1
> NRPE v2.13
>
>    Now, because this is a distributed setup, the collector system is
> not configured to run active checks:
>
> grep ^execute etc/nagios.cfg
> execute_service_checks=0
> execute_host_checks=0
>
>    ... but *obviously* it's trying to.  Is it failing because it's
> configured to not run them?  If that's the case, the error message is
> not accurate and should be corrected.  If that's *not* the case, why
> can't my collector server run an active check when it believes it needs
> to?
>
>    I use NConf to generate my configurations, if that matters.  There are
> a *lot* of hosts/services and quite a few configuration files, so I'm not
> going to paste a slew of information here.  If I'm missing pertinent
> information, please let me know exactly what you want to see and I'll
> get it.

No one has an idea about this?  And no, Andreas, I can't move to
4.0 yet.  ;)
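
In case it helps anyone reproduce this: 127 is the exit status a POSIX
shell returns when it cannot find the command it was asked to run (126 means
the file was found but is not executable). A rough sketch for checking every
plugin, rather than just check_nrpe, follows. It is meant to be run as the
'nagios' user on each machine. The function name plugin_status and the
/nonexistent path are made up for illustration, and the libexec directory is
only assumed from the $USER1$ value quoted above.

```shell
#!/bin/sh
# Hypothetical helper: report whether a plugin path exists and is
# executable for the user running this script.
plugin_status() {
    if [ ! -e "$1" ]; then
        echo "missing $1"          # a shell would exit 127 running this
    elif [ ! -x "$1" ]; then
        echo "not-executable $1"   # a shell would exit 126 running this
    else
        echo "ok $1"
    fi
}

# Assumed plugin directory, taken from $USER1$ in resource.cfg.
LIBEXEC=/usr/local/nagios/libexec

# If the glob matches nothing, the literal pattern is reported as missing.
for p in "$LIBEXEC"/check_*; do
    plugin_status "$p"
done

# Demonstration: a command path that does not exist yields status 127.
sh -c '/nonexistent/check_dummy' 2>/dev/null || echo "exit status: $?"
```

Comparing the output across the collector and both pollers should show
whether any plugin referenced by a check command is missing or unreadable
for that user on only one of them. It does not explain why the collector is
running checks at all.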

Thanks!

Benny


-- 
"No matter how tempted I am with the prospect of unlimited power, I
will not consume any energy field bigger than my head."
                                  -- #22 on Peter Anspach's Evil
                                     Overlord list

