Distributed monitoring Freshness checking failing then recovering

Live Great livegreat007 at yahoo.com
Wed Oct 17 01:42:48 CEST 2007


Hi Jonathan,

Why not use check_by_ssh instead? 
Is there any pitfall (weakness) in using check_by_ssh compared to an agent like OCP?

Thanks
Sam

----- Original Message ----
From: Jonathan Call <jcall at verio.net>
To: Sean McAvoy <smcavoy at ca.afilias.info>; nagios-users at lists.sourceforge.net
Sent: Wednesday, October 17, 2007 7:19:46 AM
Subject: Re: [Nagios-users] Distributed monitoring Freshness checking failing then recovering

Sean;

I have a very large deployment so I use this tool:

http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

This daemon runs on each of the distributed servers while a normal nsca
daemon listens on the central server.
 
Jonathan
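
For the bulk-transfer idea raised in the quoted message below, here is a
minimal Python sketch. It is not taken from OCP_Daemon or any other tool
named in this thread. It assumes send_nsca's default input format (host,
service description, return code and plugin output separated by tabs, one
result per line) and that several lines can be piped to a single send_nsca
run. The names ServiceResult, format_batch and send_batch, the host
central.example.org and the config path are hypothetical.

```python
import subprocess
from typing import Iterable, NamedTuple


class ServiceResult(NamedTuple):
    host: str
    service: str
    return_code: int  # 0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN
    output: str


def format_batch(results: Iterable[ServiceResult]) -> str:
    """Render results as tab-delimited lines, one per result."""
    lines = []
    for r in results:
        # Tabs or newlines inside the plugin output would break the record,
        # so flatten them to spaces.
        text = r.output.replace("\t", " ").replace("\n", " ")
        lines.append(f"{r.host}\t{r.service}\t{r.return_code}\t{text}\n")
    return "".join(lines)


def send_batch(
    results: Iterable[ServiceResult],
    central_host: str,
    config_path: str = "/usr/local/nagios/etc/send_nsca.cfg",  # assumed path
    binary: str = "send_nsca",
) -> int:
    """Pipe the whole batch to one send_nsca process; return its exit code."""
    payload = format_batch(results)
    if not payload:
        return 0  # nothing to send, so do not start send_nsca at all
    proc = subprocess.run(
        [binary, "-H", central_host, "-c", config_path],
        input=payload,
        text=True,
        check=False,
    )
    return proc.returncode
```

Run from cron or a small loop, send_batch(results, "central.example.org")
would push a whole spool of results in one send_nsca run instead of
starting send_nsca once per result.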

> -----Original Message-----
> From: nagios-users-bounces at lists.sourceforge.net [mailto:nagios-users-
> bounces at lists.sourceforge.net] On Behalf Of Sean McAvoy
> Sent: Monday, October 15, 2007 12:09 PM
> To: nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] Distributed monitoring Freshness
> checking failing then recovering
> 
> On further investigation, it looks as though the problem is the time
> taken to submit the results back to nagios via send_nsca.
> I have read about a couple of different options for getting results back
> quickly. One is a bulk transfer: a file containing the results is sent
> by a send_nsca run executed from cron. The other uses the performance
> data output option on the remote nagios systems and submits the results
> through a custom daemon on both ends.
> Does anybody know of any other options? Also, are there any guides to
> setting up either of these options? Most of what I have read is email
> threads.
> Thanks.
> 
> On 12-Oct-07, at 12:40 PM, Sean McAvoy wrote:
> 
> > Hello,
> > I have 1 central nagios system with 5 distributed servers. I have
> > enabled freshness checking on both central and remote systems. I am
> > constantly seeing services go to unknown status for 1-3 minutes and
> > then recover.
> > On the remotes I have:
> > check_service_freshness=1
> > service_freshness_check_interval=10
> > check_host_freshness=1
> > host_freshness_check_interval=60
> > service_inter_check_delay_method=s
> > max_service_check_spread=10
> > service_interleave_factor=1
> > host_inter_check_delay_method=s
> > max_host_check_spread=30
> > max_concurrent_checks=0
> >
> > It does appear as though checks are being run in parallel. I'm
> > wondering how I can best determine where the problem is: with the
> > execution of checks, submittal to the central system, or elsewhere.
> > Thanks.
> >
> >
> > _sean
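
One way to approach the question above (execution versus submittal) is to
compare three timestamps per check result. The Python sketch below is only
a simplified model under stated assumptions: it is not how Nagios itself
evaluates freshness, and the thread does not say how to collect these
timestamps. CheckTiming, delay_breakdown, slowest_stage and arrives_stale
are hypothetical names, and all times are in seconds.

```python
from typing import Dict, NamedTuple


class CheckTiming(NamedTuple):
    scheduled: float  # when the remote server planned to run the check
    finished: float   # when the plugin returned on the remote server
    received: float   # when the central server accepted the passive result


def delay_breakdown(t: CheckTiming) -> Dict[str, float]:
    """Split the total delay into an execution part and a submission part."""
    return {
        "execution": t.finished - t.scheduled,
        "submission": t.received - t.finished,
    }


def slowest_stage(t: CheckTiming) -> str:
    """Name the stage that contributed the most delay."""
    parts = delay_breakdown(t)
    return max(parts, key=parts.get)


def arrives_stale(t: CheckTiming, previous_received: float,
                  freshness_threshold: float) -> bool:
    """True if the gap since the previous result exceeds the threshold."""
    return (t.received - previous_received) > freshness_threshold
```

If most results show "submission" as the slowest stage and the gap between
consecutive results is larger than the freshness threshold on the central
server, that would point at the transfer step rather than at check
execution.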
> >
> >
> > -------------------------------------------------------------------------
> > This SF.net email is sponsored by: Splunk Inc.
> > Still grepping through log files to find problems?  Stop.
> > Now Search log events and configuration files using AJAX and a
> > browser.
> > Download your FREE copy of Splunk now >> http://get.splunk.com/
> > _______________________________________________
> > Nagios-users mailing list
> > Nagios-users at lists.sourceforge.net
> > https://lists.sourceforge.net/lists/listinfo/nagios-users
> > ::: Please include Nagios version, plugin version (-v) and OS when
> > reporting any issue.
> > ::: Messages without supporting info will risk being sent to /dev/null
> 
> Sean McAvoy
> NOC Acting Team Lead
> Afilias Canada
> 
> P. 416.673.4194
> 
