Distributed monitoring Freshness checking failing then recovering

Sean McAvoy smcavoy at ca.afilias.info
Mon Oct 15 20:09:12 CEST 2007


On further investigation, it looks as though the problem is the time
taken to submit the results back to Nagios via send_nsca.
I have read about a couple of different options for getting results back
quickly. One is a bulk transfer: a file containing the results is sent
with a single send_nsca invocation run from cron. The other makes use of
the performance data output option on the remote Nagios systems and
submits the results using a custom daemon on both ends.
Does anybody know of any other options? Also, are there any guides to
setting up either of these options? Most of what I have read is email
threads.
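
For reference, the bulk option could be sketched roughly as follows. This
is only a minimal sketch, not a tested setup: it assumes send_nsca reads
tab-delimited "host, service, return code, output" lines on stdin (its
default input format), and the binary path, config path, and function
names are placeholders I made up for illustration.

```python
import subprocess


def format_result(host, service, return_code, output):
    """Build one send_nsca input line: host<TAB>service<TAB>code<TAB>output."""
    # Tabs or newlines inside the plugin output would corrupt the record,
    # so they are flattened to spaces.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{return_code}\t{clean}\n"


def build_payload(results):
    """Join many (host, service, return_code, output) tuples into one batch."""
    return "".join(format_result(*r) for r in results)


def send_bulk(results, central_host,
              send_nsca="/usr/sbin/send_nsca",      # assumed install path
              config="/etc/nagios/send_nsca.cfg"):  # assumed config path
    """Submit all queued results with a single send_nsca invocation."""
    payload = build_payload(results)
    if not payload:
        return 0  # nothing queued, nothing to send
    proc = subprocess.run(
        [send_nsca, "-H", central_host, "-c", config],
        input=payload, text=True, capture_output=True,
    )
    return proc.returncode
```

The idea is that one connection carries the whole batch instead of one
send_nsca process per result. A cron job would collect the queued results
and call send_bulk(results, "central.example.org"), where the host name is
again a placeholder.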
Thanks.

On 12-Oct-07, at 12:40 PM, Sean McAvoy wrote:

> Hello,
> I have 1 central nagios system with 5 distributed servers. I have
> enabled freshness checking on both central and remote systems. I am
> constantly seeing services go to unknown status for 1-3 minutes and
> then recover.
> On the remotes I have:
> check_service_freshness=1
> service_freshness_check_interval=10
> check_host_freshness=1
> host_freshness_check_interval=60
> service_inter_check_delay_method=s
> max_service_check_spread=10
> service_interleave_factor=1
> host_inter_check_delay_method=s
> max_host_check_spread=30
> max_concurrent_checks=0
>
> It does appear as though checks are being run in parallel. I'm
> wondering how I can best determine where the problem is: the execution
> of checks, submission to the central system, or something else.
> Thanks.
>
>
> _sean
>
> ---------------------------------------------------------------------- 
> ---
> This SF.net email is sponsored by: Splunk Inc.
> Still grepping through log files to find problems?  Stop.
> Now Search log events and configuration files using AJAX and a  
> browser.
> Download your FREE copy of Splunk now >> http://get.splunk.com/
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when  
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
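
On the question in the quoted message about where the delay arises, one
rough way to narrow it down is to measure how far apart passive results
actually arrive on the central server. The sketch below assumes
log_passive_checks=1 is set there and that the log lines look like
"[epoch] PASSIVE SERVICE CHECK: host;service;code;output"; the function
name is my own, and the log format should be checked against a real
nagios.log before relying on it.

```python
import re
from collections import defaultdict

# Assumed log line shape: [epoch] PASSIVE SERVICE CHECK: host;service;code;output
PASSIVE_RE = re.compile(r"^\[(\d+)\] PASSIVE SERVICE CHECK: ([^;]+);([^;]+);(\d+);")


def max_gaps(log_lines):
    """Return the largest gap in seconds between consecutive passive
    results, keyed by (host, service)."""
    last_seen = {}
    worst = defaultdict(int)
    for line in log_lines:
        m = PASSIVE_RE.match(line)
        if not m:
            continue  # not a passive service check entry
        ts = int(m.group(1))
        key = (m.group(2), m.group(3))
        if key in last_seen:
            worst[key] = max(worst[key], ts - last_seen[key])
        last_seen[key] = ts
    return dict(worst)
```

If the largest gap for a service is well above the interval at which the
remote runs that check, the time is probably being lost in submission
rather than in check execution. Feeding it the central log would look like
max_gaps(open("/var/log/nagios/nagios.log")), with the path being an
assumption.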

Sean McAvoy
NOC Acting Team Lead
Afilias Canada

P. 416.673.4194



