Problems with distributed monitoring

Sérgio Afonso sergioafonsojr at gmail.com
Fri May 14 20:20:29 CEST 2010


Hello Marcel,

My Nagios version is 3.2.0. I didn't quite understand what you meant
about command_check_interval. In my nagios.cfg, command_check_interval
is already set to -1.

Rgs,

Sérgio.

On Fri, May 14, 2010 at 1:50 PM, Marcel <mitsuto at gmail.com> wrote:
> With only 150 services, it should not be delayed that much, nor should it
> stop execution of the main process.
> Please check your main nagios.cfg file and look for command_check_interval;
> if the value assigned to that variable isn't "-1", then there is your
> problem.
>
> Also, which nagios version are you running?
>
>
> On Fri, May 14, 2010 at 2:28 PM, Trisha Hoang <trisha at rockyou.com> wrote:
>>
>> Hi Sergio,
>> Some of the directives I found helpful for our MASTER server are listed
>> below.
>>
>> Since status.dat and nagios.cmd are disk bound, putting them on a ramdisk
>> will be faster.
>> status_file=/mnt/ramdisk/status.dat
>> command_file=/mnt/ramdisk/nagios.cmd
>>
>> I don't think use_aggressive_host_checking is needed, as Nagios checks the
>> host when a service is in error anyway.
>> use_aggressive_host_checking=0
>> check_host_freshness=0
>>
>> Service freshness is important because the MASTER tends to process passive
>> checks much more slowly, so the services may go stale. However, since our
>> checks run at a 5 min interval, having the MASTER wait for the next round
>> of checks is fine.
>> check_service_freshness=1
>> service_freshness_check_interval=420
>>
>> We use nagios-3.2.1 and I think these directives are still experimental,
>> but they seem to help. You will see defunct nagios processes that come and
>> go. My theory is that this happens because child processes are forked once
>> instead of twice, so one gets killed, but again, it seems to be running OK.
>> use_large_installation_tweaks=0
>> child_processes_fork_twice=0
>>
>> Our MASTER receives ~7000 passive checks from the SLAVE, but it can only
>> process a maximum of ~5000 passive checks per 5 min, with latency under
>> 10 secs. The MASTER actively checks the rest. If you or someone else knows
>> a way to improve passive check processing, that would be great.
>>
>> Also, in our setup, we don't use NSCA. The slaves have
>> ocsp_command=send_service_check, where this command appends each check
>> result to a file that gets sent to the master every 5 sec. On the master,
>> a script opens this file and writes its lines directly into the
>> nagios.cmd pipe every 5 sec.
>>
>> Trisha
>>
>>
>>
>> ------------------------------------------------------------------------------
>>
>>
>> _______________________________________________
>> Nagios-users mailing list
>> Nagios-users at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>> ::: Please include Nagios version, plugin version (-v) and OS when
>> reporting any issue.
>> ::: Messages without supporting info will risk being sent to /dev/null
>
>
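
For reference, the spool-and-forward setup Trisha describes (results
collected in a file, then written into nagios.cmd on the master) could be
sketched roughly as below. This is only an illustration, not her actual
script, which was not posted. It assumes the spooled lines use the standard
PROCESS_SERVICE_CHECK_RESULT external command format; the function names,
the ".sending" suffix and the host/service names in the usage note are
made up for the example.

```python
import os
import time


def format_passive_result(host, service, return_code, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    timestamp = int(time.time() if now is None else now)
    # An external command must fit on a single line, so flatten newlines.
    clean = str(output).replace("\n", " ").strip()
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, clean)


def forward_spool(spool_path, command_file):
    """Write every non-empty spooled line to command_file; return the count.

    Hypothetical helper: spool_path is the file received from the slave,
    command_file is the nagios.cmd pipe on the master.
    """
    if not os.path.exists(spool_path):
        return 0
    # Rename first so results arriving during the forward are not lost.
    batch = spool_path + ".sending"
    os.rename(spool_path, batch)
    with open(batch) as src:
        lines = [line for line in src if line.strip()]
    # Opening the pipe for writing blocks until Nagios has it open for reading.
    with open(command_file, "w") as pipe:
        pipe.writelines(lines)
    os.remove(batch)
    return len(lines)
```

On the slave side, something like format_passive_result("web01", "HTTP",
0, "OK") would produce one line for the spool file. On the master,
forward_spool(spool, "/mnt/ramdisk/nagios.cmd") could then be called from
a loop or cron-like job every 5 sec.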
