NSCA problem

Giulio Botto madecto at sangria.org.il
Fri Feb 15 18:55:28 CET 2008


Marc Powell wrote:
> 
>> -----Original Message-----
>> From: nagios-users-bounces at lists.sourceforge.net [mailto:nagios-users-
>> bounces at lists.sourceforge.net] On Behalf Of Giulio Botto
>> Sent: Friday, February 15, 2008 9:54 AM
>> To: nagios-users at lists.sourceforge.net
>> Subject: Re: [Nagios-users] NSCA problem
>>
>> Marc Powell wrote:
> 
>>> This is all good. You don't have a lot of services at all. How
>>> frequently are you sending results? On my own systems I'm easily
>>> processing at least 13 results/sec. I know that there are others
>>> doing more but there is some point at which nagios can't keep up.
>> The problem we have seems to lie in the number of nsca daemons on
>> the master machine.
> 
> The number would increase if the NSCA daemons were unable to write to
> the external command pipe. That could be because the results are coming
> in faster than nagios is processing them. The command pipe is only going
> to hold about 4K of data then block until it's cleared.
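For anyone who wants to check how much a pipe really holds on their own
box before a writer gets stuck, here is a rough sketch in Python. It is
only an illustration: it uses an anonymous pipe rather than the Nagios
command file, the function name pipe_capacity is made up, and the figure
it returns depends on the OS and kernel, so it may well differ from the
4K mentioned above.

```python
import os


def pipe_capacity(chunk_size=1024):
    """Fill a pipe that nobody reads and return the bytes it accepted.

    Assumption: an anonymous pipe stands in for the command FIFO here.
    The write end is non-blocking, so a full pipe raises
    BlockingIOError instead of hanging the way a blocking writer would.
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)
    chunk = b"x" * chunk_size
    total = 0
    try:
        while True:
            try:
                total += os.write(write_fd, chunk)
            except BlockingIOError:
                # The pipe is full: a blocking writer would stall here
                # until a reader drained some data.
                break
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return total


if __name__ == "__main__":
    print("pipe accepted", pipe_capacity(), "bytes before filling up")
```

A writer using ordinary blocking writes would simply wait at the point
where this sketch breaks out of the loop, which matches the piling-up of
nsca processes described above.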

If I read the docs correctly, external_command_buffer_slots=4096 holds
4096 commands in the queue before it starts blocking. That's the value
we have at the moment.
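To make "one command" concrete: as far as I understand it, each passive
result ends up as a single line in the external command file, using the
PROCESS_SERVICE_CHECK_RESULT command. A minimal sketch of building such
a line follows; the function name and the host/service values are made
up for illustration, and this is not taken from the nsca source.

```python
import time


def format_passive_result(host, service, return_code, output, timestamp=None):
    """Build one external command line for a passive service check.

    Assumption: the standard Nagios external command layout,
    [epoch] PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "[{}] PROCESS_SERVICE_CHECK_RESULT;{};{};{};{}\n".format(
        timestamp, host, service, return_code, output
    )


if __name__ == "__main__":
    # Hypothetical host and service names, for illustration only.
    print(format_passive_result("web01", "HTTP", 0, "OK - page loaded"), end="")
```

Each such line would count as one entry against the buffer slots, if my
reading of the docs is right.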

>> Also I do not understand why running nsca with the --single option
>> only processes the first message it receives and discards the rest.
> 
> Nor do I. I run with -s and haven't experienced that problem. Running
> nsca-2.1 here. Have you tried putting nsca into debug mode and
> monitoring that? Running strace on the process would be informative as
> well.

Will do over the weekend: it's something we still haven't had time to
do.

> I did have a similar problem years ago on a machine with a failing disk.
> The failures seemed harmless but ended up causing regular backlogs of
> NSCA processes as you indicate. Fixing the disk problem resolved the
> issue. That's why I asked.

I see how this could cause problems, especially since the server is
also a mail content filter, but the disks are hardware RAID5 on a Dell
PERC controller monitored by Nagios, and they appear fine.

Thanks,
-- 
Giulio Botto -- madecto at sangria.org.il
PGP fingerprint =  1979 A78A 8F82 DB5E 55E9  D6D6 6AB6 0BA9 FDB7 6789
