Performance tuning a host returning results via NSCA

Oliver Hookins oliver.hookins at anchor.com.au
Fri Mar 7 05:56:51 CET 2008


On Thu Mar 06, 2008 at 22:39:41 -0600, Marc Powell wrote:
>
>On Mar 6, 2008, at 8:17 PM, Oliver Hookins wrote:
>
>> Hi all,
>>
>> I guess performance is a constant problem for everyone but what I'm seeing
>> doesn't seem to make sense. I have two servers running Nagios, one that is
>> more or less just a frontend and another doing the checks and returning
>> results via send_nsca. Constantly I see the frontend light up with criticals
>> due to the passive results not being received in time (the service
>> freshness timeout is 120 seconds).
>
>How many and are they always the same? What version of nagios?

The host doing the checks runs Nagios 2.10 and the frontend runs 2.6. Anything
from a single service to dozens can go critical at once. When the freshness
timer expires, a dummy active check runs which always returns critical, with
output saying that the freshness timer expired.

The actual services that return critical in these cases are always
different.

>>
>>
>> I only have 120 passive service checks and 45 passive host checks, so I can
>> assume if none of the hosts are down it is only doing one check per second.
>
>Nagios doesn't distribute them that evenly but it tries to. I assume  
>that you haven't done anything to prevent the parallelization of  
>service checks. Is your normal_check_interval exactly 120 seconds? If  
>so, you'll have some checks that happen at that time and their results  
>would be received by the central host after your freshness timeout.  
>Also, do you have a lot of host volatility?

normal_check_interval is 60 seconds for all service checks. Host volatility
is pretty low, if any. In fact most of the hosts this system monitors are
very stable.
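
As a rough sanity check on those numbers, here is a small Python sketch of the
arithmetic. It assumes all 120 services really are checked every 60 seconds,
and it assumes (unconfirmed, this is exactly what I am asking about) that
results are submitted one at a time. The function names are made up for
illustration only.

```python
def per_result_budget(num_services: int, check_interval_s: float) -> float:
    """Seconds available per result if results are submitted one at a time."""
    results_per_second = num_services / check_interval_s
    return 1.0 / results_per_second


def falls_behind(num_services: int, check_interval_s: float,
                 send_time_s: float) -> bool:
    """True if one full round of serial submissions takes longer than the
    check interval, i.e. results pile up faster than they are sent."""
    round_time_s = num_services * send_time_s
    return round_time_s > check_interval_s


if __name__ == "__main__":
    services = 120        # passive service checks (from the post)
    interval_s = 60.0     # normal_check_interval (from the post)
    freshness_s = 120.0   # service freshness timeout (from the post)

    print("results per second:", services / interval_s)
    print("budget per send (s):", per_result_budget(services, interval_s))
    print("slack before stale (s):", freshness_s - interval_s)
    # Hypothetical send times over a slow WAN link, for comparison only
    for send_time_s in (0.2, 0.6, 1.0):
        print(send_time_s, "s per send -> falls behind:",
              falls_behind(services, interval_s, send_time_s))
```

If the serial assumption holds, anything over about half a second per
send_nsca call would be enough for submissions to fall behind the checks.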

>I'd also check communication between the remote and central. Try  
>sending passive results manually from the command line to make sure  
>they complete in a timely manner (should be fractions of a second).

It's a WAN link with fairly high latency and low bandwidth, but according to
iptraf the amount of data being transferred is low anyway. I guess this is
what I was driving at in my original post - does Nagios only ever call
send_nsca serially? If the service checks are done in parallel I would have
thought send_nsca would be called in parallel as well.
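
To time a single manual submission as Marc suggests, something like the
following Python sketch could be used. The tab-separated input line and the
-H and -c options are normal send_nsca usage; the frontend hostname, config
path, and the host and service names are placeholders, not values from my
setup.

```python
import shutil
import subprocess
import time


def format_result(host: str, service: str, code: int, output: str) -> str:
    """Build one passive service result in the tab-separated form that
    send_nsca reads on stdin."""
    return f"{host}\t{service}\t{code}\t{output}\n"


def timed_send(central: str, config: str, line: str,
               binary: str = "send_nsca"):
    """Run one send_nsca invocation and return (exit code, elapsed seconds)."""
    start = time.perf_counter()
    proc = subprocess.run([binary, "-H", central, "-c", config],
                          input=line, text=True, capture_output=True)
    return proc.returncode, time.perf_counter() - start


if __name__ == "__main__":
    if shutil.which("send_nsca") is None:
        print("send_nsca not found in PATH, nothing to time")
    else:
        # Placeholder names: replace with a real host/service pair that the
        # frontend accepts passive results for.
        line = format_result("testhost", "Test Service", 0, "OK - manual test")
        code, elapsed = timed_send("nagios-frontend.example.com",
                                   "/etc/nagios/send_nsca.cfg", line)
        print(f"exit code {code}, took {elapsed:.3f} s")
```

Running it a few times in a row over the WAN link should show whether a
single submission stays in the fractions-of-a-second range.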

-- 
Regards,
Oliver Hookins
Anchor Systems

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users