NSCA and Latency

Jonathan Call jcall at verio.net
Thu Oct 23 17:08:27 CEST 2008


NSCA just doesn't scale well within Nagios. 

 

You will need to try something like the OCP Daemon mentioned here:
http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon

 

I believe Andreas Ericsson has also written a broker module for NSCA. It
is apparently still in its testing/alpha stages, so you would have to
contact him directly.

 

Jonathan

 

 

________________________________

From: Maxwell,Brady [mailto:maxwellb at oclc.org] 
Sent: Thursday, October 23, 2008 8:42 AM
To: nagios-users at lists.sourceforge.net
Subject: [Nagios-users] NSCA and Latency

 

My Environment:

3 x Dell 2950, each with dual dual-core CPUs and 8 GB of RAM

One system runs checks against our Linux servers

One runs checks against our Windows servers

We are running SLES10 update 3

Both systems use nsca to send their check results to a third server that
displays the service checks for our operators.

All three systems are on the same VLAN but separate Cisco switches.

I am running nsca in daemon mode on the central server with this command:

/usr/local/nagios/bin/nsca -c /usr/local/nagios/etc/nsca.cfg -daemon

Nsca.cfg is as follows:

pid_file=/var/run/nsca.pid
server_port=5667
#server_address=192.168.1.1
nsca_user=nagios
nsca_group=nagios
#nsca_chroot=/var/run/nagios/rw
debug=1
command_file=/usr/local/nagios/var/rw/nagios.cmd
alternate_dump_file=/usr/local/nagios/var/rw/nsca.dump
aggregate_writes=1
append_to_file=1
max_packet_age=300
password=xxxxxxxxxx
decryption_method=14

 

I just set the aggregate and append options to try to fix the problem.
They were not set before, but either way the results are the same.

OK, so on the 2 servers doing the checks, everything runs fine, even
with OCSP running my send_service_check_results script. My script is
pretty much straight out of the book.

#!/bin/sh
# Arguments:
# $1 = Hostname of the host (using the $HOSTNAME$ macro)
# $2 = Service description of the service (using the $SERVICEDESC$ macro)
# $3 = Service status id of the service (using the $SERVICESTATUSID$ macro)
# $4 = Output of the Service Check (using the $SERVICEOUTPUT$ macro)
/bin/echo "$1","$2","$3","N3 - $4" | /usr/local/nagios/libexec/send_nsca -H 10.10.129.37 -c /usr/local/nagios/etc/send_nsca.cfg -d ","

Like I said, everything is fine on the 2 servers even with OCSP on.
Between the 2 servers we are running about 10k service checks, and latency
is very low, just a few seconds. However, if I turn on the NSCA daemon on
the central server, my latency creeps up to about 1500+ seconds within
an hour and just gets worse from there on both remotes. The checks that
should run every 5 minutes on the 2 remote servers end up running every
few hours or less. The central server is doing 0 active checks.

I set debug mode, but that provided very little insight into the
problem.

CPU and memory stats are both very low on all three servers. The same
can be said for the network: utilization is less than 2% and
there are no errors on the interfaces. Overall hardware utilization is
10% or less on these three systems.

So my question is: has anyone had this kind of problem with NSCA? What am
I missing? Should I be batching my service checks on the remote servers?
Should I be using xinetd for NSCA instead of daemon mode?
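
On the batching question, one possible shape is sketched below. It is not part of the original setup, only an illustration: the OCSP command would append each result to a local spool file and return immediately, and a separate periodic job would push everything queued so far through a single send_nsca call. The spool path, the function names queue_result and flush_results, and the SEND_CMD variable are all hypothetical. The sketch relies on send_nsca's default tab delimiter and on send_nsca accepting several result lines on stdin.

```shell
#!/bin/sh
# Hypothetical batching sketch. SPOOL, SEND_CMD and both function names are
# assumptions for illustration, not part of the original configuration.
SPOOL="${SPOOL:-/tmp/nsca_spool.txt}"
SEND_CMD="${SEND_CMD:-/usr/local/nagios/libexec/send_nsca -H 10.10.129.37 -c /usr/local/nagios/etc/send_nsca.cfg}"

# Called from the OCSP command: append one tab-separated result line
# (host, service, status id, output) and return without touching the network.
queue_result() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "N3 - $4" >> "$SPOOL"
}

# Called periodically (for example from cron): move the spool aside and feed
# all queued lines to one send_nsca process.
flush_results() {
    [ -s "$SPOOL" ] || return 0          # nothing queued, nothing to send
    batch="$SPOOL.$$"
    mv "$SPOOL" "$batch" || return 1     # new results start a fresh spool
    # SEND_CMD is left unquoted on purpose so its arguments are split.
    # On failure the batch file is kept for manual inspection, not retried.
    $SEND_CMD < "$batch" && rm -f "$batch"
}
```

With something like this, the OCSP command would call queue_result with the same four macros as the script above, and flush_results would run every few seconds or minutes. A result written at the exact moment of the move could in principle be missed, so treat this as a starting point rather than a finished tool.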

Thanks

Brady


