Passive monitoring is running slow?

Jonathan Call jcall at verio.net
Wed May 2 17:07:25 CEST 2007



> -----Original Message-----
> From: Thomas Guyot-Sionnest [mailto:dermoth at aei.ca]
> Sent: Tuesday, May 01, 2007 4:29 PM
> To: Jonathan Call
> Cc: nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] Passive monitoring is running slow?
> 
> On 01/05/07 05:15 PM, Jonathan Call wrote:
> > I have set up a distributed monitoring system per the Nagios
> > documentation.
> >
> > I initially tested it out by having the distributed server monitor
> > only 24 or so services on about 8 hosts. There didn't seem to be any
> > problems.
> >
> > I then cranked it up to 427 services on 81 hosts. I'm watching the
> > distributed server right now and there is hardly any system load but
> > the Service Check Latency seems extremely high:
> >
> > Metric                  Min.        Max.         Average
> > Check Execution Time:   0.05 sec    1.67 sec     0.701 sec
> > Check Latency:          60.40 sec   287.36 sec   184.514 sec
> > Percent State Change:   0.00%       0.00%        0.00%
> >
> > This is resulting in 50% or less of the service checks completing in
> > the 5 minutes or less timeframe.
> >
> > The Central server has had no significant change in performance at
> > all and seems to be receiving and processing everything without
> > difficulty.
> >
> > The nsca server on the central server is running with the following
> > arguments:
> > /usr/local/sbin/nsca --daemon -c /usr/local/etc/nsca.cfg
> >
> > The submit_check_result script on the distributed server is right
> > out of the documentation.
> 
> There are many ways to do that; my favorite (obviously since I wrote
> it :) ) is using the host and service performance data files as named
> pipes, and having a daemon reaping them and batch-sending data to
> send_nsca.
> 
> The howto is here (and I'll be more than happy to answer your
> questions or get your feedback):
> 
> http://www.nagioscommunity.org/wiki/index.php/OCP_Daemon
> 
> It will require Libevent and the Perl module Event::Lib.
> 
> Thomas

So this is a known design failure in Nagios then? I'm fairly new to
Nagios and I am completely dumbfounded at this. If a distributed server
can't handle even a quarter (and probably not even a tenth) of the hosts
and services that a regular active server can, then what is the point of
having a distributed model at all?

I will take a look at your batch sending method.
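
For reference, the batching idea described above can be sketched roughly
as follows. This is not the OCP daemon itself (that is written in Perl
and reads from named pipes); it is only a minimal Python illustration of
collecting several service check results and handing them to a single
send_nsca invocation instead of forking one per check. The function
names, the send_nsca binary path, and the send_nsca.cfg path are
assumptions for illustration; the tab-separated line format and the -H
and -c options are the standard send_nsca ones.

```python
import subprocess
from typing import Iterable, Tuple

# One passive service check result:
# (host_name, service_description, return_code, plugin_output)
Result = Tuple[str, str, int, str]


def format_batch(results: Iterable[Result]) -> str:
    """Build send_nsca input: one tab-separated line per service result."""
    lines = []
    for host, service, code, output in results:
        # Tabs and newlines inside the plugin output would corrupt the
        # line-oriented, tab-delimited format, so flatten them to spaces.
        clean = output.replace("\t", " ").replace("\n", " ")
        lines.append(f"{host}\t{service}\t{code}\t{clean}\n")
    return "".join(lines)


def send_batch(
    results: Iterable[Result],
    central_host: str,
    send_nsca: str = "/usr/local/bin/send_nsca",    # assumed install path
    config: str = "/usr/local/etc/send_nsca.cfg",   # assumed config path
) -> int:
    """Send all results through one send_nsca process; return its exit code."""
    payload = format_batch(results)
    if not payload:
        return 0  # nothing queued, so do not start a process at all
    proc = subprocess.run(
        [send_nsca, "-H", central_host, "-c", config],
        input=payload,
        text=True,
        check=False,
    )
    return proc.returncode
```

A daemon would call send_batch() every few seconds with whatever results
have accumulated, so the cost of starting send_nsca and connecting to
the central server is paid once per batch rather than once per check.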
