high latency

Frost, Mark {PBC} mark.frost1 at pepsico.com
Tue Dec 7 20:20:59 CET 2010


> -----Original Message-----
> From: Andreas Ericsson [mailto:ae at op5.se] 
> Sent: Tuesday, December 07, 2010 9:44 AM
> 
> > Hmm.  So then I'd be so curious why the 2 distservers which are both using
> > oc[sh]p commands the same way have such radically different latencies.
> > 
>
> Agreed. There must be other differences too. Perhaps there's trouble resolving
> from one of the nodes? That usually makes checks run a helluva lot longer than
> they normally have to.

I had another look.  I did find a test host that I'd deliberately made
unreachable, but removing it made no difference.  Execution times
(min/max/avg) are significantly lower on the host with the high
latencies than on the one with low latencies.  I don't see any
unresolvable hosts or, now, any unreachable hosts.
Puzzling.
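
One way to double-check the name-resolution theory from each node is to
time the lookups directly.  The following is only a rough sketch, not
anything Nagios itself provides: the function name time_lookups and the
injectable resolver/clock parameters are made up for illustration, and
the host list would have to come from your own configuration.

```python
import socket
import time


def time_lookups(hosts, resolver=socket.gethostbyname, clock=time.monotonic):
    """Time one name lookup per host; return (seconds, host, ok), slowest first.

    `resolver` and `clock` are injectable (an assumption of this sketch) so
    the function can be exercised without touching the network.
    """
    results = []
    for host in hosts:
        start = clock()
        try:
            resolver(host)
            ok = True
        except OSError:
            # socket.gaierror and socket.herror are subclasses of OSError
            ok = False
        results.append((clock() - start, host, ok))
    results.sort(reverse=True)
    return results


if __name__ == "__main__":
    # "localhost" is only a placeholder; substitute the monitored host names.
    for seconds, host, ok in time_lookups(["localhost"]):
        status = "ok" if ok else "FAILED"
        print(f"{seconds:8.3f}s  {host}  {status}")
```

Running it on both distservers against the same host list should show
whether one node resolves noticeably more slowly than the other.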

I've always wished there was an easy way to see which processes had
high latencies from the web interface without having to view the status.dat
file...
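
Short of a web view, a small script can pull the worst offenders out of
status.dat.  This is a minimal sketch that assumes the Nagios 3 status
file layout (hoststatus/servicestatus blocks of key=value lines with a
check_latency field); the function names are invented for illustration,
and the path to status.dat has to be given on the command line because
it depends on the installation.

```python
import re
import sys

# Only host and service status blocks carry the latency we care about.
BLOCK_START = re.compile(r"^(hoststatus|servicestatus)\s*\{$")


def parse_status(text):
    """Return one dict per hoststatus/servicestatus block in status.dat text."""
    blocks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = BLOCK_START.match(line)
        if match:
            current = {"_type": match.group(1)}
        elif line == "}":
            if current is not None:
                blocks.append(current)
            current = None
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key] = value
    return blocks


def top_latencies(text, limit=10):
    """Return up to `limit` (latency_seconds, name) tuples, highest first."""
    rows = []
    for block in parse_status(text):
        if "check_latency" not in block:
            continue
        name = block.get("host_name", "?")
        if block["_type"] == "servicestatus":
            name += "/" + block.get("service_description", "?")
        rows.append((float(block["check_latency"]), name))
    rows.sort(reverse=True)
    return rows[:limit]


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for latency, name in top_latencies(handle.read()):
            print(f"{latency:10.3f}  {name}")
```

Comparing the output from the two distservers would at least show
whether the latency is spread evenly or concentrated in a few checks.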

> > Either way, you're suggesting that having a NEB module handle the
> > post-check work will eliminate the serialization.

> Yes. Sneaking a peek at what's needed in order for an event to get sent to
> master via an eventbroker compared to running an oc[sh]p command renders
> this, more or less:

> [ good stuff snipped...]

Wow.

> In terms of effort, the difference is sort of like either hopping on one
> leg along the entire Great Wall of China or walking to the kitchen and
> grabbing a beer.

> > 
> > parallelize_check is set to 1 everywhere.
> 
> Does one server have a lot of random service failures? On-demand hostchecks are
> still run in parallel.

I don't think so.  Intermittent you mean?  Not as far as I know or can see.

> > > What version of Nagios are you running?
> > 
> > 3.2.1
> 
> I take it upgrading makes no difference?

To 3.2.3?   I'll probably try that on the new servers, but if things work out I may
just move to Merlin + 3.2.4.  I wasn't sure I saw anything in the 3.2.3 release that
I found compelling for us at the time.  As I say, this system now has fairly high
visibility so just trying something like that would involve a rather painful
internal change process.  It's like piloting the QE2 -- I can't change
course very quickly :-)

> > Thanks, Andreas.  I'm hoping to allocate sufficient resources on the new servers
> > to be able to play with Merlin more there.
> 
> It's quite resource-friendly actually. Well, compared to what you're running now
> it's positively feather-light.

I meant more like installing MySQL everywhere, building filesystems to hold the
MySQL data, etc.  Not so much like I need more memory or more CPUs.  I don't
remember seeing anything in the Merlin docs (maybe I missed it), but how
large would the MySQL database need to be?  Pretty small on each box, right?
Like 500MB or less?

> >  Will I be able to have the performance
> > data from a poller be sent up to a NOC for digestion by pnp4nagios?
>
> Yes, but you'll need the threadsafe version of Nagios you can obtain from either
> CVS or git://git.op5.org/nagios.git for performance-data to work. Actually, you
> need that for Merlin to work.

That's part of the plan.  Any chance that the OP5 site will eventually be
configured to allow git through a proxy?  It's less convenient to
use snapshot tarballs, but still workable, of course.

> >  It may have
> > been a long time ago, but I thought I remember seeing that performance data was
> > not yet implemented.
> > 
> 
> That was then. This is now :)

Spifftacular!

> > No we'd be using some flavor of SLES.
> > 
> 
> Should work marvellously then.

Thanks as always for your help, Andreas.

Mark
