negative check latency with Nagios as VM?

Steve Shipway s.shipway at auckland.ac.nz
Wed Aug 22 02:25:01 CEST 2007


> > VMWare themselves advise not to perform any monitoring which is
> > rate-based on the guest, and further say that any monitoring which
> > polls hardware (eg network card traffic) will cause performance
> > problems, and also that monitoring CPU and Memory on the guest is
> > pointless and misleading and should be done via the virtualcentre or
> > ESX server itself.
>
>    Can you elaborate a little on VMWare's suggestions?  I'm working
> on a project to do just this, and I'd appreciate any references I
> can pass along to my working group.

OK, basically, there are three issues.

Firstly, anything running on the guest which queries hardware directly
(eg, get the network card counters) causes a 'potentially unsafe'
instruction in the guest, which is passed to the SC for authorisation
and verification.  This therefore slows things down a bit and is a much
higher performance hit than on a standalone box.  So, it's not a good
thing to do.

Secondly, any monitoring of the CPU and Memory from inside the guest will
be meaningless, because the virtualisation gives the guest a wrong
impression of things.  You can get a guest thinking it has used 50% CPU,
when in fact the ESX Server will not give the guest any more resources.
Much better to monitor CPU usage and ReadyTime on the ESX Server.  Memory
suffers from the effects of Balloon and ESXSwap memory giving incorrect
usage and swap readings to the guest, and shared/private memory giving
incorrect readings of how much is actually used.  Again, read these at
the Server level to get meaningful data.

Finally, anything rate-based on the guest will be calculated by the
clock, and because of the virtualisation, the guest's clock does not
tick regularly.  Although the minutes will go by evenly, the seconds
won't - you'll get some longer and some shorter.  So, if you measure a
counter, wait ten seconds, measure it again, then take the difference
and divide by 10, it will not reliably give you a per-second rate.  It
will be artificially inflated or reduced depending on how busy the ESX
Server is at that time.  You can, however, retrieve a counter from a
guest and do the rate calculation on a separate server with a
non-virtual clock.
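
As a rough illustration of that last point, here is a minimal Python
sketch of doing the rate calculation on the monitoring host rather than
on the guest.  It is only a sketch under stated assumptions: the names
rate_per_second and sample are made up for this example, the counter is
assumed to be a 32-bit wrapping counter, and fetch_counter stands in for
however you actually retrieve the raw value from the guest (SNMP, NRPE,
etc).  The important part is that both timestamps come from the clock of
the physical box doing the polling, not from the guest.

```python
import time

# Assumption: the counter is a 32-bit value that wraps back to zero.
COUNTER_MAX = 2 ** 32


def rate_per_second(prev_value, prev_time, cur_value, cur_time,
                    counter_max=COUNTER_MAX):
    """Per-second rate between two raw counter samples.

    prev_time and cur_time must be timestamps taken on the monitoring
    host (a physical machine), never on the virtualised guest.
    """
    elapsed = cur_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be taken at increasing times")
    delta = cur_value - prev_value
    if delta < 0:
        # The counter wrapped between the two samples.
        delta += counter_max
    return delta / elapsed


def sample(fetch_counter, clock=time.monotonic):
    """Fetch a raw counter and stamp it with the local (poller) clock.

    fetch_counter is a hypothetical callable that returns the raw
    counter value read from the guest.
    """
    value = fetch_counter()
    return value, clock()
```

Note that the divisor is the elapsed time actually measured on the
poller, not the nominal polling interval, so a late or early poll does
not skew the result.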

I hope this clarifies the issue...

Steve
