how is "Service check Latency" defined in nagios?

Rahul Nabar rpnabar at gmail.com
Mon Feb 9 23:32:43 CET 2009


On Mon, Feb 9, 2009 at 3:58 PM, Max <perldork at webwizarddesign.com> wrote:

> Rahul,
>
> On Mon, Feb 9, 2009 at 2:53 PM, Rahul Nabar <rpnabar at gmail.com> wrote:
> > Thanks Marc. I have: max_concurrent_checks=0
>
> Our experience has been that with max_concurrent_checks set to 0 and
> the inter-check delay and Nagios sleep time set very low, we get high
> reported service check latencies, because we are basically asking
> Nagios to run everything as soon as possible (thousands of checks over
> a few seconds, in essence), which it can't do. As far as real-life
> negative impact goes, the high latency in this particular case hasn't
> meant much. It really worried me at first, until I realized that the
> high service latency only happens because we are telling Nagios to
> pause / sleep / wait for as little time as possible and run things as
> quickly as possible. We have a service check latency of around 146
> seconds, but our detailed Nagios metrics show that check runs complete
> in right around 4 minutes, under our 5-minute hard ceiling (around
> 6000 checks). Our PNP performance graphs bear this out: our reporting
> server receives 6000 metrics in 4 minutes or less, and we have no gaps
> in our graphs and no major under- or over-sampling problems with the
> data we retrieve from our remote agents.
>
> I only bring that up because if you have not only set
> max_concurrent_checks to 0 but also tuned the inter-check delay
> settings and sleep time way down, you might be encountering the same
> situation, and the high latency might not be something to worry about,
> but only IF all your delays are tuned very low and there is no ceiling
> on max checks. In any other situation it is definitely something to
> investigate.
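
The effect described above can be sketched with a toy model. This is only an illustration, assuming "latency" means the gap between the time a check was scheduled to run and the time it actually started. The function name, the single-queue model, and the 0.04 second per-check cost (picked so that 6000 checks take about 4 minutes) are assumptions made for the example, not Nagios internals.

```python
def check_latencies(scheduled_times, seconds_per_check):
    """Return per-check latency (actual start minus scheduled time).

    Toy model: checks are started one after another, and each start
    occupies the scheduler for `seconds_per_check` seconds.
    """
    latencies = []
    next_free = 0.0  # earliest time the scheduler can start another check
    for due in sorted(scheduled_times):
        start = max(due, next_free)      # cannot start before it is due
        latencies.append(start - due)    # how "late" the check ran
        next_free = start + seconds_per_check
    return latencies


NUM_CHECKS = 6000
COST = 0.04  # hypothetical: 6000 checks * 0.04 s = 240 s, i.e. 4 minutes

# Case 1: no delays at all, every check is due at t=0.
burst = check_latencies([0.0] * NUM_CHECKS, COST)

# Case 2: checks spread evenly over a 5-minute (300 s) interval.
spread = check_latencies([i * 300.0 / NUM_CHECKS for i in range(NUM_CHECKS)], COST)

avg_burst = sum(burst) / len(burst)
avg_spread = sum(spread) / len(spread)
print("average latency, all due at once: %.2f s" % avg_burst)
print("average latency, spread out:      %.2f s" % avg_spread)
```

In this model the whole run still finishes in about 4 minutes in both cases, but when everything is due at once the average reported latency comes out at roughly 120 seconds, simply because most checks wait in line behind the others. That is the same order of magnitude as the 146 seconds quoted above, though the real scheduler is more involved than this.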



Thanks Max. That is a pretty intricate issue that I had no idea about! I'm
still trying to figure out the exact implications of what you describe.
Maybe I need to go back to the Nagios manual and re-read Nagios's scheduling
logic. It's especially important to me now that I also have PNP collecting
performance stats.

Meanwhile, here is a dump of the relevant parameters you mention. I don't
recall changing any of them from their defaults.
Maybe I ought to, in light of what you mentioned?

service_inter_check_delay_method=s
host_inter_check_delay_method=s
sleep_time=0.25

#Timeouts:
service_check_timeout=60
host_check_timeout=30
event_handler_timeout=30
notification_timeout=30
ocsp_timeout=5
perfdata_timeout=5
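
A quick way to check whether a set of values like the ones above falls into the "run everything as soon as possible" case from the quoted message is sketched below. It assumes the documented meanings of the inter-check delay methods (n = none, d = dumb 1-second delay, s = smart, or an explicit number of seconds). The helper names and the 0.1 second cutoff for "very low" are hypothetical choices for the example, not anything Nagios defines.

```python
def parse_nagios_cfg(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def delay_is_tuned_low(method, threshold=0.1):
    """True if an inter-check delay method means 'almost no delay'.

    n = no delay, d = dumb (1 s), s = smart (calculated by Nagios),
    anything else is treated as an explicit delay in seconds.
    The threshold is a hypothetical cutoff for this sketch.
    """
    if method == "n":
        return True
    if method in ("s", "d"):
        return False
    try:
        return float(method) < threshold
    except ValueError:
        return False


def matches_run_asap_profile(settings, threshold=0.1):
    """True only if there is no check ceiling AND all delays are very low."""
    try:
        sleep_low = float(settings.get("sleep_time", "")) < threshold
    except ValueError:
        sleep_low = False
    return (
        settings.get("max_concurrent_checks") == "0"
        and delay_is_tuned_low(settings.get("service_inter_check_delay_method", ""), threshold)
        and delay_is_tuned_low(settings.get("host_inter_check_delay_method", ""), threshold)
        and sleep_low
    )


# Values taken from this thread (max_concurrent_checks was given earlier).
posted = parse_nagios_cfg("""
max_concurrent_checks=0
service_inter_check_delay_method=s
host_inter_check_delay_method=s
sleep_time=0.25
""")
print(matches_run_asap_profile(posted))
```

With the smart delay method and sleep_time=0.25, this sketch does not classify the posted settings as the tuned-down case, so by the reasoning in the quoted message a high latency here would still be worth investigating.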

-- 
Rahul