<div class="gmail_quote">On Mon, Feb 9, 2009 at 3:58 PM, Max <<a href="mailto:perldork@webwizarddesign.com">perldork@webwizarddesign.com</a>> wrote: <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"> Rahul, <div class="Ih2E3d"> On Mon, Feb 9, 2009 at 2:53 PM, Rahul Nabar <<a href="mailto:rpnabar@gmail.com">rpnabar@gmail.com</a>> wrote: > Thanks Marc. I have: max_concurrent_checks=0 </div>Our experience has been that with max_concurrent_checks set to 0, and with the inter-check delay and Nagios sleep time set very low, we get high reported service check latencies, because we are basically asking Nagios to try to run everything as soon as possible ... 1000s of checks over a few seconds, in essence ... which it can't do. As far as 'real life' negative impact goes, the high latency in this particular case hasn't meant much. It really worried me at first, until I realized that the high service latency is only happening because we are telling Nagios to pause / sleep / wait for as little time as possible and to run things as quickly as possible. We have around a 146-second service check latency, but from our detailed Nagios metrics we see that check runs are completing in right around 4 minutes, under our 5-minute hard ceiling (around 6000 checks). Our PNP performance graphs confirm our suspicions: our reporting server receives 6000 metrics in 4 minutes or less, and we have no gaps in our graphs and no major under- or over-sampling problems with the data we retrieve from our remote agents. I only bring that up because if you not only have max_concurrent_checks set to 0 but have also tuned your inter-check delay settings and sleep time way down, you might be encountering the same situation, and the high latency might not be something to worry about ... but only IF you have all your delays tuned very low and no ceiling on max checks. For any other situation it is definitely something to investigate.</blockquote><div> Thanks Max. 
That is a pretty intricate issue that I had no idea about! I'm still trying to figure out the exact implications of what you describe. Maybe I need to go back to the Nagios manual and re-read the part on Nagios's scheduling logic. It's especially important to me now that I also have PnP collecting performance stats. Meanwhile, here is a dump of the relevant parameters you mention. I don't recall changing any of them from their defaults. Maybe I ought to, in light of what you mentioned?<br>
service_inter_check_delay_method=s<br>
host_inter_check_delay_method=s<br>
sleep_time=0.25<br>
#Timeouts:<br>
service_check_timeout=60<br>
host_check_timeout=30<br>
event_handler_timeout=30<br>
notification_timeout=30<br>
ocsp_timeout=5<br>
perfdata_timeout=5<br>
</div></div>-- Rahul
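
As a rough sanity check on the figures Max quotes above (about 6000 checks, a full run in roughly 4 minutes, a 5-minute ceiling), the arithmetic can be sketched in a few lines of Python. This is only an illustration of the reasoning in the thread, not anything Nagios itself computes; the function and constant names are made up for this sketch, and the numbers are the approximate ones reported in the quoted message.

```python
# Approximate figures as reported in the quoted message (not measured here).
TOTAL_CHECKS = 6000        # "around 6000 checks"
RUN_SECONDS = 4 * 60       # a full check run completes in "right around 4 minutes"
CEILING_SECONDS = 5 * 60   # the "5 minute hard-ceiling"


def throughput(checks: int, seconds: float) -> float:
    """Average number of checks completed per second over one run."""
    return checks / seconds


def headroom(run_seconds: float, ceiling_seconds: float) -> float:
    """Seconds left between the end of a run and the ceiling."""
    return ceiling_seconds - run_seconds


def keeps_up(run_seconds: float, ceiling_seconds: float) -> bool:
    """True if a full run of checks finishes within the ceiling."""
    return run_seconds <= ceiling_seconds


if __name__ == "__main__":
    achieved = throughput(TOTAL_CHECKS, RUN_SECONDS)
    required = throughput(TOTAL_CHECKS, CEILING_SECONDS)
    print(f"achieved: {achieved:.1f} checks/s, required: {required:.1f} checks/s")
    print(f"headroom: {headroom(RUN_SECONDS, CEILING_SECONDS):.0f} s")
    print(f"keeps up: {keeps_up(RUN_SECONDS, CEILING_SECONDS)}")
```

Read this way, the reported latency is less telling than whether every check still completes inside the interval. With these numbers the run finishes with about a minute to spare, which matches Max's point that the high latency figure was not causing gaps in his graphs.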