Severe performance issue during major network outage

Aidan Anderson mail at aidananderson.co.uk
Fri May 11 23:09:12 CEST 2007


Ton Voon wrote:
> On 11 May 2007, at 20:25, Aidan Anderson wrote:
>
>   
>> First of all, thank you for the replies!
>>
>> The majority of devices that I monitor are routers/VPN devices, and
>> (on the documentation's advice) I have not set active checks on the
>> hosts.  Instead I've added check_ping as a service on each of these
>> hosts to do 5 pings, as follows:
>>
>> check_ping -H $HOSTADDRESS$ -w $ARG1$ -c $ARG2$ -p 5
>>
>> For the host check I already use, as you suggested, a check_ping that
>> only does one ping, as follows:
>>
>> check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 1
>>
>> My understanding was that if the service check failed, it would then
>> abandon the service check altogether and move on to the host check,
>> which is only 1 ping.  Because the service checks are parallelised, it
>> shouldn't matter that they send 5 pings, and a host check of only
>> 1 ping should resolve the bottleneck of serialised host checks.  I'm
>> at a loss as to why performance has been impacted so severely.
>>
>> Maybe I need to abandon the service checks altogether and just have a
>> host check.  I'm reluctant to do this because I get very useful
>> information from 5 pings, i.e. packet loss and high RTA, which is
>> particularly handy for checking volatile links such as ADSL.  Maybe
>> that is the trade-off: fast host checking with no useful stats, or
>> slow host checking with useful stats.
>>     
>
> Just noticed this in your original email:
>
> Host Check Execution Time:       0.03   / 10.04   / 0.843 sec
>
> This means that some of your host checks are taking 10 seconds, which
> is, funnily enough, the timeout period for check_ping. So the -p 1
> check will still take 10 seconds if the routers are not responding.
>
> You can use a timeout flag for check_ping (but it is only supported on
> some OSes). I guess check_icmp is a better bet here.
>
> Ton
>   
Hi Ton,

Well spotted, thank you.  check_icmp here we come :)
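
For anyone following the thread, here is a rough back-of-the-envelope
sketch of why the 10-second timeout hurts so much when host checks run
one after another.  The 10-second figure comes from Ton's reply; the
number of unreachable hosts (50) and the shorter 2-second timeout are
purely hypothetical values chosen for illustration, and the model
ignores everything else the scheduler does.

```python
def serial_check_time(unreachable_hosts: int, timeout_s: float) -> float:
    """Worst-case seconds spent when every unreachable host is checked
    one after another and each check runs until its timeout expires."""
    if unreachable_hosts < 0 or timeout_s < 0:
        raise ValueError("host count and timeout must be non-negative")
    return unreachable_hosts * timeout_s


# Timeout observed in the thread (max host check execution time ~10 s).
CHECK_PING_TIMEOUT_S = 10.0

# Assumptions for illustration only, not taken from the thread.
HYPOTHETICAL_DOWN_HOSTS = 50
HYPOTHETICAL_SHORT_TIMEOUT_S = 2.0

slow = serial_check_time(HYPOTHETICAL_DOWN_HOSTS, CHECK_PING_TIMEOUT_S)
fast = serial_check_time(HYPOTHETICAL_DOWN_HOSTS, HYPOTHETICAL_SHORT_TIMEOUT_S)

print(f"10 s timeout: {slow:.0f} s of serialised host checks")
print(f" 2 s timeout: {fast:.0f} s of serialised host checks")
```

Under those assumptions, a single pass over the dead hosts costs several
minutes at the default timeout, during which other checks queue up
behind it, so the number of pings per check matters far less than how
long each failed check waits before giving up.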

thanks
Aidan


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users