check_ping vs. check_icmp?

Andrew Laden Andrew.Laden at tudor.com
Fri Oct 14 22:37:17 CEST 2005


Ok, grabbed the latest version of the plugins. (My fault on that: I wanted
check_icmp only, so I just grabbed the latest of that. Didn't realize
there was a more recent one in the full package.)
 
If I run with the options you suggest, then I do see the performance
difference. It's the "out of the box" test that is the problem, then. Given
the default parameters, check_fping runs faster on a down host than
check_icmp. You have to tweak it for better performance.

# time ./check_host -H 10.8.10.201
10.8.10.201 is DOWN - rta: nan, lost 100%|pkt=6;5;5;5;5 pl=100%;95;100;;
    3.05s real     0.00s user     0.00s system
# time ./check_fping -H 10.8.10.201
FPING CRITICAL - 10.8.10.201 (loss=100% )|loss=100%;;;0;100
    0.51s real     0.01s user     0.00s system
# time ./check_host -i 100 -w 50.0,%20 -c 100.0,40% -p 1 -H 10.8.10.201
10.8.10.201 is DOWN - rta: nan, lost 100%|pkt=2;1;1;1;1 pl=100%;95;100;;
    0.26s real     0.00s user     0.00s system

I am more worried about the host-down state. Yes, a host check will run when
any service check fails. But on a host that is up, they both return
relatively quickly. It's the host-down case that concerns me, as that is
what will slow Nagios down.

Guess I have to play with the numbers a bit.



-----Original Message-----
From: Andreas Ericsson [mailto:ae at op5.se] 
Sent: Friday, October 14, 2005 3:37 PM
To: 'nagios-users at lists.sourceforge.net'
Subject: Re: [Nagios-users] check_ping vs. check_icmp?

Andrew Laden wrote:
> How does using check_icmp compare to using check_fping?
> 
> It seems that check_fping will return a down answer much faster. Since 
> host checks are most often run when the host is down, that seems to be 
> the performance that we are concerned with.
> 

This might seem to be the case, but it actually isn't. A hostcheck is run
each time a service changes from any state to a non-OK state. In a
(somewhat) healthy network, hostchecks run more often when the host is up
than when it is down. The opposite is of course true if hosts are down for a
long time or if a whole segment of the network goes to lunch, but check_icmp
can sometimes deduce this by means other than simply not getting any OK
responses (it detects routing errors, among other things).

> # time ./check_fping -H em1.dra.tudor.com
> FPING CRITICAL - em1.dra.tudor.com (loss=100% )|loss=100%;;;0;100
>     0.52s real     0.00s user     0.01s system
> # time ./check_icmp  -H em1.dra.tudor.com
> CRITICAL - em1.dra.tudor.com: rta nan, lost 100%|rta=0.000ms;200.000;500.000;0; pl=100%;40;80;;
>     2.96s real     0.00s user     0.00s system
> 
> 


This is due to a couple of different things.
1) A logical error added critical.rta to max_completion_time too many
times in check_icmp. This was fixed some time ago; see
http://oss.op5.se/nagios/op5plugins-2005-10-14.tar.gz for fresh code (fourth
time today I've posted that link).
With the fix in place, check_icmp finishes closer to 0.7 seconds.
2) check_fping sets critical RTA to 100ms, while check_icmp sets it to
500ms. check_icmp can't possibly finish in 0.5 seconds if it has to wait
0.5 seconds to make sure there are no more packets coming in within the
maximum threshold. For fair testing, you should use

check_icmp -i 100 -w 50.0,20% -c 100.0,40% -p 1
check_fping -w 50.0,20% -c 100ms,40% -p 1
check_icmp -i 100 -w 50.0,20% -c 100.0,40% -p 5
check_fping -w 50.0,20% -c 100.0,40% -p 5
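
The timing argument in point 2 can be put into a rough sketch. This is a
simplified model, not the plugin's actual max_completion_time calculation:
it assumes packets go out one per interval and that the check then waits one
critical RTA after the last packet before declaring a silent host down. The
function name and parameters are hypothetical.

```python
def min_down_time_ms(packets, interval_ms, crit_rta_ms):
    """Lower bound (ms) before a host that never answers can be called down.

    Assumption: a simplified model, not check_icmp's real formula.
    """
    # Time spent sending: the first packet leaves at t=0, the rest follow
    # one per interval.
    send_window_ms = (packets - 1) * interval_ms
    # After the last packet, wait the critical RTA in case a reply is late.
    return send_window_ms + crit_rta_ms


if __name__ == "__main__":
    # Critical RTA of 500ms vs. 100ms, as described in point 2;
    # the single packet and 100ms interval are illustrative values.
    print(min_down_time_ms(packets=1, interval_ms=100, crit_rta_ms=500))
    print(min_down_time_ms(packets=1, interval_ms=100, crit_rta_ms=100))
```

Under this model a 500ms critical RTA alone keeps the check above half a
second, however few packets are sent, while a 100ms critical RTA with one
packet leaves room for a result in a few tenths of a second.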

Note that when check_icmp is used in check_host mode it sets thresholds very
differently (-w 2s,100% -c 2s,100%), but you can override this with the usual
-w and -c switches. If used with a hostname rather than an IP address, it
also checks *all* IP addresses connected to the hostname.
This of course also has an impact on timing.

For some more benefits of check_icmp, you can try running check_host -H
193.201.96.45 and check_host -H oss.op5.se. It'll work the same with
check_icmp in normal mode.

Community question here: would it be sane to treat an ICMP_PORTUNREACH from
the intended target host as a valid ICMP response? For hostchecks only,
perhaps?
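
To make the question concrete, here is a minimal sketch of the policy being
proposed. It is only an illustration and does not describe what check_icmp
currently does. The function name, its parameters and the hostcheck flag are
hypothetical; the numeric values are the standard ICMP ones (type 0 is echo
reply, type 3 code 3 is destination unreachable / port unreachable).

```python
ICMP_ECHO_REPLY = 0          # ICMP type 0
ICMP_DEST_UNREACH = 3        # ICMP type 3
ICMP_CODE_PORT_UNREACH = 3   # code 3 within type 3


def counts_as_alive(icmp_type, icmp_code, source_ip, target_ip, hostcheck):
    """Decide whether a received ICMP message is evidence the target is up.

    Hypothetical policy sketch, not check_icmp's implementation.
    """
    # Only messages sent by the intended target itself are considered;
    # an unreachable message from a router says nothing about the host.
    if source_ip != target_ip:
        return False
    # An echo reply always counts.
    if icmp_type == ICMP_ECHO_REPLY:
        return True
    # The proposal: in hostcheck mode, a port-unreachable from the target
    # also shows that its IP stack is answering.
    if (hostcheck and icmp_type == ICMP_DEST_UNREACH
            and icmp_code == ICMP_CODE_PORT_UNREACH):
        return True
    return False
```

Restricting the extra case to hostchecks keeps ordinary service-style checks
strict, so a port-unreachable would still not count as a successful ping
there.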


> -----Original Message-----
> From: Nate Carlson [mailto:nagios at natecarlson.com]
> Sent: Friday, October 14, 2005 1:16 PM
> To: Andreas Ericsson
> Cc: 'nagios-users at lists.sourceforge.net'
> Subject: Re: [Nagios-users] check_ping vs. check_icmp?
> 
> On Fri, 14 Oct 2005, Andreas Ericsson wrote:
> 
>>check_ping executes the external command ping, while check_icmp does 
>>its own fiddling with the ICMP protocol. As a result, check_icmp is 
>>faster, smarter and requires less resources to run.
>>
>>check_icmp can also be used in check_host mode (create a symlink 
>>check_host -> check_icmp and execute check_host) which runs extremely 
>>quickly to determine if a host is up whenever a service check fails.
>>Ordinary check_ping would take 5 seconds to determine that the host is 
>>up in an ordinary setup, while check_host usually does the same trick 
>>in just about the same amount of time as it takes for a packet to make 
>>a round trip to the destination target (usually between 1 and 10 
>>milliseconds on a local network).
>>
>>Considering the fact that service checks aren't executed while host 
>>checks are running, the check_host mode of check_icmp is a fairly 
>>major improvement in terms of overall Nagios performance.
> 
> 
> In other words, check_icmp is certainly worth making the change.  :)
> 
> Thanks - I'll grab the newest version of the plugin pack you mention 
> in later messages, and make the cut!
> 
> ------------------------------------------------------------------------
> | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com |
> |       depriving some poor village of its idiot since 1981            |
> ------------------------------------------------------------------------
> 
> 
> -------------------------------------------------------
> This SF.Net email is sponsored by:
> Power Architecture Resource Center: Free content, downloads, 
> discussions, and more. http://solutions.newsforge.com/ibmarch.tmpl
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when 
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null
> 

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231






