Flap detection for traps? Does it work?

Stanley Hopcroft Stanley.Hopcroft at IPAustralia.Gov.AU
Tue May 20 09:57:04 CEST 2003


Dear Ladies and Gentlemen,

I am writing to request comments on using Nagios 'flap detection' (with 
standard thresholds) with a service that is checked passively (by a 
traphandler injecting a PROCESS_SERVICE_CHECK_RESULT into the Nagios 
command queue).
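
For readers unfamiliar with the mechanism, the following is a minimal sketch of the injection step, not the actual traphandler. It assumes the usual external command syntax, "[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output". The function names, the command file path and the fixed timestamp are hypothetical and chosen only for illustration.

```python
import time

# Assumed location of the Nagios external command file; it varies by install.
ASSUMED_COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_passive_result(host, service, return_code, output, now=None):
    """Build one external-command line carrying a passive service check result."""
    timestamp = int(time.time() if now is None else now)
    # Each command must fit on a single line, so fold any newlines in the output.
    output = output.replace("\n", " ")
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit(line, command_file=ASSUMED_COMMAND_FILE):
    """Write one command line to the command file (a named pipe on a live system)."""
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Example: the CRITICAL (2) result a traphandler might build for a "down" trap.
# The timestamp is an arbitrary epoch value used to make the output repeatable.
example = format_passive_result(
    "ServerIron",
    "SLB castor port reachability trap",
    2,
    "Failed. SLB cannot reach port 389 on real server castor (10.0.100.11).",
    now=1053443615,
)
print(example, end="")
```

On a live system the traphandler would call submit(example) once per trap. The host name and service description must match the service definition exactly, or Nagios cannot associate the result with the service.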

I was hoping to deal with the case where traps pour in with alternating 
UP and DOWN values leading to a notification cascade.

Here is one of the offending service definitions:

define service{
        use                             generic-service

        host_name                       ServerIron
        service_description             SLB castor port reachability trap
        check_period                    none
        notification_period             24x7
        contact_groups                  network-admins
        max_check_attempts              1
        flap_detection_enabled          1
        check_command                   check_ping
        }

And here is what happens after simulating some traps:

Tue May 20 17:13:35 EXTERNAL COMMAND: 
PROCESS_SERVICE_CHECK_RESULT;ServerIron;SLB castor port reachability 
trap;2;Failed. SLB cannot reach port 389 on real server (server failure) 
castor (10.0.100.11).

Tue May 20 17:13:35 EXTERNAL COMMAND: 
PROCESS_SERVICE_CHECK_RESULT;ServerIron;SLB castor port reachability 
trap;0;Ok. SLB can reach port 389 on real server castor (10.0.100.11).

Tue May 20 17:13:35 EXTERNAL COMMAND: 
PROCESS_SERVICE_CHECK_RESULT;ServerIron;SLB castor port reachability 
trap;2;Failed. SLB cannot reach port 389 on real server (server failure) 
castor (10.0.100.11).

Tue May 20 17:13:35 EXTERNAL COMMAND: 
PROCESS_SERVICE_CHECK_RESULT;ServerIron;SLB castor port reachability 
trap;0;Ok. SLB can reach port 389 on real server castor (10.0.100.11).

Tue May 20 17:13:35 EXTERNAL COMMAND: 
PROCESS_SERVICE_CHECK_RESULT;ServerIron;SLB castor port reachability 
trap;2;Failed. SLB cannot reach port 389 on real server (server failure) 
castor (10.0.100.11).

Tue May 20 17:13:35 SERVICE NOTIFICATION: networks;ServerIron;SLB castor 
port reachability trap;CRITICAL;notify-by-epager;Failed. SLB cannot 
reach port 389 on real server (server failure) castor (10.0.100.11).

The global flap detection config directive is enabled (in nagios.cfg).

Obviously I haven't understood the documentation.

Please let me know where to look.

Yours sincerely.
 
-- 
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------

'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'

from Meditation 17, J Donne.


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users