Filtering out false alarms in unreliable network

Jon Angliss jon at netdork.net
Thu Oct 2 09:26:55 CEST 2008


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 1 Oct 2008 18:44:45 +0300 (EEST), "Tuomas Toropainen"
<tuomas.toropainen at lanwan.fi> wrote:

>The problem: how to filter out false alarms caused by short-time breaks in
>an unreliable network.
>
>Think about a simple monitoring scenario in which you only want to ping
>various devices to see if they are up or not. So you have 200 hosts with
>only one service (PING) for each.
>
>For one reason or another, short breaks occur in the network. That is,
>a particular host does not reply to PINGs for e.g. 30 seconds. These
>breaks should not cause a notification to be sent.
>
>For services, the filtering is easy with max_check_attempts and
>retry_check_interval. But the host check becomes a problem: after the
>first PING failure (soft state) the host is checked, and there is no
>retry_check_interval for hosts. So the host is declared to be down
>(almost) immediately.

I'm going to guess you're using Nagios 2.  Nagios 3 has a
retry_interval for both services and hosts.  That said, the lack of
one in Nagios 2 just hints to me that the host check will be executed
"max_check_attempts" times at the "check_interval" rate (both options
exist on a host).  So set max_check_attempts to a reasonable number (3,
for about 3 minutes, for example) and check_interval to another
reasonable number (1, for 1 minute), and that should ride out
30-second "blips" in the network.
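
As a rough sketch of those settings (not a tested config), a host
definition could look something like the one below.  The host name,
address, template name and check command are placeholders I made up
for illustration, and I'm assuming the default interval_length of 60
seconds, so an interval of 1 means one minute.  The retry_interval
line only applies to Nagios 3; drop it on Nagios 2.

```
define host {
    use                   generic-host      ; placeholder template name
    host_name             example-router    ; placeholder host
    address               192.0.2.10        ; placeholder address
    check_command         check-host-alive  ; assumed to be defined already
    max_check_attempts    3    ; failed checks needed before a hard DOWN
    check_interval        1    ; time units between regular host checks
    retry_interval        1    ; Nagios 3 only: time units between retries
}
```

With values like these, a host would have to fail several checks in a
row before going hard DOWN, so a 30-second break should normally
clear before a notification goes out.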

- -- 
Jon Angliss

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32) - GPGshell v3.64

iEYEARECAAYFAkjkd6wACgkQK4PoFPj9H3P/bACdEPhfz6wPYCAqLpYHcU/wI7JZ
OBgAoOqAadpvYuHSQbnGU5zHkbjl85TQ
=jj8S
-----END PGP SIGNATURE-----

