Unneeded alerts from Nagios

Marc Powell marc at ena.com
Wed Dec 5 21:39:48 CET 2007



> -----Original Message-----
> From: nagios-users-bounces at lists.sourceforge.net [mailto:nagios-users-bounces at lists.sourceforge.net] On Behalf Of Monappallil, George
> Sent: Wednesday, December 05, 2007 2:17 PM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Unneeded alerts from Nagios
> 
> hi:
> I have a nagios 2.9 instance running on an ESX linux guest. The problem we
> are seeing is that whenever we lose and regain network connectivity to the
> host, nagios wrongly sends a bunch of server down and server up alerts for
> all the servers that nagios is monitoring.
> This is what my hosts.cfg looks like for a typical host:
> define host{
>         name                            generic-host    ; Generic template name
>         notifications_enabled           1               ; Host notifications are enabled
>         event_handler_enabled           1               ; Host event handler is enabled
>         flap_detection_enabled          1               ; Flap detection is enabled
>         process_perf_data               1               ; Process performance data
>         retain_status_information       1               ; Retain status information
>         retain_nonstatus_information    1               ; Retain non-status information
>         register                        0               ; DONT REGISTER THIS DEFINITION
>         }
> 
> # This creates a generic host that your routers can use
> # monitors host(s) 24x7, notifies on down and recovery, checks 15 times
> # before going critical,
> # notifies the contact_group every 30 minutes
> define host{
>         name                    basic-host
>         use                     generic-host
>         check_command           check-host-alive
>         max_check_attempts      10
>         notification_interval   30
>         notification_period     24x7
>         notification_options    d,r
>         register                0
>         }
> 
> #adelphi
> define host{
>         use                     basic-host
>         host_name               adelphi
>         alias                   adelphi
>         address                 172.xx.xx.xx (intentional)
>         contact_groups          rpfl-it
>         }


> The question I have is why nagios would send DOWN/UP alerts for all the
> hosts it is monitoring when it is only the host it runs on that loses
> connectivity.

The question is, why is this surprising? Your description is that the
machine nagios is running on loses network connectivity. Nagios cannot
reach the network hosts it is monitoring, so it believes them to be down
and sends notifications. You've not given nagios any way to tell
otherwise.

If you're unable to create a more stable environment for nagios
(generally a mission-critical service), I'd recommend creating a
host check for the default gateway and setting that host as the parent
of all your other hosts. If the hosts become unreachable, nagios will
check whether the default gateway is down and notify appropriately.
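
A minimal sketch of that parent setup, reusing the basic-host template
and the adelphi host from the quoted config. The host name default-gw
and the address 192.0.2.1 are placeholders, not values from the original
post; substitute the real gateway.

```
# Hypothetical gateway host; name and address are placeholders
define host{
        use                     basic-host
        host_name               default-gw
        alias                   Default gateway
        address                 192.0.2.1
        contact_groups          rpfl-it
        }

#adelphi, now a child of the gateway
define host{
        use                     basic-host
        host_name               adelphi
        alias                   adelphi
        address                 172.xx.xx.xx
        parents                 default-gw      ; the only new line for this host
        contact_groups          rpfl-it
        }
```

When a host with a parent fails its check, nagios checks the parent too.
If the gateway is down, the hosts behind it are marked UNREACHABLE
rather than DOWN, and since basic-host has notification_options d,r
(no u), only the gateway itself produces a notification.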

--
Marc
