newbie question...

Andrew Davis nccomp at gmail.com
Mon Jun 1 21:00:11 CEST 2009


Marc Powell wrote:
> On Jun 1, 2009, at 3:09 AM, Arnar Þórarinsson wrote:
>
>   
>> So there is no way of getting just the host down alert when a host  
>> goes down ?
>>
>> To explain a little, let's say I'm monitoring CPU, memory and disk  
>> space on a host.
>> The host goes down and Nagios sends an alert by email for the host  
>> down event and
>> also for the CPU, memory and disk space events.  All I need to know  
>> about this event is that the host is down.
>> I just think that it's not necessary to send an alert email about  
>> services on a host that is down.
>>     
>
> And so does Nagios. As I said earlier, Nagios does this automatically.  
> To restate - when a host is down, Nagios suppresses all e-mail  
> notifications about that host's services, but will still display them  
> as down in the GUI. It will only send the host down notification.
>
> The first section of http://nagios.sourceforge.net/docs/2_0/networkreachability.html 
>   states it best. It still applies to 3.x but I haven't found the  
> section that states it as clearly.
>
> --
> Marc
>   
If I'm interpreting your question correctly, you're saying that when one 
of your servers actually goes down, you ARE getting alerts 
(email/SMS/whatever) for more than just the host being down? I see 
what Marc's saying: he's telling you this shouldn't happen. Nagios was 
built to check first that the host is up and reachable and, if it's not, 
to notify you that the host is down without ALERTING you about all the 
host-dependent tests that are now failing. Nagios will still run all the 
tests, they will fail, and the web interface will show more than just 
the HOST DOWN, but the only email/SMS you get should be for the HOST DOWN.

However, you may need to clarify what you mean by *down*. *Down* does 
not always mean off or 100% non-responsive. On *nix systems I've quite 
often seen a server hang, fail, or segfault but still be reachable over 
the network. The reason is that parts of the OS are still in memory, so 
things like pings from remote hosts still get answered even though the 
host as a whole is no longer functional (ISPs see this a lot: the host 
pings, but you can't ssh in, for example). If Nagios can ping the host, 
it considers the host up, so it will run the other tests and alert on 
them. Here's a quick way to narrow this down: turn off the server (shut 
down and pull power). The Nagios web interface should show the host down 
and all tests as failing, but the only email/SMS you should get is the 
host down. If you still get emailed/alerted for the services, then you 
might have a configuration error. Perhaps you didn't properly define 
your host checks as opposed to service checks? Do you have a check_ping 
or check_icmp host check for each host?
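
For reference, a host check in the object config looks roughly like the 
sketch below. Treat it as an illustration only: web01, 192.0.2.10 and 
the admins contact group are placeholders I made up, and I'm assuming 
the check-host-alive command from the sample commands.cfg that ships 
with Nagios, so your names and thresholds may differ.

```
# Hypothetical host definition; host_name, address and contact_groups
# are placeholders, not values from this thread.
define host {
    host_name               web01
    alias                   Example web server
    address                 192.0.2.10
    check_command           check-host-alive    ; the host check, separate from any service check
    max_check_attempts      3
    check_period            24x7
    notification_interval   60
    notification_period     24x7
    notification_options    d,u,r               ; notify on DOWN, UNREACHABLE and recovery
    contact_groups          admins
}

# check-host-alive as it appears in the sample commands.cfg (assumed here).
define command {
    command_name    check-host-alive
    command_line    $USER1$/check_ping -H $HOSTADDRESS$ -w 3000.0,80% -c 5000.0,100% -p 5
}
```

If a host has no check_command at all, I believe Nagios just assumes 
the host is up, which would explain getting service alerts for a box 
that is really down.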

AD