Too stupid? Services are available, but nagios reports host to be down!

Greg King wgking99 at yahoo.com
Wed Apr 9 08:13:51 CEST 2008


> Date: Tue, 8 Apr 2008 09:32:25 -0800
> From: Israel Brewster 
> Subject: Re: [Nagios-users] Too stupid? Services are available, but
> nagios reports host to be down!
> To: Heiko Schlittermann 
> Cc: nagios-users 
> 
> On Apr 8, 2008, at 2:50 AM, Heiko Schlittermann wrote:
> > Hello,
> >
> > (using 3.0.1)
> >
> > I have a list of hosts that do not respond to ping, but normal
> > service checks (SSH, SMTP, ...) work. Nagios reports these hosts as
> > being down! Ugly!
> >
> > If I remember correctly, older Nagios versions "knew" that seeing
> > one working service on a host is enough to know the host has to be up.
> 
> To a degree, yes. If you aren't actively checking the host (as would
> appear to be the case from your next paragraph), then as long as all
> services on the host are listed as OK, Nagios assumes the host is
> still OK (at least once running; I don't know how it behaves on the
> initial check). However, should any of the services go into a non-OK
> state, Nagios will immediately check the host (using the host
> check_command), whereupon, in your case, it would determine the host to
> be down since it can't ping it. The state of the other services does not
> affect this process, and those services do not change state themselves.
> 
> > The host check_command is the normal 'check-host-alive' (which is
> > pinging), the check_interval is 0 -- why does Nagios want to check
> > that host?
> 
> Because at some point one or more of the services went into a non-ok 
> state.
> 
> > The check_command is inherited from some template. If I try to
> > override it with no value, Nagios complains:
> >
> > Error: Host check command '(null)' specified for host 'diwi/diw' is not defined anywhere
> 
> Yep, the check_command directive can't be left without a value. If you
> just want to assume the host is up all the time, you can use the 
> check_dummy plugin (after defining a check_dummy command in your 
> checkcommands.cfg, naturally). Otherwise you'll need to figure out 
> some check Nagios can perform to determine if the host is running, 
> even if that check is just checking one of the services again or 
> something.
> 
> -----------------------------------------------
> Israel Brewster
> Computer Support Technician
> Frontier Flying Service Inc.
> 5245 Airport Industrial Rd
> Fairbanks, AK 99709
> (907) 450-7250 x293
> -----------------------------------------------
> >
> >
> >
> > So - please, could anybody point out my stupidity?
> >
> > Thanks.
> >
> >
> > Best regards from Dresden
> > Viele Grüße aus Dresden
> > Heiko Schlittermann
 
The normal check-host-alive command is a ping, which does not work for hosts that drop ICMP, such as firewalls. For these hosts, use nmap to find an open TCP port (the SSH or HTTP ports are frequently open), then create a second command, say check_host_alive2, that runs check_tcp against that known open port. Either override the default check-host-alive on the hosts in question, or create a new group for these hosts and use the new check_host_alive2 command there.
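
As a rough sketch of what that could look like, assuming the check_tcp plugin is installed under the $USER1$ plugin directory and that port 22 is the one found open. The host name fw1, the address 192.0.2.10 and the generic-host template are placeholders, not taken from the original setup:

```
# Host check that passes when a TCP connect to the given port succeeds.
# Assumption: $USER1$ is the plugin directory set in resource.cfg.
# $ARG1$ is the TCP port known to be open on the host.
define command{
        command_name    check_host_alive2
        command_line    $USER1$/check_tcp -H $HOSTADDRESS$ -p $ARG1$
        }

# Override the inherited ping check for one host that does not answer ICMP.
define host{
        use             generic-host            ; placeholder template name
        host_name       fw1                     ; placeholder host name
        address         192.0.2.10              ; placeholder address
        check_command   check_host_alive2!22    ; 22 = SSH, assumed open
        }
```

Passing the port as an argument keeps a single command definition usable for hosts with different open ports (for example check_host_alive2!80 for a web server).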
 
Regards,
Greg King
www.wgk-consulting.com

