Hosts that report down but aren't

Carroll, Jim P [Contractor] jcarro10 at sprintspectrum.com
Thu Nov 7 23:54:39 CET 2002


From what you've said, there's a flaw in the logic.  Nagios won't go
checking the services if the fundamental host check fails.  Why should it?
It would be creating unnecessary network traffic when the host is in fact
quite down.  (This is a general statement, and not necessarily regarding
your particular situation.)

Having said that, you say that you've disabled the check_ping service.  It
appears you've made the same mistake I made when I first started working
with Nagios: creating a check_ping service where one really isn't needed.
Your host definition already has the "check_command check-host-alive" line,
and in the stock checkcommands.cfg that command is itself a ping check.  With
ICMP blocked, it fails no matter what the services report, so the host shows
as down.  Does this mean you need to stop using it?  Well, yes and no.  Yes,
you need to stop using that particular check, but no, you still need some
sort of host check.

Ah, but how to solve what you're trying to solve....  Some steps to take:

1) Decide which of the services will be *the* key service to qualify as
"this host is up" or "this host is down".  Since ping isn't an option, I
would lean towards the check_ssh check.

2) Create a new command definition (in checkcommands.cfg).  Give it a
unique name, such as check-host-alive-ssh.  Copy the existing check_ssh
definition as a baseline and change only the command_name.  (Read the
check_ssh help to see whether any limits should be raised to suit your
particular environment.)

3) Go back to your host definition.  Change the check_command to read
"check_command check-host-alive-ssh".
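
As a rough sketch of steps 2 and 3 (untested against your setup): this
assumes $USER1$ points at your plugin directory, as in the sample
resource.cfg, and that your existing check_ssh command simply calls the
plugin with -H.  The "-t 10" timeout is only an illustrative value; adjust
or drop it.  The host block is your own definition with one line changed.

```
# checkcommands.cfg: host check based on ssh instead of ping.
# The command_line is an assumption modeled on a typical check_ssh entry;
# copy your real check_ssh command_line here if it differs.
define command{
        command_name    check-host-alive-ssh
        command_line    $USER1$/check_ssh -t 10 -H $HOSTADDRESS$
        }

# Host definition: only the check_command line changes.
define host{
 use                     generic-host
 host_name munged
 alias munged.domain.tld
 address munged.domain.tld
 parents aaa.bbb.ccc.ddd
 check_command check-host-alive-ssh
 max_check_attempts 3
 checks_enabled 1
 failure_prediction_enabled 1
 process_perf_data 1
 notification_interval 180
 notification_period 24x7
 notification_options d,u,r
 notifications_enabled 1
}
```

Keeping the name in the check-host-alive-* style makes it obvious in the
config that this command is meant as a host check, not a service check.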

At this point you should be able to restart Nagios.  You will now have
redundant ssh checks: one as the host check and one as a service check.  You
can leave both in place or remove the service check.  (I'd leave it in, so
that the reports are more meaningful.  It might look odd to have just one
host missing a check_ssh check.)

Let us know how this works out.

jc

> -----Original Message-----
> From: listuser at neo.pittstate.edu [mailto:listuser at neo.pittstate.edu]
> Sent: Thursday, November 07, 2002 3:49 PM
> To: nagios-users at lists.sourceforge.net
> Subject: [Nagios-users] Hosts that report down but aren't
> 
> 
> I'm having a little trouble monitoring a particular host at the moment.
> The host is up as are the 4 services I'm monitoring.  The 4 services are
> reported as up.  The host no longer allows ICMPs so I removed the
> check_ping service (#5).  The host still reports itself as down even
> though its 4 services are up.  I thought the host was assumed to be up if
> one of its services was up.  Is that incorrect?  My config for that host
> is below.  Can anyone see any mistakes?  Thanks
> 
> Justin
> 
> ## munged.domain.tld
> define host{
>  use                     generic-host
>  host_name munged
>  alias munged.domain.tld
>  address munged.domain.tld
>  parents aaa.bbb.ccc.ddd
>  check_command check-host-alive
>  max_check_attempts 3
>  checks_enabled 1
>  failure_prediction_enabled 1
>  process_perf_data 1
>  notification_interval 180
>  notification_period 24x7
>  notification_options d,u,r
>  notifications_enabled 1
> }
> ##
> 
> 
> ## SERVICES
> #define service{
> #        host_name               munged
> #        service_description     ping
> #        check_command           check_ping
> #        contact_groups          munged-admins
> #        max_check_attempts      3
> #        normal_check_interval   3
> #        retry_check_interval    2
> #        check_period            24x7
> #        notification_period     24x7
> #        notification_interval   180
> #        notification_options    u,c,r
> #        notifications_enabled   1
> #}
> 
> define service{
>         host_name               munged
>         service_description     SMTP 
>         check_command           check_smtp
>         contact_groups          munged-admins
>         max_check_attempts      3
>         normal_check_interval   3
>         retry_check_interval    2
>         check_period            24x7
>         notification_period     24x7
>         notification_interval   180
>         notification_options    u,c,r
>         notifications_enabled   1 
> }
> 
> define service{
>         host_name               munged
>         service_description     HTTP
>         check_command           check_http!/
>         contact_groups          munged-admins   
>         max_check_attempts      3
>         normal_check_interval   3   
>         retry_check_interval    2
>         check_period            24x7
>         notification_period     24x7
>         notification_interval   180
>         notification_options    u,c,r
>         notifications_enabled   1
> }
> 
> define service{
>         host_name               munged
>         service_description     SSH
>         check_command           check_ssh
>         contact_groups          munged-admins   
>         max_check_attempts      3
>         normal_check_interval   3
>         retry_check_interval    2   
>         check_period            24x7
>         notification_period     24x7 
>         notification_interval   180
>         notification_options    u,c,r
>         notifications_enabled   1
> }
> 
> define service{
>         host_name               munged
>         service_description     munged Proxy
>         check_command           check_tcp!2048
>         contact_groups          munged-admins   
>         max_check_attempts      3
>         normal_check_interval   3
>         retry_check_interval    2   
>         check_period            24x7
>         notification_period     24x7 
>         notification_interval   180
>         notification_options    u,c,r
>         notifications_enabled   1
> }
> 
> 
> 
> 
> -------------------------------------------------------
> This sf.net email is sponsored by: See the NEW Palm 
> Tungsten T handheld. Power & Color in a compact size!
> http://ads.sourceforge.net/cgi-bin/redirect.pl?palm0001en
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> 

