softstates and retry check intervals

Marc Powell marc at ena.com
Thu Jan 5 16:30:17 CET 2006



> -----Original Message-----
> From: nagios-users-admin at lists.sourceforge.net
> [mailto:nagios-users-admin at lists.sourceforge.net] On Behalf Of Tharanga
> Sent: Wednesday, January 04, 2006 8:03 PM
> To: Danny Russell; nagios-users at lists.sourceforge.net
> Subject: Re: [Nagios-users] softstates and retry check intervals
> 
> Hello Danny,
> 
> Thanks for the immediate reply. Actually, I need to do this:
> "In order to prevent false alarms, Nagios allows you to define how many
> times a service or host check will be retried before the service or host
> is considered to have a real problem."
> 
> But one of my processes hit an NRPE timeout. It gives a critical alert,
> but it's a false alarm. Nagios should check that service 3 times and
> alert only if all three checks fail.
> This is my services.cfg:
> 
> 
> define service{
>        use                             generic-service         ; Name of service template to use
> 
>        host_name                       Linux-PBX
>        service_description             PBX-Asterisk process
>        is_volatile                     0
>        check_period                    24x7
>        max_check_attempts              3
>        normal_check_interval           1
>        retry_check_interval            1
>        contact_groups                  linux-admins
>        notification_interval           240
>        notification_period             24x7
>        notification_options            w,u,c,r
>        check_command                   check_snmp_process!asterisk
>        }

This should do exactly what you want: 3 checks at 1-minute intervals
before notification. This _does_ work with every version of Nagios that
I've ever used (years and years). Are you sure this is the active
config? Could a copy of Nagios still be running with an older version
of the config files? Stop Nagios, use ps to make sure all of its
processes are dead, then restart Nagios.
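
As a rough illustration of the retry behaviour described above, here is a
simplified Python model of how max_check_attempts separates soft states
from hard states. It is only a sketch, not Nagios's actual code: the
function name simulate_service_checks is made up for this example, and
re-notification via notification_interval is deliberately not modelled.

```python
from typing import List, Tuple


def simulate_service_checks(
    results: List[str], max_check_attempts: int = 3
) -> List[Tuple[str, str, bool]]:
    """Simplified model of soft/hard state handling for one service.

    results: plugin results in check order, e.g. ["CRITICAL", "OK"].
    Returns one (result, state_type, notified) tuple per check.
    """
    timeline = []
    attempt = 0            # consecutive non-OK results seen so far
    hard_problem = False   # True once the problem has reached a HARD state

    for result in results:
        if result == "OK":
            # Recovering from a soft problem is itself a soft event.
            soft_recovery = attempt > 0 and not hard_problem
            state_type = "SOFT" if soft_recovery else "HARD"
            # A recovery notification goes out only after a hard problem.
            timeline.append((result, state_type, hard_problem))
            attempt = 0
            hard_problem = False
            continue

        attempt += 1
        if hard_problem:
            # Already in a hard problem state (re-notification not modelled).
            timeline.append((result, "HARD", False))
        elif attempt >= max_check_attempts:
            # Final retry failed: hard state, notification is sent.
            hard_problem = True
            timeline.append((result, "HARD", True))
        else:
            # Still retrying: soft state, no notification.
            timeline.append((result, "SOFT", False))

    return timeline


if __name__ == "__main__":
    # A single timeout followed by a good check never notifies.
    for entry in simulate_service_checks(["CRITICAL", "OK"]):
        print(entry)
    # Three failures in a row notify on the third check only.
    for entry in simulate_service_checks(["CRITICAL"] * 3):
        print(entry)
```

With max_check_attempts set to 3 as in the quoted definition, a single
NRPE timeout followed by a good result stays in a soft state and sends
nothing.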

Another strong possibility is that your host check for Linux-PBX is bad.
The first time Nagios receives a non-OK response for the PBX-Asterisk
process service it will attempt to run the host check_command to make
sure that the host isn't down. If _that_ returns non-OK you'll receive
an immediate host notification (if enabled) and notifications for the
services on that host will be suppressed. Are you receiving a host alert
or a service alert? What do the host definition and its associated
command definition look like?
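
The host-check interaction described above can be sketched the same way.
This is a simplified Python model that follows the description in this
message rather than Nagios's real scheduling logic; the function name
outcome_of_first_failure and the dictionary keys are hypothetical.

```python
from typing import Dict


def outcome_of_first_failure(
    service_result: str,
    host_check_result: str,
    host_notifications_enabled: bool = True,
) -> Dict[str, bool]:
    """Model what happens on the first non-OK result for a service.

    A non-OK service result triggers the host check. If the host check
    is also non-OK, a host notification is sent (if enabled) and the
    service notifications for that host are suppressed.
    """
    if service_result == "OK":
        # Nothing failed, so the host check is not triggered.
        return {
            "host_check_run": False,
            "host_notification": False,
            "service_notifications_suppressed": False,
        }

    host_down = host_check_result != "OK"
    return {
        "host_check_run": True,
        "host_notification": host_down and host_notifications_enabled,
        "service_notifications_suppressed": host_down,
    }


if __name__ == "__main__":
    # A broken host check turns one service timeout into an immediate host alert.
    print(outcome_of_first_failure("CRITICAL", "CRITICAL"))
    # A healthy host check leaves the service to go through its retries.
    print(outcome_of_first_failure("CRITICAL", "OK"))
```

If the first case matches what you see, the alert is coming from the
host check rather than from the service retries.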

--
Marc

