Setting up a passive check problem

Lewis Getschel lgetschel at denver.westerngeco.slb.com
Thu Apr 14 01:46:35 CEST 2005


Thanks to everyone for your input! (Ain't it great when we all help each 
other!)

I've finally 'solved' my passive check issues.

To summarize the fixes:
1) _A_ big issue was that nagios.cfg WAS set to
"accept_passive_service_checks=0", so all _MY_ entries were being
ignored. (bad computer! <smirk>)
    (even though my service had them enabled, the "system" wasn't
accepting them because of that)
accept_passive_service_checks=1  Now it accepts them (good computer!)

2) State retention may very well have been an issue, but losing all that
data whenever I changed the configs and restarted (and getting the 27
"down" notices) made me re-think that, so I keep it.

3) I can't explain _why_ nagios wanted to execute my command even though
it was set to "active_checks_enabled           0", but setting
"check_period                    none" solved that.
check_period                    none
     Now it doesn't schedule any checks (I just have that 'annoying' RED 
"5 services disabled" on the Tactical Overview page, I can live with that)

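(For anyone hitting item 1 later: a small sketch, in Python only because
it is easy to read, of checking the two global switches in nagios.cfg
that passive results depend on. The helper names and the sample text are
made up for illustration. check_external_commands is the other
main-config switch the command file relies on. Treating a missing
directive as a problem is my assumption here, not necessarily what
Nagios defaults to.)

```python
def read_main_config(text):
    """Parse 'key=value' lines from nagios.cfg-style text into a dict."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def passive_checks_blocked(settings):
    """Return the global switches that are not set to 1 (assumed to block passive results)."""
    wanted = ("check_external_commands", "accept_passive_service_checks")
    return [name for name in wanted if settings.get(name) != "1"]


# Hypothetical excerpt, written in the style of nagios.cfg.
sample = """
# excerpt in the style of nagios.cfg
check_external_commands=1
accept_passive_service_checks=0
"""
print(passive_checks_blocked(read_main_config(sample)))
```

With the sample above it reports accept_passive_service_checks, which
is exactly the setting that bit me.
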
My current working service definition is:
define service{
        use                             linux-service
        name                            ibm_diskarray_status
        service_description             ibm_diskarray_status
        active_checks_enabled           0
        passive_checks_enabled          1
        check_command                   check_dummy
        retain_status_information       1
        check_period                    none
        register                        0
        }

Now Nagios is doing what I want it to do. (YEA!!!)
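
(For the archives, a rough sketch of what a submitting script has to
write into the command file: one line of the form
"[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output",
where the timestamp is seconds since the epoch. This is not my actual
script; the function names and the fixed timestamp are made up for
illustration, and the command file path is the one from my setup.)

```python
import time


def passive_result_line(host, service, return_code, output, when=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT line for the command file."""
    if return_code not in (0, 1, 2, 3):
        raise ValueError("return code must be 0 (OK), 1 (WARNING), "
                         "2 (CRITICAL) or 3 (UNKNOWN)")
    # Use the current time unless the caller supplies a timestamp.
    timestamp = int(time.time() if when is None else when)
    return "[%d] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%d;%s\n" % (
        timestamp, host, service, return_code, output)


def submit(line, command_file="/usr/local/nagios/var/rw/nagios.cmd"):
    """Write the line to the external command file (a named pipe)."""
    # Opening the pipe blocks until Nagios has it open for reading.
    with open(command_file, "w") as pipe:
        pipe.write(line)


# Arbitrary example timestamp; a real script would let it default to now.
print(passive_result_line("fs008", "ibm_diskarray_status", 0,
                          "OK - No errors reported", when=1113349035), end="")
```

The service name in the line has to match the service_description
exactly, otherwise Nagios has nothing to attach the result to. submit()
is not called here, since it only makes sense on the Nagios box itself.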

On a side note....
Can someone explain the idea of "register"? As far as I can tell, since
I have "register  0" in my templates, nothing is registered. When would
I want to register something, and what does it get me?

When I _try_ to register this service (as above, but with a 1) and
reload, I get:
          Error: Service description, host name, or check command is NULL
          Error: Could not register service (config file '/usr/local/nagios/etc/general/services.cfg', line 345)
When I change back to zero, it reloads fine...
 ... Just wondering what I'm "missing".

Thanks again to all.


Marc Powell wrote:

>>-----Original Message-----
>>From: Lewis Getschel [mailto:lgetschel at denver.westerngeco.slb.com]
>>Sent: Tuesday, April 12, 2005 5:08 PM
>>To: Marc Powell
>>Cc: Nagios Users
>>Subject: Re: [Nagios-users] Setting up a passive check problem
>>
>>Sorry to describe so much and then leave out my actual problem...
>>
>>Being an impatient person I've changed my services.cfg a little... now
>>they are:
>>
>>services.cfg:
>>define service{
>>        use                             linux-service
>>        name                            ibm_disk_array_status
>>        service_description             ibm_disk_array_status
>>        active_checks_enabled           0
>>        passive_checks_enabled          1
>>        check_command                   check_dummy
>>        check_freshness                 0
>>        register                        0
>>        }
>>
>>same config- hosts.cfg:
>># service definition
>>define service{
>>     use           ibm_disk_array_status
>>     host_name     fs004,fs005,fs006,fs007,fs008
>>}
>>
>>commands.cfg:
>># 'check_dummy' command definition
>>define command{
>>        command_name    check_dummy
>>        command_line    $USER1$/check_dummy 0
>>        }
>
>Yup. Still looks ok.
>
>>Now, If I understand ...
>>the idea of  "active_checks_enabled           0",   means do NOT
>>actually check anything (don't run the command_line defined).
>>the idea of  "passive_checks_enabled          1"   means that nagios
>>will only get updates that I put into the  command_file
>>("/usr/local/nagios/var/rw/nagios.cmd") through another script that is
>
>Correct. Freshness checking will ignore the value of
>active_checks_enabled I believe. That would only come into play if
>you've enabled freshness checking of course.
>
>>called. This much IS working because I see the following line in my
>>event log:
>>[04-12-2005 14:57:15] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;fs008;ibm_disk_array_status;0;OK - No errors reported
>
>This indicates that nagios saw an external command, not necessarily that
>it accepted it. I'm going to guess it did as the next line would have
>been an error of some type if nagios rejected it.
>
>>When I look at the scheduling queue it shows that my service
>>"ibm_disk_array_status" is scheduled to be run!
>>fs004    ibm_disk_array_status    04-12-2005 14:34:16    04-12-2005 14:54:16    ENABLED
>>
>>When I view my fileserver services, it shows:
>>fs004 ibm_disk_array_status          OK 04-12-2005 14:34:16 0d 1h 33m 37s 1/4 Status is OK
>>
>>The problem is that the "Status is OK" message is coming from the
>>check_dummy command, and it _SHOULD_ be "OK - No errors reported" as my
>>external command shows.
>
>This could be explained if you have state retention enabled in
>nagios.cfg. See the notes on Retention at
>http://nagios.sourceforge.net/docs/1_0/xodtemplate.html.
>
>>------------I've done the following commands:---------------
>> $ sudo /etc/rc.d/init.d/nagios stop
>>Stopping network monitor: nagios
>>$ ps -ef | grep nagios | grep -v grep
>>$ sudo /etc/rc.d/init.d/nagios start
>>Starting network monitor: nagios
>>  PID TTY          TIME CMD
>>30767 ?        00:00:00 nagios
>>$ ps -ef | grep nagios | grep -v grep
>>nagios   30767     1  8 15:05 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
>>$
>>-----------------------------------------------------------------------
>>So I don't have an extra copy of nagios running.
>
>Good thinking. It's a common problem.
>
>>Here is what I want to happen:
>>1) tell nagios to accept passive results for these 5 servers, display
>>the last known status value it had for the service
>
>Looks like you've got that configured properly.
>
>>2) don't perform any active checks for whatever I need to specify as a
>>command
>
>Again, it looks like you have that configured properly.
>
>>3) When my script places a status of OK, or CRITICAL (the only 2 cases),
>>accept that as the new status value, and notify as appropriate
>>until/unless the status is changed or the service is acknowledged.
>
>This will happen as a natural occurrence of submitting passive checks.
>
>>4) repeat
>>
>>After all this time, I thought I understood the basic operation of
>>Nagios, but it doesn't seem that I do.
>
>You're close. I'll bet it's state retention that's throwing you, based
>on the information so far.
>
>>(If someone has example configs for a passive service, could you please
>>post your file entries so I can see how someone else does it)
>
>Here's how I do it. Note that I have active checks enabled but the
>check_period set to none. That prevents the annoying X from being
>displayed in the GUI but the command still never gets run as an active
>check.
>
># Generic service definition template
>define service{
>        name                            generic-service
>        active_checks_enabled           1       ; Active service checks are enabled
>        passive_checks_enabled          1       ; Passive service checks are enabled/accepted
>        parallelize_check               1       ; Active service checks should be parallelized
>        obsess_over_service             0       ; We should obsess over this service (if necessary)
>        check_freshness                 0       ; Default is to NOT check service 'freshness'
>        notifications_enabled           1       ; Service notifications are enabled
>        event_handler_enabled           1       ; Service event handler is enabled
>        flap_detection_enabled          1       ; Flap detection is enabled
>        process_perf_data               0       ; Process performance data
>        retain_status_information       1       ; Retain status information across program restarts
>        retain_nonstatus_information    1       ; Retain non-status information across program restarts
>        is_volatile                     0
>        check_period                    none
>        max_check_attempts              4
>        normal_check_interval           5
>        retry_check_interval            3
>        notification_interval           10080
>        notification_period             none
>        notification_options            c,r
>
>        register                        0       ; DONT REGISTER THIS DEFINITION - ITS NOT A REAL SERVICE, JUST A TEMPLATE!
>        }
>
># Host definition
>define host {
>        use                     generic-host
>        host_name               host-name
>        alias                   The Renaissance Center
>        address                 <ip address removed>
>        }
>
>#Service definition
>define service {
>        use                     generic-service
>        host_name               host-name
>        service_description     PING
>        contact_groups          tnops
>        check_command           check_ping
>        }
>
># 'check_ping' command definition
>define command{
>        command_name    check_ping
>        command_line    $USER1$/check_ping $HOSTADDRESS$ 30 60 500.0 1000.0 -p 10 -t 30
>        }
>
>--
>Marc

-- 
Lewis Getschel             | Today is done...
WesternGeco                |     Today was fun...
1625 Broadway              |         Tomorrow is another one.
Denver, CO 80202           |
Direct Phone - 303-389-4407|        -- Dr. Seuss --


