Setting up a passive check problem

Lewis Getschel lgetschel at denver.westerngeco.slb.com
Wed Apr 13 00:08:11 CEST 2005


Sorry to describe so much and then leave out my actual problem...

Being an impatient person, I've changed my services.cfg a little. My
config files are now:

services.cfg:
define service{
        use                             linux-service
        name                            ibm_disk_array_status
        service_description             ibm_disk_array_status
        active_checks_enabled           0
        passive_checks_enabled          1
        check_command                   check_dummy
        check_freshness                 0
        register                        0
        }

hosts.cfg (same config as before):
# service definition
define service{
     use           ibm_disk_array_status
     host_name     fs004,fs005,fs006,fs007,fs008
}

commands.cfg:
# 'check_dummy' command definition
define command{
        command_name    check_dummy
        command_line    $USER1$/check_dummy 0
        }

Now, if I understand ...
"active_checks_enabled           0" means do NOT actually check anything
(don't run the command_line defined).
"passive_checks_enabled          1" means that Nagios will only get
updates that I put into the command_file
("/usr/local/nagios/var/rw/nagios.cmd") through another script that is
called. This much IS working, because I see the following line in my
event log:
[04-12-2005 14:57:15] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;fs008;ibm_disk_array_status;0;OK - No errors reported
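
For anyone following along, here is a minimal sketch of how a script
might write such a line to the command file. The helper name
format_passive_result is made up for illustration, the path is the
command_file quoted above, and "date +%s" is assumed to be available
(it is on GNU and BSD date). It is not my actual script.

```shell
#!/bin/sh
# Sketch only: format_passive_result is a hypothetical helper, not part of Nagios.
# It prints one external-command line in the form
#   [epoch] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output
format_passive_result() {
    # $1 = host_name, $2 = service_description, $3 = return code, $4 = plugin output
    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$(date +%s)" "$1" "$2" "$3" "$4"
}

# Assumed location, taken from the command_file setting mentioned above.
CMD_FILE="${CMD_FILE:-/usr/local/nagios/var/rw/nagios.cmd}"

# Only write when the command file (a named pipe) actually exists.
if [ -p "$CMD_FILE" ]; then
    format_passive_result fs008 ibm_disk_array_status 0 \
        "OK - No errors reported" > "$CMD_FILE"
fi
```

The host_name and service_description fields have to match a service
Nagios already knows about, otherwise the line is ignored.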


When I look at the scheduling queue it shows that my service 
"ibm_disk_array_status" is scheduled to be run!
fs004    ibm_disk_array_status    04-12-2005 14:34:16    04-12-2005 14:54:16    ENABLED

When I view my fileserver services, it shows:
fs004 ibm_disk_array_status          OK 04-12-2005 14:34:16 0d 1h 33m 37s 1/4 Status is OK

The problem is that the "Status is OK" message is coming from the 
check_dummy command, and it _SHOULD_ be "OK - No errors reported" as my 
external command shows.

------------I've done the following commands:---------------
$ sudo /etc/rc.d/init.d/nagios stop
Stopping network monitor: nagios
$ ps -ef | grep nagios | grep -v grep
$ sudo /etc/rc.d/init.d/nagios start
Starting network monitor: nagios
  PID TTY          TIME CMD
30767 ?        00:00:00 nagios
$ ps -ef | grep nagios | grep -v grep
nagios   30767     1  8 15:05 ?        00:00:00 /usr/local/nagios/bin/nagios -d /usr/local/nagios/etc/nagios.cfg
$
-----------------------------------------------------------------------
So I don't have an extra copy of nagios running.

Here is what I want to happen:
1) tell nagios to accept passive results for these 5 servers, display 
the last known status value it had for the service
2) don't perform any active checks for whatever I need to specify as a 
command
3) When my script places a status of OK, or CRITICAL (the only 2 cases), 
accept that as the new status value, and notify as appropriate 
until/unless the status is changed or the service is acknowledged.
4) repeat
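
As a rough sketch of step 3, the two states could be mapped onto the
standard plugin return codes (0 for OK, 2 for CRITICAL) before the
result is submitted. The helper name state_to_code and the example
message text are made up for illustration; this is not my actual script.

```shell
#!/bin/sh
# Sketch only: state_to_code is a hypothetical helper, not part of Nagios.
# Maps the two states the log-watching script reports onto the standard
# plugin return codes (0 = OK, 2 = CRITICAL).
state_to_code() {
    case "$1" in
        OK)       echo 0 ;;
        CRITICAL) echo 2 ;;
        *)        echo 3 ;;  # anything unexpected is treated as UNKNOWN
    esac
}

# Example: build the result line for one server and print it to stdout.
# A real script would redirect this into the Nagios command file instead.
host=fs004
state=CRITICAL
message="CRITICAL - disk array error logged"   # made-up output text
printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;ibm_disk_array_status;%s;%s\n' \
    "$(date +%s)" "$host" "$(state_to_code "$state")" "$message"
```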

After all this time, I thought I understood the basic operation of 
Nagios, but it doesn't seem that I do.
(If someone has example configs for a passive service, could you please
post your file entries so I can see how someone else does it?)

Thanks,

Marc Powell wrote:

>>-----Original Message-----
>>From: nagios-users-admin at lists.sourceforge.net [mailto:nagios-users-
>>admin at lists.sourceforge.net] On Behalf Of Lewis Getschel
>>Sent: Tuesday, April 12, 2005 11:42 AM
>>To: Nagios Users
>>Subject: [Nagios-users] Setting up a passive check problem
>>
>>All-
>>    After 8 months of tweaking our 1.2 system with active checks (that
>>work fine), I now find myself at a loss to set up a passive "service
>>check".
>>
>>I have 5 file servers in a "farm" that log themselves to a single
>>syslog file.
>>I wrote a script that deals with that and can submit the passive
>>result to Nagios to be processed.
>>
>>My problem _seems_ to be my understanding of the basic setup for a
>>passive service check.
>>The docs say: "...service checks to Nagios, a service must have
>>already been defined in the object configuration file
>><http://nagios.sourceforge.net/docs/1_0/configobject.html>"
>
>This means that when you submit an entry to the command file, there must
>be a matching host_name and service_description that nagios already
>knows about or it will be ignored.
>
>>What "check_command" does a passive service "need"? (it needs a
>>command???) I don't want nagios to _DO_ anything, just accept the
>>passive results from another process.
>>
>>When I tried to leave a check_command out, nagios complains "... check
>>command is NULL"
>
>As you can see, there must be one defined. What it is depends on if
>you're going to be using active checks or freshness checking or not. If
>you are going to be using them then the command must be valid as nagios
>will actively execute it to determine the state of the service at the
>expiration of the freshness interval.
>
>If you are not using freshness checking then the command can be anything
>you like. I use the same command that is executed on my distributed
>servers for consistency but it could be check_dummy or any other command
>as it will never actually be run.
>
>>services.cfg:
>>define service{
>>        use                             linux-service
>>        name                            ibm_disk_array_status
>>        service_description             ibm_disk_array_status
>>        active_checks_enabled           0
>>        passive_checks_enabled          1
>>        check_command                   check_passive_disklog
>>        register                        0
>>        }
>>
>>commands.cfg:
>># 'check_passive_disklog' command definition
>>define command{
>>        command_name    check_passive_disklog
>>        command_line    $USER1$/check_passive_disklog
>>        }
>>
>>hosts.cfg:
>>define service{
>>     use           ibm_disk_array_status
>>     host_name     fs004,fs005,fs006,fs007,fs008
>>}
>
>I haven't used this type of construct personally but it looks fine.
>
>>Can someone point out where I'm going wrong to simply allow a service
>>status to be accepted passively, please.
>
>Instead of making an assumption about what your problem is, why don't
>you tell us the symptoms and error messages that you are seeing?
>
>--
>Marc







