Using NRPE to monitor pbs_mom. Error: "NRPE: Unable to read output"

Rahul Nabar rpnabar at gmail.com
Thu Jan 22 22:05:03 CET 2009


I'm a bit confused about how exactly to add stuff with NRPE to monitor local
services on my remote hosts. I got the basics out of the way and I can
already monitor the easy stuff like users, procs, swap etc.

More ambitiously, I wanted to monitor the status of my "pbs_mom" (Torque
Scheduler daemon) on each node in my cluster. I found the script
check_pbsmom.sh on the NagiosExchange (snippet below) and copied it to my
/usr/local/nagios/libexec.

Then I added this line to my nrpe.cfg:

command[check_pbsmom]=/usr/local/nagios/libexec/check_pbsmom

But then I don't seem to have much success.
remotehost>/usr/local/nagios/libexec/check_nrpe -H localhost -c check_pbsmom
NRPE: Unable to read output

If I run the shell script directly, though, it seems to work:
/usr/local/nagios/libexec/check_pbsmom.sh
PBS_MOM OK:  Daemon is running.  Host is listening.
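
One thing that may be worth ruling out here: the script was run by hand as check_pbsmom.sh, while the nrpe.cfg entry names check_pbsmom, and "Unable to read output" is commonly what NRPE reports when it cannot execute the configured command. Below is a minimal sketch of a check that the path in an nrpe.cfg-style command line exists and is executable. The function name check_nrpe_command_path is hypothetical (it is not part of NRPE or the Nagios plugins), and it assumes the plugin path contains no spaces.

```shell
#!/bin/bash
# Sketch only: check_nrpe_command_path is a hypothetical helper, not an NRPE tool.
# It takes one nrpe.cfg-style line, e.g. 'command[name]=/path/to/plugin args',
# and reports whether the plugin path exists and is executable.

check_nrpe_command_path() {
  local line="$1"
  # Drop everything up to and including the first '=' to get the command line.
  local cmdline="${line#*=}"
  # The first word is the plugin path; anything after it is arguments.
  local path="${cmdline%% *}"

  if [ ! -e "$path" ]; then
    echo "MISSING: $path"
    return 2
  elif [ ! -x "$path" ]; then
    echo "NOT EXECUTABLE: $path"
    return 2
  fi
  echo "OK: $path"
  return 0
}
```

It could be called on the remote host with the exact line from nrpe.cfg, ideally as the user the NRPE daemon runs as:

check_nrpe_command_path 'command[check_pbsmom]=/usr/local/nagios/libexec/check_pbsmom'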


What am I doing wrong here? I'm still a bit confused about the interaction
between command.cfg on the monitoring machine and the nrpe.cfg on the remote
host.

Any advice?

-- 
Rahul

#!/bin/bash
# SYNOPSIS
#       check_pbsmom [<TCP port>] [<TCP port>] ...
#
# DESCRIPTION
#       This NAGIOS plugin checks whether: 1) pbs_mom is running and
#       2) the host is listening on the given port(s).  If no port
#       number is specified TCP ports 15002 and 15003 are checked.
#
# AUTHOR
#       Wayne.Mallett at jcu.edu.au

OK=0
WARN=1
CRITICAL=2
PATH="/bin:/sbin:/usr/bin:/usr/sbin"

# If no ports are given, check the defaults: TCP 15002 and 15003.
if [ $# -lt 1 ] ; then
  list="15002 15003"
else
  list="$*"
fi

if [ `ps -C pbs_mom | wc -l` -lt 2 ]; then
  echo "PBS_MOM CRITICAL:  Daemon is NOT running!"
  exit $CRITICAL
else
  for port in $list ; do
    if [ `netstat -ln | grep -E "tcp.*:$port" | wc -l` -lt 1 ]; then
      echo "PBS_MOM CRITICAL:  Host is NOT listening on TCP port $port!"
      exit $CRITICAL
    fi
  done
  echo "PBS_MOM OK:  Daemon is running.  Host is listening."
  exit $OK
fi