check_by_ssh error: Remote command execution failed: You don't exist, go away!

Chris Pepper pepper at cbio.mskcc.org
Fri May 8 17:42:26 CEST 2009


	I'm not finding much current documentation on check_by_ssh -- does it work?

	I have set up the nagios account with /bin/sh as its shell on 2 Linux 
Nagios servers and several clients (RHEL/SuSE/Sol10). I installed 
nagios-plugins-1.4.13 on the clients, and set up public key trust for ssh.

	Unfortunately, Nagios remote checks via check_by_ssh fail from both 
servers to all clients. Here's the debug transcript from a check_disk 
run against jean (Solaris 10):

> [1241794800.756298] [128.1] [pid=20907] External Command Type: 7
> [1241794800.756316] [128.1] [pid=20907] Command Entry Time: 1241794800
> [1241794800.756325] [128.1] [pid=20907] Command Arguments: jean.cbio.mskcc.org;remote-disk-space;1241794797
> [1241794800.756343] [016.0] [pid=20907] Scheduling a non-forced, active check of service 'remote-disk-space' on host 'jean.cbio.mskcc.org' @ Fri May  8 10:59:57 2009
> [1241794801.012178] [016.0] [pid=20907] Attempting to run scheduled check of service 'remote-disk-space' on host 'jean.cbio.mskcc.org': check options=0, latency=4.012000
> [1241794801.012227] [016.0] [pid=20907] Checking service 'remote-disk-space' on host 'jean.cbio.mskcc.org'...
> [1241794801.012255] [2048.1] [pid=20907] **** BEGIN MACRO PROCESSING ***********
> [1241794801.012265] [2048.1] [pid=20907] Processing: '$USER1$/check_by_ssh -H $HOSTADDRESS$ -C '/usr/local/nagios/libexec/check_disk -l -uGB''
> [1241794801.012291] [2048.1] [pid=20907]   Done.  Final output: '/usr/local/nagios/libexec/check_by_ssh -H jean.cbio.mskcc.org -C '/usr/local/nagios/libexec/check_disk -l -uGB''
> [1241794801.012300] [2048.1] [pid=20907] **** END MACRO PROCESSING *************
> [1241794801.012367] [016.1] [pid=20907] Check result output will be written to '/usr/local/nagios/var/spool/checkresults/checkXfUAxl' (fd=7)
> [1241794806.141453] [016.1] [pid=20907] Handling check result for service 'remote-disk-space' on host 'jean.cbio.mskcc.org'...
> [1241794806.141465] [016.0] [pid=20907] ** Handling check result for service 'remote-disk-space' on host 'jean.cbio.mskcc.org'...
> [1241794806.141474] [016.1] [pid=20907] HOST: jean.cbio.mskcc.org, SERVICE: remote-disk-space, CHECK TYPE: Active, OPTIONS: 0, SCHEDULED: Yes, RESCHEDULE: Yes, EXITED OK: Yes, RETURN CODE: 3, OUTPUT: Remote command execution failed: You don't exist, go away!
> [1241794806.141496] [016.1] [pid=20907] Service is in a non-OK state!
> [1241794806.141506] [016.1] [pid=20907] Host is currently UP, so we'll recheck its state to make sure...
> [1241794806.141515] [016.1] [pid=20907] * Using last known host state: 0
> [1241794806.141525] [016.1] [pid=20907] Current/Max Attempt(s): 3/3
> [1241794806.141534] [016.1] [pid=20907] Service has reached max number of rechecks, so we'll handle the error...
> [1241794806.141543] [016.1] [pid=20907] Checking service 'remote-disk-space' on host 'jean.cbio.mskcc.org' for flapping...
> [1241794806.141553] [016.1] [pid=20907] Service is not flapping (0.00% state change).
> [1241794806.141567] [016.1] [pid=20907] Checking host 'jean.cbio.mskcc.org' for flapping...
> [1241794806.141577] [016.1] [pid=20907] Host is not flapping (0.00% state change).
> [1241794806.141593] [032.0] [pid=20907] ** Service Notification Attempt ** Host: 'jean.cbio.mskcc.org', Service: 'remote-disk-space', Type: 0, Options: 0, Current State: 3, Last Notification: Fri May  8 10:02:42 2009
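
	For reading transcripts like the one above: under the standard plugin convention, return codes 0 to 3 mean OK, WARNING, CRITICAL and UNKNOWN, so "RETURN CODE: 3" and "Current State: 3" both indicate UNKNOWN. The sketch below pulls the return code and plugin output out of a "HOST: ..., RETURN CODE: ..., OUTPUT: ..." debug line. The helper name parse_check_result and the regular expression are illustrative assumptions based only on the line format shown in this transcript, not part of Nagios.

```python
import re

# Standard plugin return codes and their state names.
STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# Assumed layout, taken from the debug line quoted above:
#   "... RETURN CODE: <n>, OUTPUT: <text to end of line>"
_RESULT = re.compile(r"RETURN CODE: (\d+), OUTPUT: (.*)$")


def parse_check_result(line):
    """Return (state_name, output) for a check-result debug line, else None."""
    match = _RESULT.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    return STATES.get(code, f"unrecognized ({code})"), match.group(2)


if __name__ == "__main__":
    sample = ("EXITED OK: Yes, RETURN CODE: 3, OUTPUT: "
              "Remote command execution failed: You don't exist, go away!")
    print(parse_check_result(sample))
```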

	But I can run the plugin manually -- nagios@jean trusts nagios@maguro:

> nagios@maguro:~> /usr/local/nagios/libexec/check_by_ssh -H jean.cbio.mskcc.org -C '/usr/local/nagios/libexec/check_disk -l -uGB'
> DISK OK - free space: / 4 GB (55% inode=82%); /etc/svc/volatile 9 GB (99% inode=99%); /lib/libc.so.1 4 GB (55% inode=82%); /var 2 GB (71% inode=95%); /tmp 9 GB (99% inode=99%); /var/run 9 GB (99% inode=99%); /export/home 887 GB (99% inode=99%); /jean 30542 GB (86% inode=99%);| /=3GB;;;0;7 /etc/svc/volatile=0GB;;;0;9 /lib/libc.so.1=3GB;;;0;7 /var=1GB;;;0;3 /tmp=0GB;;;0;9 /var/run=0GB;;;0;9 /export/home=0GB;;;0;897 /jean=4921GB;;;0;35464
> nagios@maguro:~> /usr/local/nagios/libexec/check_by_ssh -H jean.cbio.mskcc.org -C 'id -a'
> uid=108(nagios) gid=108 groups=108
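
	The message "You don't exist, go away!" appears to be what the OpenSSH client prints when it cannot find a passwd entry for the UID it is running as (its getpwuid() lookup fails). Assuming that is the source of the message here, one thing worth comparing between the manual run above and the Nagios-launched run is whether the current UID resolves at all. A minimal Python sketch of that lookup follows; describe_current_user is a hypothetical helper name, and only the standard os and pwd modules are used.

```python
import os
import pwd


def describe_current_user():
    """Return a one-line summary of the passwd lookup for the running UID."""
    uid = os.getuid()
    try:
        entry = pwd.getpwuid(uid)
    except KeyError:
        # No passwd/NSS entry for this UID. This is the condition that,
        # by the assumption stated above, makes ssh refuse to run.
        return f"uid={uid} has NO passwd entry"
    return (f"uid={uid} name={entry.pw_name} "
            f"home={entry.pw_dir} shell={entry.pw_shell}")


if __name__ == "__main__":
    print(describe_current_user())
```

	Running it once from the nagios login shell and once from wherever the Nagios daemon launches its checks (for example, temporarily as a check command) would show whether the two environments see the same account. A difference would point at the daemon's environment rather than at check_by_ssh itself.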

	Any suggestions?

Thanks,

Chris
-- 
Chris Pepper:                <http://cbio.mskcc.org/>
                              <http://www.extrapepperoni.com/>

_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users