check_dns, check_ssh seem to time out....

Andreas Ericsson ae at op5.se
Tue Jul 6 13:32:06 CEST 2004


Karl DeBisschop wrote:
> nemir nemiria wrote:
> 
>> Hiya.
>>
>> I am monitoring two separate data centres, so I set up one Nagios 
>> server in each to monitor all the local servers and each other.
>>
>> I figure that gives me pretty good coverage.  I use gnokii to send 
>> SMS messages, and I don't want to get things doubled up too much.  This 
>> works pretty well.
>>
>> I set up Nagios under OpenBSD 3.5 in the first data centre without a 
>> hitch.
>> Now I am building a similar setup in the second data centre, but 
>> under Fedora Core 2 this time.  I got pretty much everything working 
>> fine, except that my check_dns and check_ssh services for the remote 
>> Nagios server are giving me errors.
>>
>> When I run them from the command line, they return positive responses.
>>
>> The entries in checkcommands.cfg look like:
>>
>> # 'check_ssh' command definition
>> define command{
>>         command_name    check_ssh
>>         command_line    $USER1$/check_ssh -H $HOSTADDRESS$
>>         }
>> # 'check_dns' command definition
>> define command{
>>         command_name    check_dns
>>         command_line    $USER1$/check_dns -H www.mydomain.com -s $HOSTADDRESS$
>>         }
>>
>> Responses from the command-line checks are as follows:
>>
>> etc]# ../libexec/check_ssh -H 192.168.141.5
>> SSH OK - OpenSSH_3.8.1 (protocol 1.99)
>> etc]# ../libexec/check_dns -H www.mydomain.com -s 192.168.141.5
>> DNS ok - 0 seconds response time, Address(es) is/are 192.168.141.5
>>
>> (These are the same when I su - nagios)
>>
>> I am confused as to how a plugin could work from the command line,  
>> but not from the nagios daemon.
> 
> 
> If it works from the command line, but not from the daemon, then there 
> is most likely something in your command-line environment that is not in 
> the environment of the daemon. The path might be different, the SSH agent 
> PID might be missing, an ssh_options file that gets read in one case might 
> not be read in the other, the host might not be in the known_hosts file...
> 

The environment is most likely the problem. Since Nagios is started by 
root and then drops its privileges, it keeps all the environment settings 
of the root user, but runs with the privileges of the nagios user. If the 
checks work after 'su - nagios', you can add that same command to your 
init-script (I've rewritten mine to include it because of this very 
problem) and have Nagios be started by the nagios user.

The command to start Nagios should then look something like this (all on 
one line):
su - nagios -c "/usr/local/nagios/bin/nagios -c /usr/local/nagios/etc/nagios.cfg"
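
To check whether a plugin really depends on something in the environment 
before touching the init-script, one option is to run it with an emptied 
environment and see if it still succeeds. The following is only a rough 
sketch: run_clean is a hypothetical helper name, the PATH value is an 
assumption, and the plugin path in the usage comment is guessed from the 
paths quoted in this thread. It does not reproduce the daemon's exact 
environment; it only shows whether the check needs variables that a 
login shell provides.

```shell
#!/bin/sh
# run_clean (hypothetical helper): run a command with every environment
# variable removed except a minimal PATH (the PATH value is an assumption).
run_clean() {
    env -i PATH=/usr/bin:/bin "$@"
}

# Small demonstration that exported variables do not survive run_clean.
MY_SETTING=demo
export MY_SETTING
run_clean /bin/sh -c 'echo "MY_SETTING=${MY_SETTING:-<unset>}"'
# prints: MY_SETTING=<unset>

# Usage against a plugin (path assumed, adjust to your installation):
#   run_clean /usr/local/nagios/libexec/check_ssh -H 192.168.141.5
```

If the plugin fails under run_clean but works in a login shell, comparing 
the output of 'env' in both cases should narrow down which variable it 
needs.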

I'll submit a rewritten init-script later today that you can use. Having 
processes drop their own privileges when they never need the elevated 
privs is just plain dumb anyways.

> (Some of these may not be exactly applicable to your case; 'su -' should 
> cover some of them. I'm just trying to remember some of the things that 
> have come up as solutions when this question has been asked in the past.)
> 

-- 
Sourcerer / Andreas Ericsson
OP5 AB
+46 (0)733 709032
andreas.ericsson at op5.se


_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users