NRPE vs. check_by_ssh

Charlie Reddington charlie.reddington at gmail.com
Thu Mar 26 20:03:16 CET 2009


On Mar 26, 2009, at 11:05 AM, Kevin Keane wrote:

> Andreas Ericsson wrote:
>> Kevin Keane wrote:
>>> Christopher McAtackney wrote:
>>>> 2009/3/25 Kevin Keane <subscription at kkeane.com>:
>>>>
>>>>> I think you are comparing apples and oranges here, because in most
>>>>> situations that I can think of, the decision is dictated by the
>>>>> network topology. If you are exclusively on a trusted private
>>>>> network, check_by_ssh really doesn't offer any benefits. Conversely,
>>>>> if your topology involves the Internet or some other untrusted
>>>>> network (WiFi), then you wouldn't want NRPE in the first place.
>>>>>
>>>>> The only exception to the above that I can think of is when it
>>>>> comes to deciding between using check_by_ssh over an untrusted
>>>>> network, vs. NRPE through some other kind of tunnel or VPN. But in
>>>>> that case, you'd incur encryption overhead either way, and the
>>>>> comparison is very different from the question you asked.
>>>>>
>>>>> All that said: I don't have any first-hand experience, but I
>>>>> suspect that the impact of establishing 2200 ssh connections in a
>>>>> five-minute span (assuming that you are using a five-minute check
>>>>> interval) is pretty substantial. The main impact actually lies in
>>>>> establishing and tearing down the connections, key negotiations
>>>>> etc.; the encryption during the data phase probably has only
>>>>> limited impact because most checks only transmit a few bytes back
>>>>> and forth.
>>>>>
>>>>> SSH does much better with longer-duration connections when the keys
>>>>> are already exchanged. This is even more true if you have a
>>>>> router-based VPN, because in that case the overhead is offloaded to
>>>>> a different machine.
>>>>>
>>>>> So if you have the option of sending the checks as NRPE through one
>>>>> or a few long-term VPNs: you are probably going to be better off.
>>>>> Of course, in the big picture, your mileage may vary.
>>>>>
>>>> Firstly, thanks for the detailed explanation of the issues involved
>>>> in this choice Kevin, it's been very helpful.
>>>>
>>>> I'm curious though, could you elaborate on why NRPE is unsuitable if
>>>> communication with my remote hosts is going to go via the Internet?
>>>> Is it not sufficient that NRPE uses SSL? This may be more of a
>>>> network security question than a Nagios one, but I've no real
>>>> experience in either area unfortunately, so I appreciate any info
>>>> you can give here.
>>>>
>>> No, you are right. I wasn't aware that NRPE could use SSL. In that
>>> case, NRPE would be pretty much the same in terms of performance
>>> as SSH.
>>>
>>> That said, I am generally concerned from a security standpoint about
>>> any kind of active checks going over the Internet. This is because if
>>> you are monitoring, in your example, 200 hosts, you have to poke
>>> holes into 200 firewalls (or into one firewall, and then set up SSL
>>> or SSH keys on 200 hosts). That's 200 potential security holes all
>>> over the place with little or no control, and on machines that may
>>> not necessarily be hardened for access from the outside world. Worse
>>> - active checks, by nature, cause a program to be launched and
>>> executed on the monitored client, and usually with very high
>>> permissions. You said that you check 2000 services, so that's 2000
>>> plugins (give or take a few). What if a hacker found a way to
>>> compromise one of your 2000 plugins? You'd have a privilege
>>> escalation issue along with remote-launch capability. On 200 clients.
>>>
>>
>> Very high permissions are normally not needed.
> Depends on the plugin, but I'm not sure that this is generally true.
> For instance, something as simple as log file analysis on Linux either
> requires root permission (log files aren't readable by anybody else),
> or requires that you relax file permissions or security somewhere
> else. On Windows, I'm running my monitoring agent (by default) as the
> Local System account (most Windows services do that anyway). That has
> basically full access to everything on the local machine, but nothing
> on the network.

My nagios user only checks basic system stuff, and I haven't run into
a permission error yet. By default I check the following: load, users,
disk, swap, memory, processes, databases, RAID.

>
>
> Of course check_ping, check_tcp etc. don't usually need such high
> permissions.
>> I prefer using NRPE for two reasons:
>> 1. It provides a rather simple way of specifying exactly which
>>    commands can be run, and with which arguments (don't enable
>>    argument parsing in nrpe if the receiving end isn't duly
>>    protected by firewalls etc)
>> 2. If someone breaks into the Nagios server, he or she does not get
>>    the public keys required for running commands on the remote
>>    servers.
> Can you explain that second statement? I'm not sure I follow what you
> are trying to say here. Why would getting public keys be a bad thing?
> They are, by definition, freely available anyway.

What you CAN do, though it's kind of a p.i.t.a., is have a key per
command. So if you have something like check_disk, you can dedicate a
single key to just that command. On all the servers you roll this out
to, you can lock it down with an authorized_keys entry like this (it
has to be on one line, with no space after the comma):

from="172.16.X.X",command="./nagios-plugins/check_disk / -w 10 -c 5" SSHKEY HERE

This way, if your Nagios server does get hacked, the only thing the
attacker can do with that key is find out how much disk space you
have, no matter what other command they throw at it.

When you want to add another check, generate another key and put that
into your authorized_keys as well, with its own forced command.

Note, this is kind of a pain, and if you think you are going to use
the ControlMaster (connection sharing) feature, it will not work very
happily. But if this is a small setup, say under 50 servers, it's not
totally horrible to roll this out. I did this with 200, and once it's
up and running you're set.
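
If you would rather script this than edit authorized_keys by hand, here
is a minimal sketch in Python. The helper name authorized_keys_entry
and the checks table are made up for illustration, and the IP and key
are the same placeholders as above. It only builds the line; it does
not generate keys or install anything.

```python
# Hypothetical helper: builds one authorized_keys entry per Nagios check.
def authorized_keys_entry(source_ip, forced_command, public_key):
    """Return a single authorized_keys line restricted to one command."""
    for label, value in (("source_ip", source_ip),
                         ("forced_command", forced_command)):
        if '"' in value or "\n" in value:
            raise ValueError(f"{label} must not contain quotes or newlines")
    # Options are comma-separated with no spaces between them; the key
    # follows after a single space, and the entry stays on one line.
    options = f'from="{source_ip}",command="{forced_command}"'
    return f"{options} {public_key.strip()}"


# Placeholder values mirroring the example above; one key per check.
checks = {
    "check_disk": "./nagios-plugins/check_disk / -w 10 -c 5",
}

if __name__ == "__main__":
    for name, command in checks.items():
        print(authorized_keys_entry("172.16.X.X", command, "SSHKEY HERE"))
```

Each check would get its own key pair, so the table grows by one line
(and one key) per check.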

>
>>> Because of these concerns, I am using passive checks almost
>>> exclusively over the Internet (except for publicly available services
>>> such as HTTP or SMTP, of course); I wrote an agent that resides on
>>> the client as a wrapper around the excellent NSClient++ and performs
>>> the actual checks. It then forwards the checks to the Nagios server
>>> via NSCA over HTTPS. A second benefit is that this agent collects
>>> about 40 or so check results, and then sends all of them at once
>>> through a single SSL connection. That reduces the overhead of
>>> establishing a secure connection by a factor of 40. BTW, the agent is
>>> available as Open Source. Go to http://www.tntmonitoring.com .
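
To put rough numbers on the batching argument, here is a back-of-envelope
sketch in Python. The function name is made up, and the even split of 11
checks per host is my assumption (2200 checks over 200 hosts); only the
2200, 200, five-minute and 40-per-batch figures come from the thread.

```python
def connections_per_interval(hosts, checks_per_host, batch_size=1):
    """Secure connections needed to deliver every result once."""
    # Ceiling division: a partly filled batch still needs a connection.
    per_host = -(-checks_per_host // batch_size)
    return hosts * per_host


INTERVAL_SECONDS = 5 * 60  # five-minute check interval

if __name__ == "__main__":
    # One client with 40 checks, as in the agent described above.
    print(connections_per_interval(1, 40),
          connections_per_interval(1, 40, batch_size=40))

    # Assumed even spread: 200 hosts with 11 checks each = 2200 checks.
    unbatched = connections_per_interval(200, 11)
    batched = connections_per_interval(200, 11, batch_size=40)
    print(f"{unbatched / INTERVAL_SECONDS:.1f} connections/s unbatched")
    print(f"{batched / INTERVAL_SECONDS:.2f} connections/s batched")
```

This counts connections only; it says nothing about how expensive each
handshake is on a given machine.
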
>>
>> Sounds like a rather neat solution, although I suppose it has to be
>> configured in both ends before it's actually useful (although all
>> other agents require some configuration anyways, so perhaps it's not
>> such a big deal). I'm not too fond of relinquishing the re-check
>> logic of Nagios though, but I guess you can't get everything.
> True, you do lose the recheck logic, and you also lose event handlers
> and probably some other things I'm not thinking of. Actually, that's a
> good point - adding some of these things might be a possible future
> improvement.
>
> As far as the configuration on both ends goes, yes, of course. That's
> probably true for all Nagios checks regardless of what you do. What
> you need server-side is:
>
> - A PHP page that actually accepts the checks and injects them into
>   Nagios (downloadable from the Web site). It is the equivalent of
>   configuring NSCA for regular passive checks.
> - An SSL certificate. This seems to be the trickiest part; I was
>   working with somebody else a couple of days ago who just couldn't
>   get that to work for a long time. The monitoring client clearly was
>   working, but couldn't connect to the server because of a
>   self-signed certificate.
> - And of course you need to add the host and passive checks to
>   Nagios; no way around that! I have all the service definitions in
>   hostgroups rather than individual hosts, so adding the host is as
>   simple as making it a member of the appropriate host groups.
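
The thread doesn't say how the PHP page hands results to Nagios. One
standard route for passive results is the Nagios external command file,
which accepts lines of the form
[timestamp] PROCESS_SERVICE_CHECK_RESULT;host;service;return_code;output.
Below is a minimal Python sketch that only formats such a line; the
function name and the sample host and service are made up, and writing
the line to the command file (its path depends on your nagios.cfg) is
left out.

```python
import time

VALID_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def passive_result_line(host, service, return_code, output, now=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if return_code not in VALID_STATES:
        raise ValueError("return_code must be 0, 1, 2 or 3")
    for label, value in (("host", host), ("service", service)):
        # Semicolons separate the fields and a newline ends the command.
        if ";" in value or "\n" in value:
            raise ValueError(f"{label} must not contain ';' or a newline")
    timestamp = int(time.time() if now is None else now)
    one_line_output = " ".join(output.splitlines())
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{return_code};{one_line_output}\n")


if __name__ == "__main__":
    # Hypothetical names; they must match the host and passive service
    # definitions in the Nagios configuration.
    print(passive_result_line("client01", "Disk C:", 0, "OK - 40% free"),
          end="")
```

The host and service names have to match the definitions in Nagios
exactly, which is the third item in the list above.
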
>
> -- 
> Kevin Keane
> Owner
> The NetTech
> Find the Uncommon: Expert Solutions for a Network You Never Have to  
> Think About
>
> Office: 866-642-7116
> http://www.4nettech.com
>
>
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Nagios-users mailing list
> Nagios-users at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nagios-users
> ::: Please include Nagios version, plugin version (-v) and OS when  
> reporting any issue.
> ::: Messages without supporting info will risk being sent to /dev/null

