NRPE vs. check_by_ssh

Andrew Davis nccomp at gmail.com
Wed Mar 25 21:31:53 CET 2009


Charlie Reddington wrote:
> On Mar 25, 2009, at 2:30 PM, RijilV wrote:
>
>   
>> 2009/3/24 Christopher McAtackney <cristoir at gmail.com>:
>>     
>>> Hi all,
>>>
>>> I was wondering if someone could give a brief overview of the pros /
>>> cons of using NRPE to monitor my remote hosts versus using the
>>> check_by_ssh command?
>>>
>>> I'm aware that check_by_ssh increases the CPU overhead, but I'm not
>>> clear on the level of impact here - does this increase the load on the
>>> monitoring machine in direct relation to the number of hosts being
>>> monitored? For example, if I was using check_by_ssh to monitor, say,
>>> 2000 services spread across 200 hosts, would I experience significant
>>> slowdown on my monitoring machine?
>>>
>>> Cheers for any info,
>>>
>>> Chris
>>>
>>>       
>> SSH is going to slow it down on both sides of the communication.  SSH
>> does quite a bit more in terms of setting up the connection, which
>> involves using asymmetric encryption to set up a shared secret for
>> symmetric encryption, verifying keys for the asymmetric part,
>> verifying access, and allocating a session.  Whereas NRPE, even with
>> encryption, just does a simple pre-shared secret for the symmetric
>> encryption, which is much faster even if using the same encryption
>> algorithm.
>>
>>
>> One thing you could do with SSH to speed it up (and I would argue make
>> it faster than NRPE depending on the stability of your network) would
>> be to use ControlMaster.  ControlMaster is an SSH v2 feature, where you
>> create a connection and can open up multiple sessions with that
>> ControlMaster for other SSH processes.  This saves you not only the
>> key-exchange heavy lifting but also you're not opening up a new socket
>> on the remote host.  In order to really make it worth it you'd have to
>> spawn a process that was continuously connected.  I wrote an ugly
>> check_by_ssh that would spawn a ControlMaster if one didn't exist and
>> use it if it did.  Reduced the load/latency quite a bit for SSH
>> checks.  Though if I had to do it again I'd use 'ControlMaster auto'
>> (man 5 ssh_config) and create a separate check that was responsible
>> for maintaining the ControlMaster, then you could use the stock
>> check_by_ssh without any modifications.
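For what it's worth, the separate keep-alive check described above might look roughly like this in Python. This is only a sketch: the socket path and the function names are made up for illustration, and it assumes an OpenSSH client that supports the -M, -S and -O check options.

```python
import subprocess

# Hypothetical socket location (an assumption, not from the thread).
# %r, %h and %p are ssh tokens for remote user, host and port.
CONTROL_PATH = "/var/lib/nagios/.ssh/master-%r@%h:%p"


def master_check_cmd(host, control_path=CONTROL_PATH):
    """Build the ssh command that asks an existing master if it is alive."""
    return ["ssh", "-S", control_path, "-O", "check", host]


def master_start_cmd(host, control_path=CONTROL_PATH):
    """Build the ssh command that starts a background master (no remote command)."""
    return ["ssh", "-M", "-N", "-f", "-S", control_path, host]


def ensure_master(host, run=subprocess.call):
    """Start a master for host unless one already answers.

    Returns a Nagios-style exit code: 0 (OK) or 2 (CRITICAL).
    The run argument exists only so the function can be exercised
    without a real ssh binary.
    """
    if run(master_check_cmd(host)) == 0:
        return 0
    return 0 if run(master_start_cmd(host)) == 0 else 2
```

The idea would be to schedule ensure_master() as its own check per host. With 'ControlMaster auto' and the same ControlPath in the nagios user's ssh_config, the stock check_by_ssh should then reuse the existing socket instead of negotiating a new connection each time.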
>>
>>
>> That all being said, you might want to think about a distributed setup
>> anyhow, if for nothing more than redundancy.  200 servers and 2,000
>> checks is a lot of responsibility for a singleton; you could break it 50/50
>> between two servers that could take over for the other one if it
>> fails.
>>
>>
>> .r'
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Nagios-users mailing list
>> Nagios-users at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/nagios-users
>> ::: Please include Nagios version, plugin version (-v) and OS when  
>> reporting any issue.
>> ::: Messages without supporting info will risk being sent to /dev/null
>>     
>
> +1 on the ControlMaster. We have about 1000 checks over 300 hosts and
> using ControlMaster made the box much more stable and, quite frankly,
> usable. Saved a lot of plugin timeouts as well.
>
> Think about 1000 checks every 5 or 10 minutes. That's 1000 encrypted
> tunnels that are going up and down. That's a lot of overhead for a
> quick check, let alone if your server is checking say 5 or 10 things
> back to back.
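To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It assumes the checks are spread evenly over the interval and that every check opens its own SSH connection; the function name is just for illustration.

```python
def new_connections_per_second(checks, interval_minutes):
    """Average rate of fresh SSH connections when every check opens its own."""
    return checks / (interval_minutes * 60.0)


for minutes in (5, 10):
    rate = new_connections_per_second(1000, minutes)
    print("%d min interval: %.2f new SSH connections per second" % (minutes, rate))
# 5 min interval: 3.33 new SSH connections per second
# 10 min interval: 1.67 new SSH connections per second
```

With one persistent master per host, those per-check handshakes mostly go away, and what is left is roughly one long-lived connection per monitored host.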
>
> http://www.torchbox.com/blog/ssh_tips_2.html
>
> Charlie
>
FWIW: I use both. I have about 400 internal servers that are considered 
to be "trusted". I have another 50 or so that are outside our network 
(DMZ'd) and untrusted. To keep overhead low, I use NRPE on the internal 
hosts and check_by_ssh for the externals. Internally, using NRPE gives 
me greater flexibility in adjusting client thresholds (mounts to watch, 
varying memory ranges depending on how much is installed, etc). 
check_by_ssh gives me a secured, authenticated way of checking systems
externally (basic sshd_config setup to restrict ssh to the nagios user and
specific IPs only). I'm unwilling to use NRPE on an external, untrusted
server, but don't want the overhead of encryption for internal, trusted 
systems...

  A. Davis
  Email:     nccomp at gmail.com

  "There is no limit to what a man can accomplish
   if he doesn't care who gets the credit." - Ronald Reagan
