<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=ISO-8859-1" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000099">
Charlie Reddington wrote:
<blockquote cite="mid:A981D0AC-0B6C-4954-894F-11AA339DC737@gmail.com" type="cite">
<pre wrap="">On Mar 25, 2009, at 2:30 PM, RijilV wrote:
</pre>
<blockquote type="cite">
<pre wrap="">2009/3/24 Christopher McAtackney <a class="moz-txt-link-rfc2396E" href="mailto:cristoir@gmail.com">&lt;cristoir@gmail.com&gt;</a>:
</pre>
<blockquote type="cite">
<pre wrap="">Hi all,

I was wondering if someone could give a brief overview of the pros / cons of using NRPE to monitor my remote hosts versus using the check_by_ssh command?

I'm aware that check_by_ssh increases the CPU overhead, but I'm not clear on the level of impact here - does this increase the load on the monitoring machine in direct relation to the number of hosts being monitored?

For example, if I was using check_by_ssh to monitor, say, 2000 services spread across 200 hosts, would I experience significant slowdown on my monitoring machine?

Cheers for any info,
Chris
</pre>
</blockquote>
<pre wrap="">
SSH is going to slow it down on both sides of the communication. SSH does quite a bit more in terms of setting up the connection, which involves using asymmetric encryption to set up a shared secret for symmetric encryption, verifying keys for the asymmetric part, verifying access, and allocating a session. Whereas NRPE, even with encryption, just does a simple pre-shared secret for the symmetric encryption, which is much faster even if using the same encryption algorithm.

One thing you could do with SSH to speed it up (and I would argue make it faster than NRPE, depending on the stability of your network) would be to use ControlMaster. ControlMaster is an SSH v2 feature, where you create a connection and can open up multiple sessions with that ControlMaster for other SSH processes.
This saves you not only the key-exchange heavy lifting, but you're also not opening up a new socket on the remote host. To really make it worth it, you'd have to spawn a process that stays continuously connected.

I wrote an ugly check_by_ssh that would spawn a ControlMaster if one didn't exist and use it if it did. It reduced the load/latency quite a bit for SSH checks. Though if I had to do it again I'd use 'ControlMaster auto' (man 5 ssh_config) and create a separate check that was responsible for maintaining the ControlMaster; then you could use the stock check_by_ssh without any modifications.

That all being said, you might want to think about a distributed setup anyhow, if for nothing more than redundancy. 200 servers and 2,000 checks is a lot of responsibility for a singleton; you could break it 50/50 between two servers, each of which could take over for the other if it fails.

.r'

------------------------------------------------------------------------------
_______________________________________________
Nagios-users mailing list
<a class="moz-txt-link-abbreviated" href="mailto:Nagios-users@lists.sourceforge.net">Nagios-users@lists.sourceforge.net</a>
<a class="moz-txt-link-freetext" href="https://lists.sourceforge.net/lists/listinfo/nagios-users">https://lists.sourceforge.net/lists/listinfo/nagios-users</a>
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue.
::: Messages without supporting info will risk being sent to /dev/null
</pre>
</blockquote>
<pre wrap="">
+1 on the control master. We have about 1000 checks over 300 hosts, and using control master made the box much more stable and, quite frankly, usable. It saved a lot of plugin timeouts as well.

Think about 1000 checks every 5 or 10 minutes. That's 1000 encrypted tunnels that are going up and down. That's a lot of overhead for a quick check, let alone if your server is checking, say, 5 or 10 things back to back.
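
For reference, a minimal sketch of the 'ControlMaster auto' setup mentioned earlier in the thread. The host pattern and socket path are illustrative assumptions, not taken from anyone's actual configuration; see man 5 ssh_config for the options themselves.

```
# ~/.ssh/config of the user that runs the checks (assumed location)
# "Host *.example.com" is a placeholder pattern for the monitored hosts.
Host *.example.com
    # Reuse an existing master connection if its socket exists,
    # otherwise become the master for later connections.
    ControlMaster auto
    # One socket per user/host/port; this path is illustrative.
    ControlPath ~/.ssh/cm-%r@%h:%p
```

The socket only lives as long as the master process, so something still has to hold a connection open (for example a background "ssh -M -N -f host"), which is the job the separate maintenance check described above would take on.
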
<a class="moz-txt-link-freetext" href="http://www.torchbox.com/blog/ssh_tips_2.html">http://www.torchbox.com/blog/ssh_tips_2.html</a>

Charlie
</pre>
</blockquote>
FWIW: I use both. I have about 400 internal servers that are considered "trusted". I have another 50 or so that are outside our network (DMZ'd) and untrusted. To keep overhead low, I use NRPE on the internal hosts and check_by_ssh for the externals. Internally, using NRPE gives me greater flexibility in adjusting client thresholds (mounts to watch, varying memory ranges depending on how much is installed, etc.). check_by_ssh gives me a secured, authenticated way of checking systems externally (basic sshd_config setup to restrict ssh to the nagios user and specific IPs only). I'm unwilling to use NRPE on an external, untrusted server, but don't want the overhead of encryption for internal, trusted systems...<br>
<pre class="moz-signature" cols="72">
A. Davis
Email: <a class="moz-txt-link-abbreviated" href="mailto:nccomp@gmail.com">nccomp@gmail.com</a>

"There is no limit to what a man can accomplish if he doesn't care who gets the credit." - Ronald Reagan</pre>
</body>
</html>