check_by_ssh question

Paul L. Allen pla at softflare.com
Fri Mar 26 16:45:23 CET 2004


Andreas Ericsson writes: 

> Paul L. Allen wrote:
>> Andreas Ericsson writes: 
>> 
>> I haven't seen anything that doesn't have local exploits.
> I have. It's called Owl, and you can find it at www.openwall.com/Owl

You can add TeX to that list.  The number of commonly-used server
applications that have never had a local or remote exploit is zero.
And even Owl could have a Ken Thompson back-door in it unless all
work done on it was on machines that were never connected to the
Internet before Owl replaced the standard password system on those
machines, and you'd never know unless you looked at a disassembler
listing. 

> And most patches come out because a problem has been detected.

Which is why I said all you can hope is that you're not one of the
unlucky ones that gets hit before there's a patch.  You can vet the
sources of critical stuff, but even then you're likely to miss
something.  You can detect obvious stuff with automated tools, you
can detect less obvious stuff with human inspection, but you can't
detect stuff that you never imagined could be exploited that way. 

Everyone thought that parellizing RSA cracking didn't decrease the
cost - 10 computers get the result 10 times as fast as 1 computer but
cost 10 times as much.  Recently Bernstein came up with a factoring
algorithm that means parallelizing decreases the cost.  You can be
sure that the NSA came up with it before Bernstein, probably not long
before they agreed to allow the export of products using 1024-bit keys. 

I'm willing to bet that there is at least one mechanism you're absolutely
sure cannot be exploited that some day somebody will find an exploit for,
just as everyone used to be sure that parallelizing RSA factoring didn't
decrease the cost. 

>> Your proposed nagios exploit relies upon far more serious exploits in
>> order to be able to use it in the first place.
>
> Not really, no. A local user can probably find something to elevate his 
> privileges with.

We don't have any local users on our servers who don't already know the
root password.  I wouldn't dream of having a server that is also a
general box for ordinary users to play on.  Even apart from the security
issues, I don't want an ordinary user hogging the CPU or doing something
else that degrades the service. 

> A couple of months ago, a remote exploit was found for 
> apache 1.3.28, and today there was a new issue with 2.0.48 (3 different, 
> actually), which could allow remote users to gain shell access.

There are always remote exploits.  The only way I can be sure apache
will never again be exploited is to turn it off.  That's not a feasible
option.  So I have to live with the risk that somebody will find a new
exploit and attack us before the exploit is patched against. 

>> Your attitude is "Somebody could use an exploit to get root on your
>> monitoring machine, get shell as nagios on the monitored machine and
>> then use the same exploit to get root there, so don't use keys with
>> nagios."
> Not at all. Now you're being downright silly. What I meant was that IF 
> your nagios server gets compromised, the problem _WILL_ spread to every 
> machine you're monitoring with check_by_ssh (unless the attacker is a real 
> git, but that's not to hope for).

You're right that the monitored servers *could* be compromised that
way.  But with no "ordinary" local users then the monitoring server
could only be hacked using a remote exploit which is very likely to
be usable on all the monitored servers too. 

As with not running apache at all, there is a trade-off between
vulnerability and utility.  I moved away from check_by_ssh because I
couldn't get it to tunnel through firewalls, not because I was worried
about a key exploit.  In my situation if somebody is in a position to use
the key exploit it's only because they already have a better and easier
remote exploit in the first place.  So if I ever have to use check_by_ssh
for any reason I'm not going to worry about the potential key exploit too
much.  If there's a way to fix the problem I'll apply it, if not then
it's another daemon I have to risk running, just like I have to risk
running Apache. 

>> There will always be local privilege escalation problems.
>> Not necessarily (unless physical access applies).

Again, I repeat to you that all of our local users know the root password
because the only local users are admins who need to know the root password.
The only way somebody can use a local exploit is if they already have
a remote exploit unless one of our local users sets a bad password, which
is a sackable offence.  Most likely a remote exploit through ssh, imap or
apache (we've had all three exploited over the years) and anything they
could then exploit locally as nagios could also be done by the apache
user or whichever other daemon user they got in as that has a shell. 

>> All you can
>> do is fix them as soon as possible.  The point is that if you have a
>> local exploit that you leave unfixed then you have far more to worry
>> about than a nagios exploit. 
>> 
> Indeed, but that doesn't excuse making it easier for an attacker.

If they can get to nagios on our machines they already have a remote
exploit that is easier for them to use than messing around with the
nagios exploit.  A few machines might not be vulnerable but most of them
run the same mix of stuff. 

>> Those people may choose bad pasawords
> Not necessarily. Ever heard of password policy enforcing?

Yeah, VMS had it back in the 80s. 

>> but we make damned sure root has
>> a strong password and administrators can only get onto the machine 
>> locally
>> or by ssh
> Or by exploiting a remote hole in apache or ssh (remember 'The Matrix Bug' 
> ?)

I remember slapper as well.  But again, if either ssh or apache has a new
exploit and we're unlucky enough to be amongst the first hit then they
don't need the nagios exploit. 

>> and therefore all our monitored machines are
>> likely to have the same vulnerability so the nagios exploit is not 
>> needed.
> So that's an excuse to lead him by the nose to the vulnerable servers?

How can I stop him knowing what the vulnerable servers are when all
he needs to do once he has root on the monitoring box is to scan through
the nagios configs?  Most of the monitored servers run the same mix of
stuff as the monitoring server so if he can get onto the monitoring
box he can get onto most of the others the same way. 

> You DO have a funny way of thinking.

No, a practical way of thinking.  There's no point in installing an
expensive lock on your front door if your back door has no lock at all. 

> And besides, do you honestly run the same 
> type of services on your backup server as you do on your 
> cvs/ftp/web/dns/radius/ldap/samba/whatever server? I think not.

Most of the monitored machines run the same mix of services as the
monitoring machine.  There are a few that don't, so depending on
which service gets remotely-exploited a few of them would be vulnerable
because of the nagios exploit whereas otherwise they wouldn't be (if I
still used check_by_ssh).  Realistically, if all the ones vulnerable to a
remote exploit other than the nagios key exploit were taken out the
business wouldn't survive.  Contract cancellations because we couldn't
get them all cleaned up quickly enough would be enough to kill us, then
there would be penalty clauses... 

>> There have to be compromises between security and utility.  I can make
>> any machine 100% safe from network attacks by unplugging it from the
>> network. But I'd be interested in seeing your kernel patch to allow
>> the check_raid plugin function without sudo.
> I can imagine this script / program working in two ways which won't 
> require sudo to make it run;

Sorry, I had a thinko there.  I meant check_smart. 

>> The only way somebody can do that to us is if they use a remote exploit 
>> to get into the monitoring machine.  If they can do that then they can 
>> use the SAME remote exploit to get into the monitored machines.
> Discussed above. Iteration seems to have become an issue here.

That's because I know our mix of services.  The percentage of machines
we could protect by not permitting the key exploit is so small as to
not be worth worrying over.  So if I had to switch back from NCSA to
check_by_ssh, I wouldn't be worried about the security implications
because anybody who could exploit it already has the capability to knock
out most of our machines without it. 

> Except ofcourse browse the remote machines by using the authorized_keys 
> method described earlier.

Generally not a problem for us.  A problem on a couple of machines,
but they're not monitored anyway. 

>> There are always local exploits.  Which is one reason why none of our
>> network machines have ordinary users who do not also know the root
>> password.  We don't mix the concepts of "server" and "user machine" and
>> I wouldn't trust anyone who did.
> The setsuid binaries needed for the shadow password scheme is why you 
> should change to TCB. Then you'd need an overflow to get to read other 
> users password files but you still wouldn't have a root account. Combined 
> with the password policy enforcement it is not computationally feasible to 
> even try to attack a host with those rules in place.

But that doesn't protect you from all the other possible local exploits.
Far safer not to have ordinary users who don't know the root password on
servers.  In some ways, safer still not to have any local users because
the only reason for connecting to those boxes in the first place is to
do something as root. 

>> I already mentioned that.  First, of course, they have to know that the
>> other customer exists and what we've called their servers in Nagios. 
>> 
> So this would be the first line of exploitation. If the nsca daemon has 
> any bugs, then anybody hacking any of your customers would have something 
> to work with. I'm not saying they'll have any luck, just that they know 
> where to start

Yeah, NCSA worries me.  So does any daemon that accepts incoming
connections.  I wish I could turn them all off, but then we wouldn't
have a business. 

> (and that I for one would try for the fun of it).

Whereas I would not.  Not only because of the legal penalties but also
because I have a degree of respect for others.  Crackers would do it
for the hell of it, I wouldn't expect any sysadmin with a sense of
ethics to.  And I wouldn't trust any software with a team member who
said he'd try hacking into somebody else's machine "for the fun of it",
let alone security software.  I think I'll give owl a miss... 

>> OK, so somebody at customer1 gets the root password for their local
>> nagios machine sending passive service checks to us.  That means they've
>> hijacked a machine we supply and control and can do more damage than
>> submitting fake checks.  Then they somehow guess what other customers
>> we monitor and what we've called their machines.
> Not necessarily. Generating looooong strings containing arbitrary garbage 
> isn't really all that hard, but I guess you know your way around the code 
> in libmcrypt and nsca, so you're already certain there aren't any problems 
> there.

I worry about buffer overflows in everything. 

But again you're talking about needing a remote exploit first in order
to be able to use a local exploit in order to be able to use the key
exploit.  The monitoring server is slightly more vulnerable than the
others because only it runs the NCSA daemon.  But since so few machines
run an NCSA daemon compared to machines that run apache, bind, etc.,
black-hats are less likely to spend time on it.  There is enough source
code in the common services to look through for a cracker to bother
spending time looking at NCSA which will probably get him nothing more
than root on the  monitoring server, if that.  Somebody exploiting NCSA
might get root on the monitoring box (which can be repaired relatively
quickly) whereas somebody exploiting Apache, ssh, bind, etc., etc., could
screw most of our boxes at the same time. 

In an ideal world I'd have the time to thoroughly scrutinize NCSA
and libmcrypt on the off-chance that I might be able to spot a
vulnerability that the developers did not.  The chances of me finding
such would be small and the time needed to do so would be a lot more
than the time needed to rebuilt the monitoring box.  Some organizations
have the need to vet things that thoroughly and the budget to do it.
Others have to live with the chance of having to do a rebuild.  Ideally
every box we supply to clients would come with 24x7 on-site support so
that we don't have to open up sshd on it.  Realistically, that would
put the price beyond their means so I have to live with the risk of
an ssh exploit so we can get onto the box remotely. 

> Give me a break. Anybody with a minimum of programming skill can write a 
> program to relay tcp traffic through more hops than than you have hairs on 
> your head. I've written one myself and it took me about 45 minutes.

So have many crackers. :(  But where somebody exploits through physical
access there may well be signs left behind, which is enough to deter
most people from trying it.  If any of our clients had those sorts of
skills they wouldn't come to us in the first place.  Yes, there's still
a risk that some super cracker might get a job there, but it's one we
have to live with.  If our clients consisted of nothing but major banks
and the like we could be sure that unauthorized physical access was not
a problem. 

> Laws are good sometimes, but only if it's within the jurisdiction of 
> whatever bureau you have enforcing it.

UK law makes it a criminal offence as long as one end (attacker or
attacked machine) is in the UK.  Most countries have extradition laws. 

>> I suppose you've gone through the source of the entire kernel and every
>> utility and application on your machines.
> Actually, I'm an Owl developer which means that's exactly what I do.

So why do exploits keep happening?  Most open-source projects have at
least one developer who is strict about enforcing good coding practises
in changes.  If you know all the existing holes in everything, why
haven't you submitted patches?  And why is there a large team of
programmers vetting all open-source sources for this sort of problem?
Did none of them know that you've already vetted EVERYTHING and they
could have just asked you? 

>> I expect you've paid particular
>> attention to apache, bind, postfix/sendmail/qmail, sshd, etc.
> Naturally. Postfix and sshd have been patched rather extensively, and the 
> patches have been submitted to the main branch. Apache is not quite done 
> yet, so atm I'm running a not fully audited version. We have, however, 
> patched glibc to make exploitation harder.

You're not running a fully-audited version.  You never will, because
new versions will appear faster than you can vet all changes to everything
installed unless you're superhuman.  You've vetted a couple of daemons, and
a couple of critical libraries but not everything that could potentially
be exploited.  But at least you started with the ones most likely to be
compromised.  The day you have everything audited and can keep on top of
changes is the day you can corner the market in distros and probably
topple Microsoft too.  Until then you have to worry about local exploits
and remote exploits, just like the rest of us. 

> No, but it's every admins responsibility to check those that are setuid 
> and those that do networking in an 'out-of-jail' fashion. Also, naturally, 
> the libraries to which any program might be linked.

Then there are all the kernel exploits.  You cannot hope to check
everything by yourself and you cannot check for exploits that rely
upon principles you never thought of. 

BTW, did you disassemble gcc?  There is no other way of being sure
that you don't have a Ken Thompson back-door unless ALL owl development
was done on machines that have never connected to the Internet before
the standard password routines were replaced by Owl on those machines.
Recompiling gcc from source, patched or not, does not get rid of a
Ken Thompson back-door.  So if the Ken Thompson back-door ever
contaminated your distro and your machine was connected to the Internet
before you replaced the standard password routines, there is a small
chance that Owl is contaminated because Ken hopped onto your machine
while you were still using the standard password stuff and modified gcc
so it would also insert his back-door into Owl. 

-- 
Paul Allen
Softflare Support 



-------------------------------------------------------
This SF.Net email is sponsored by: IBM Linux Tutorials
Free Linux tutorial presented by Daniel Robbins, President and CEO of
GenToo technologies. Learn everything from fundamentals to system
administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
_______________________________________________
Nagios-users mailing list
Nagios-users at lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nagios-users
::: Please include Nagios version, plugin version (-v) and OS when reporting any issue. 
::: Messages without supporting info will risk being sent to /dev/null





More information about the Users mailing list