CIS 4930.1194X/6930.1078X Spr.'00
Lecture 2 (Jan. 12) Notes:
Physical Locality and the Speed-of-Light Limit

The Limit Itself

In considering the physical limits that bear on computing, perhaps the foremost and most obvious is the speed-of-light limit that comes out of special relativity: no information can travel faster than light. This is closely tied to the principle of physical locality: an event can directly affect, or be affected by, only the circumstances in its immediate vicinity.

First, a bit of history. Newton's original law of gravitation was phrased in terms of a nonlocal force that instantaneously propagated from each object to all others in the universe. Even Newton, however, suspected that this model was only an approximation to the truth, and that eventually influences would be understood to operate locally and propagate at some limited speed.

That speed, for gravity as well as every other force, turned out to be the speed of light, which was first estimated from astronomical observations in the 17th century and first measured in terrestrial experiments in the 19th century. It's about 3 x 10^8 meters per second. Maxwell subsequently showed that the electromagnetic waves predicted by his theory of electromagnetism automatically travel at the speed of light, which led to the realization that light is a form of electromagnetic wave.

But there's something odd about the idea that physics specifies a constant speed for something. It seemed at first sight to conflict with the long-standing "principle of relativity" of classical mechanics, which states that Newton's laws of motion remain equally valid whether one is moving (at constant velocity) or standing still. In fact, since the laws are exactly the same in all such reference frames, there is no way to tell which one is truly "standing still"; all motion is only relative.

The apparent paradox is this: If I'm moving forward at velocity v (relative to the Earth, say), and I emit a pulse of light which moves forward (relative to me) at the constant velocity c, then won't it necessarily be moving at a different speed v+c relative to the Earth? Doesn't this mean that Maxwell's law of the velocity of light doesn't hold equally well in all reference frames? The trouble was that all experiments seemed to confirm Maxwell's prediction that c really was the same in every frame of reference.

Einstein was of course the first to realize that there was in fact no contradiction: the principle of relativity and the constancy of the speed of light could be made perfectly compatible, as long as one dropped certain hitherto-implicit assumptions about the nature of space and time. Namely, the magnitude of the spatial and temporal gap measured between any two events varies depending on which reference frame you are in, in a certain well-defined way. Although the speed of light is indeed constant, as was required by Maxwell's laws, space and time themselves are not. What a radical concept! But it turned out to be absolutely correct, and has since been confirmed beyond a shadow of a doubt.
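One consequence of that "well-defined way" is that velocities no longer simply add. Here is a minimal Python sketch of the standard relativistic velocity-addition formula for motion along a single line. The function name add_velocities and the sample speeds are illustrative choices, not something from the lecture.

```python
C = 299_792_458.0  # speed of light in m/s


def add_velocities(v, u):
    """Speed, relative to the Earth, of something that moves at u relative
    to a source which itself moves at v relative to the Earth.
    Both motions are assumed to be along the same straight line."""
    return (v + u) / (1.0 + v * u / C**2)


# Everyday speeds: the result is extremely close to the naive sum v + u.
print("30 m/s and 10 m/s combine to", add_velocities(30.0, 10.0), "m/s")

# A light pulse (u = c) emitted by a source moving at half the speed of light.
print("0.5c and c combine to", add_velocities(0.5 * C, C) / C, "times c")

# Two large sub-light speeds still combine to less than c.
print("0.9c and 0.9c combine to", add_velocities(0.9 * C, 0.9 * C) / C, "times c")
```

For the light pulse the formula returns c rather than v+c, which is how the apparent paradox is resolved.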

We don't have time to explain the whole theory of relativity here. But suffice it to say it turns out to have all sorts of amazing consequences. There is of course the equivalence of matter and energy, the understanding of which led to nuclear power and nuclear weapons. There is the effect that observers moving relative to each other will see each other's lengths (in the direction of motion) shrunken, and will see each other's clocks running slow. (While each observer looks perfectly normal to himself.) This effect sounds paradoxical, but actually it's not.

Anyway, the most important consequence of special relativity for our purposes is this: Nothing can travel faster than light. There are several ways to try to explain why.

First, if you try to accelerate a mass to the speed of light, its mass asymptotically approaches infinity as it nears the speed of light, so that even reaching the speed of light would require infinite energy, and accelerating beyond the speed of light would require something like "more than infinite energy," whatever that means.

Second, there's the problem that an object going faster than light would have a length that is an imaginary number - the square root of a negative number - and this doesn't seem to make any sense.
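Both of these arguments come down to the quantity 1 - v^2/c^2, which appears under a square root in the standard formulas. The sketch below is illustrative only (the function names and the 1 kg test mass are my own choices). It shows the energy needed to reach a given speed growing without bound as v approaches c, and the quantity under the square root turning negative beyond c.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def one_minus_beta_squared(v):
    """The quantity 1 - v^2/c^2 that sits under the square root."""
    beta = v / C
    return 1.0 - beta * beta


def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v^2/c^2); only defined for speeds below c."""
    s = one_minus_beta_squared(v)
    if s <= 0.0:
        raise ValueError("no real Lorentz factor at or above the speed of light")
    return 1.0 / math.sqrt(s)


def kinetic_energy(rest_mass_kg, v):
    """Relativistic kinetic energy (gamma - 1) * m * c^2, in joules."""
    return (lorentz_factor(v) - 1.0) * rest_mass_kg * C**2


for fraction in (0.5, 0.9, 0.99, 0.999, 0.999999):
    v = fraction * C
    print(f"v = {fraction} c: gamma = {lorentz_factor(v):.3f}, "
          f"energy to accelerate 1 kg = {kinetic_energy(1.0, v):.3e} J")

# Beyond c the quantity under the square root is negative, so lengths
# (which scale as its square root) would come out imaginary.
print("1 - v^2/c^2 at v = 2c:", one_minus_beta_squared(2.0 * C))
```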

Third, and to me most powerfully, there is a theorem that if you have any general capability of going faster than light, using whatever means, and if the principle of relativity is correct, then necessarily you also have the capability of going backwards in time. That is, if you can go faster than light in one reference frame, then to an observer in another reference frame, you are going backwards in time, and this of course results in crazy things such as effects preceding their causes, and paradoxes such as a time traveler going back in time and preventing himself from making the trip in the first place.
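This third argument can be checked numerically with the standard Lorentz transformation of time, t' = gamma (t - v x / c^2). The sketch below uses units in which c = 1 and a hypothetical signal that travels at twice the speed of light. The signal speed and the list of observer speeds are assumptions chosen for illustration.

```python
import math


def time_in_moving_frame(t, x, v):
    """Time coordinate of the event (t, x) as seen by an observer moving at
    speed v along the x axis. Units are chosen so that c = 1."""
    if abs(v) >= 1.0:
        raise ValueError("observer speed must be below the speed of light")
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x)


# Hypothetical faster-than-light signal: sent from x = 0 at t = 0,
# received at t = 1 after covering a distance of 2 (speed 2c).
signal_speed = 2.0
t_receive = 1.0
x_receive = signal_speed * t_receive

for v in (0.0, 0.25, 0.5, 0.75, 0.9):
    t_prime = time_in_moving_frame(t_receive, x_receive, v)
    if t_prime > 0:
        order = "received after it was sent"
    elif t_prime < 0:
        order = "received BEFORE it was sent"
    else:
        order = "received at the same instant it was sent"
    print(f"observer speed {v} c: reception time {t_prime:+.3f} ({order})")
```

For a signal at speed u, the order of sending and receiving flips for any observer moving faster than c^2/u, which is still below c whenever u exceeds c. For a signal slower than light, no allowed observer speed can flip the order.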

Still, the intuition that something (like time travel) is paradoxical is not a rigorous proof; and it's hard to make the proof rigorous since we don't have a complete, unified theory of physics to work from. So, there are physicists who are still actively investigating the possibility that some sort of faster-than-light travel (and/or time travel) might still be possible within physics. Some suggest that perhaps if you go back in time, you also go to an alternate quantum universe, and thereby avoid the paradox of changing what has already happened in your own universe.

But in my view, this is all very unlikely to work out; it seems that every idea for getting around the speed-of-light limit that anyone has come up with has been fairly quickly ruled out for one reason or another.

I included an article describing some of the problems with faster-than-light travel in the readings; if you're interested, you can delve into this further by reading the article and its references. Many articles on this and all areas of physics are available on the web through the server at arXiv.org.

Now, more seriously, let's look at what the speed of light limit means for computing. First, it's important to note that it's not just a limit on material objects: It's a limit on information. No information can be conveyed faster than the speed of light.

In fact, Einstein thought there was a problem with quantum mechanics, because it seemed to him that certain quantum phenomena could only be explained by a faster-than-light "action at a distance." But Einstein was wrong; it has since been proven that the situation he described cannot be used to communicate information faster than light, and moreover, the modern view of quantum mechanics understands it as being a purely local theory, with local interactions only.

The sole type of exception to the speed-of-light limit that seems to be allowed in physics is this: the expansion of the whole universe means that distant parts of it may actually be receding from us faster than light. This does not contradict relativity, because the speed-of-light constraint, like all physical laws, determines local interactions. The law is really that you can't travel faster than light relative to your local surroundings. But for distant parts of the universe, the very space in between us and them is expanding, in accordance with the laws of general relativity, and as a result the distance is increasing faster than light. This does not lead to any paradoxes because those distant parts of the universe are causally disconnected from us: Nothing there can affect us, and vice-versa. So you can't use this sort of "faster-than-light" phenomenon to set up any time travel paradoxes.

In any event, regardless of how things might eventually work out in a complete theory of physics, the speed of light will certainly be an absolute constraint on computing technology for all practical purposes for the foreseeable future. We can be certain of this with a very high degree of confidence.

Impact on Computing

What's the implication for computing? At first it might seem that the speed of light is so great that it almost doesn't matter. But this isn't true. Today we are witnessing the development of computer chips with clock speeds of over 1 GHz. This is 10^9 cycles per second, or 1 cycle per nanosecond. In a nanosecond, light can only travel 30 cm (about 1 foot). So if you want to, for example, get a response from a random-access memory within 1 cycle, none of that memory can be more than 15 cm or 6 in away from the central processor.

If you use electrical signals rather than light for communication, it's even worse, because in practice electrical signals really only propagate at about half the speed of light (depending on the materials), so really none of your memory can be more than 3 in away.

And of course, this is merely accounting for communication time, and ignores whatever time the memory takes to process the request. The constraint is only going to get worse as technology continues developing along the Moore's law path. If even just one more factor of 10 improvement is possible, then a one-cycle round trip over electrical wires will be able to reach memory no more than about 7.5 millimeters away.
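The arithmetic in the last few paragraphs can be reproduced with a short Python sketch. The function name max_one_way_distance and its parameters are illustrative. It assumes the request and the reply together must fit in one clock cycle, and it ignores the memory's own processing time, as the text does.

```python
C = 299_792_458.0  # speed of light in m/s


def max_one_way_distance(clock_hz, signal_fraction=1.0):
    """Farthest distance (in meters) at which memory can sit if a request
    and its reply must both complete within one clock cycle.

    signal_fraction is the signal speed as a fraction of c
    (1.0 for light, roughly 0.5 for typical electrical signals)."""
    cycle_time = 1.0 / clock_hz
    round_trip_distance = signal_fraction * C * cycle_time
    return round_trip_distance / 2.0


cases = [
    ("1 GHz, light", 1e9, 1.0),
    ("1 GHz, electrical (half of c)", 1e9, 0.5),
    ("10 GHz, electrical (half of c)", 10e9, 0.5),
]
for label, clock_hz, fraction in cases:
    d = max_one_way_distance(clock_hz, fraction)
    print(f"{label}: {d * 100:.2f} cm ({d / 0.0254:.2f} in)")
```

With the exact value of c, the three cases come out close to the 15 cm, 3 in, and 7.5 mm figures quoted above.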

And in fact, we'll see on Friday that there are limits to information density as well, so these two facts together imply a direct limit on the size of a 1-cycle-latency random access memory.

So, what's the implication of all this for computer architecture? Simply that if you want to build a computer that's larger than this size limit, most of it will be idle, inaccessible, at any given time, unless you disperse processing elements throughout this space, in addition to memory. That is, for optimum efficiency you are forced to depart from the traditional von Neumann uniprocessor architecture, and instead use a parallel, multiprocessing architecture, where each processor is associated with some local memory.

Not only that, but the speed of light limit ultimately constrains what interconnection topology you can have in a scalable multiprocessor architecture. The number of processing nodes reachable in n hops from any node cannot grow faster than order n^3 if the network is to be embedded in 3-D space with a constant time per hop. This is the thrust of Vitanyi's paper in the readings.
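To get a feel for this bound, the sketch below counts the nodes reachable within n hops in an unbounded 3-D mesh (each node linked to its six nearest neighbors, an illustrative assumption) and compares that with a complete binary tree counted from its root. The function names are my own.

```python
def mesh_nodes_within(n):
    """Nodes reachable in at most n hops from a node of an unbounded 3-D
    mesh: lattice points (x, y, z) with |x| + |y| + |z| <= n."""
    count = 0
    for x in range(-n, n + 1):
        for y in range(-n, n + 1):
            for z in range(-n, n + 1):
                if abs(x) + abs(y) + abs(z) <= n:
                    count += 1
    return count


def tree_nodes_within(n):
    """Nodes within n hops of the root of a complete binary tree."""
    return 2 ** (n + 1) - 1


for n in (1, 2, 5, 10, 20):
    mesh = mesh_nodes_within(n)
    tree = tree_nodes_within(n)
    print(f"n = {n:2d}: mesh {mesh:6d} (mesh / n^3 = {mesh / n**3:.2f}), "
          f"binary tree {tree}")
```

The mesh count stays within a constant multiple of n^3, while the tree count doubles with every extra hop and so cannot keep fitting into a region whose volume grows only as n^3.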

The implication is that a lot of existing multiprocessor topologies simply don't scale. Hypercubes, binary trees, etc., all have the problem that communication times start to dominate as the machine sizes are scaled up.
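A rough way to see the problem for hypercubes is the following back-of-the-envelope Python sketch. It assumes each node occupies at least one unit of volume, so N nodes cannot all fit inside a ball of radius smaller than (3N / 4 pi)^(1/3), and it uses the fact that any two nodes of a d-dimensional hypercube are at most d hops apart. The function names and the unit-volume assumption are illustrative, and the result is only a crude lower bound.

```python
import math


def min_layout_radius(num_nodes, volume_per_node=1.0):
    """Radius of the smallest ball whose volume can hold num_nodes nodes,
    assuming each node needs at least volume_per_node of space."""
    total_volume = num_nodes * volume_per_node
    return (3.0 * total_volume / (4.0 * math.pi)) ** (1.0 / 3.0)


def hypercube_longest_link_lower_bound(dimension):
    """Crude lower bound on the longest wire in a hypercube laid out in 3-D.

    Some pair of nodes is at least min_layout_radius apart, yet is joined
    by a path of at most `dimension` links, so at least one link on that
    path must be this long."""
    num_nodes = 2 ** dimension
    return min_layout_radius(num_nodes) / dimension


for dimension in (10, 20, 30, 40):
    bound = hypercube_longest_link_lower_bound(dimension)
    print(f"{dimension}-dimensional hypercube, {2 ** dimension} nodes: "
          f"longest link at least {bound:.1f} node widths")
```

Since the longest link keeps growing with machine size, a hop over it cannot take constant time, which is the sense in which communication time comes to dominate.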

In fact, one can show that there is only one class of network topologies that is asymptotically optimal as the machine size is scaled up: namely, a 3-D mesh structure, where each processor is connected to a constant number of physically-nearby neighbors. The paper by Bilardi and Preparata gets into this in some depth, although they focus on a 2-D version. (Actually with a 3-D mesh there are concerns with heat removal, which we'll get into later. It turns out there are ways around them.)

One way to show the mesh is best is just by observing that with a mesh, you can simulate any alternative architecture with only a constant-factor slowdown, by using the mesh of processing elements to simulate all the elements (processors plus wires) of the alternative architecture.

Another way to see it is this: If you can perform a cycle of useful computation in the same time it takes information to travel a distance d, then if you have your signal go through a processing stage every unit d of distance, it won't retard that signal much (by a factor of about 2 only). So you might as well have useful processing elements spread through your machine at fairly constant density. It doesn't add much to the cost of the memory, and provides opportunities for more computation all along the way, wherever a signal might travel, so that you're never wasting that much time just doing dumb communication.

Anyway, today people are building so-called "processor-in-memory" chips where a hunk of DRAM and a processor are together on the same chip. This is a great advance, as it allows enormous bandwidth and low latency between the DRAM and the processor. (If we're lucky, we may have a guest lecture later in the semester from an MIT researcher who has developed a super-fast architecture that leverages this effect.) It simply doesn't increase cost very much to combine the two. We can imagine a future generation of computers where, whenever you buy a new SIMM card to add RAM to your machine, it comes with its own embedded processors that can rapidly access that RAM in local fashion, and serve as powerful new multiprocessors for your machine. Of course, operating systems will have to change somewhat to accommodate this architecture, but probably not too much, since they already accommodate multiprocessors without associated RAM.

As processor speeds increase, the speed-of-light limit will cause communication distances to shrink, and the idea of intermeshed processors and memory will become increasingly critical.

We'll talk some more about the implications of physics for computer architecture in part 6 of the course.