All of the topics received positive scores, ranging from 5 (for physics-based theoretical models of computation) up to 26 (for quantum computing), except for one topic: Thermodynamic constraints on computing received a score of negative 11!! I take it that thermodynamics is not a very popular topic around here.
Nevertheless, thermodynamics provides some very fundamental constraints on computing which we cannot ignore. We'll see that it will become especially vital to pay attention to these concerns if the scale of computer components is to come anywhere close to the atomic level. And in fact, we'll see later on in the course that thermal concerns greatly impact the possible efficiency of computers even today. And we will spend several weeks later in the course introducing a concept called reversible computing, which is tailored towards maximizing computational efficiency in the face of thermodynamic constraints.
So, I'm not going to skip or cut short our material on thermodynamics, despite its apparent unpopularity, because if I did, I'd be depriving you of what I think are some of the most important pieces of information you'll get out of this course. But don't worry: the amount and depth of thermodynamics that is required to understand the concepts we'll need is not very great, so hopefully we won't have too much trouble getting through this material.
So, to review, in the past lectures we have surveyed a number of fundamental physical limits on information: How dense it can be packed, how fast it can flow, and how frequently it can be updated.
But there is another critical physical constraint on information that arises out of some basic properties of physics, which manifest themselves in the laws of thermodynamics. This constraint says that, in a certain important sense, information can never be destroyed.
The basic property of physics that leads to this is something I'll now explain: The reversibility of physics.
Now, given that initial state, let's consider what the state of the system might be at some later time t2. In general, it will be different. What might it be? In the most general terms, we conceive that there is a relation between the state at the earlier time and the state at the later time. We can call this the "transition relation." If you have had discrete math, you'll remember that a relation between two variables can be described as a list of all the allowed combinations (s1, s2) of values the variables could have. Or, you can draw it as a directed bipartite graph showing which states at t1 can be paired up with which states at t2.
But now, all of our deepest theories of physics share a special property: they are deterministic. What this means is that this relation between states at t1 and states at t2 is a special kind of relation, namely it is a function f, sometimes called the "transition function from t1 to t2". What this means is that for each state s at t1 there is exactly one state f(s) at t2 that is related to it; or in other words the state at t1 determines what the state will be at t2, thus the word "deterministic."
You'll sometimes hear that quantum mechanics is non-deterministic, but actually this is only true if you look at the wrong level, at the individual states, when instead you should be looking at the wavefunction, since that's the fundamental entity underlying any quantum mechanical system. It's important to emphasize that quantum mechanics is actually perfectly deterministic at the level of the wavefunction. At least, that statement has been shown to be perfectly consistent with all the observed phenomena. It's very interesting to understand how the apparent non-determinism of quantum mechanics naturally arises out of the deterministic evolution of the wavefunction, but I'll reserve that for later discussion, or for the section on quantum computing.
Anyway, why is physics deterministic? One way to see this is by looking back at the fundamental dynamical laws through which physics is described. These laws can all be framed in the form of differential equations which show how physical state variables change over time. For example, if x is a variable representing the complete state of the system (a closed system), then the general form is partial(x)/partial(t) = g(x,t), where g is some function of the current state and perhaps the time.
I have a slide here with an example, which is the Schroedinger equation of quantum mechanics (p. 378 in my manuscript). Don't worry about all the gory details; just notice that over here on the right is a term giving the rate of change of the wavefunction with respect to time, and the left side is just a big expression computing a function (our g) of the wavefunction at the current time.
Anyway, whatever g is, it unambiguously tells you the system's trajectory going forwards through its state space. If you wanted to calculate the change in state dx over some infinitesimal time interval dt, all you have to do is calculate g(x,t) and multiply by dt. To get the total change in state over some non-infinitesimal time delta t, you just have to integrate g(x(t),t) dt from t to t+delta t. You have to keep track of the fact that the state might be changing over this interval, because that may feed back in to affect the rate of change, but this can be done straightforwardly in a numerical integration. (Evaluating these integrals analytically is more difficult, in fact it's uncomputable in general - nevertheless, the integral is still well-defined.)
Anyway, the point is that the state at t1 completely and unambiguously determines the state at t2.
However, there's something very important to note about all this. There's absolutely nothing in this differential equation picture to prevent these intervals dt and delta t from being negative. That is, there's nothing to prevent you from calculating how the state will change going backwards in time, from t to t-delta t, just as easily as calculating the states going forwards. So, physics is not only deterministic, it's reverse-deterministic, or, deterministic looking backwards in time.
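To make the negative-dt point concrete, here is a minimal numerical sketch in Python. The choice of g (a frictionless harmonic oscillator), the Runge-Kutta stepper, the step size, and all the function names are illustrative assumptions, not anything taken from the slides. The sketch integrates forward from t1 to t2, then runs the very same routine with a negative step and gets the starting state back, up to numerical round-off.

```python
import math

def g(state, t):
    """Rate of change of the state; here a frictionless oscillator (illustrative choice)."""
    x, v = state
    return (v, -x)

def rk4_step(state, t, dt):
    """Advance the state by one Runge-Kutta step of size dt (dt may be negative)."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = g(state, t)
    k2 = g(shifted(state, k1, dt / 2), t + dt / 2)
    k3 = g(shifted(state, k2, dt / 2), t + dt / 2)
    k4 = g(shifted(state, k3, dt), t + dt)
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def evolve(state, t, dt, steps):
    """Integrate g over steps*dt, re-evaluating g as the state changes."""
    for _ in range(steps):
        state = rk4_step(state, t, dt)
        t += dt
    return state, t

start = (1.0, 0.0)                                   # state at t1 = 0
later, t2 = evolve(start, 0.0, 0.001, 5000)          # forwards to t2 = 5
recovered, t1 = evolve(later, t2, -0.001, 5000)      # backwards, same code, negative dt
print("state at t2:     ", later)
print("recovered at t1: ", recovered)
```

Nothing in evolve knows or cares which direction time is running; only the sign of dt changes.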
That means, that not only do we have a special kind of transition relation, namely a transition function, but we have a special kind of function, namely an invertible function. What does this mean? It means that for any possible output s2 of the function, there is one and only one input s1 that generates that output (that is, exactly one s1 such that f(s1)=s2). Looking back at the transition graph, that means no two arrows can converge on the same point.
It is this property, the invertibility of the transition function, that is referred to as reversibility. Physics has this property; it is an unavoidable consequence of the fact that physical laws can be described using these time-differential equations.
To sum up, here's another slide. Determinism means that a state trajectory cannot split to go to two different states at some later time; reversibility or reverse-determinism means that state trajectories cannot merge so that two different states converge on the same state at some later time. You put the two together, and you find that state trajectories never intersect at all, although they could potentially twist over and around and under each other in very complicated ways. But anyway, that's the reversibility of physics.
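If you like to see this sort of thing spelled out, here is a small Python sketch of the two properties for a toy system with three states. The state names and the three example relations are made up for illustration. A relation is stored as a set of (s1, s2) pairs, exactly the "list of allowed combinations" picture from before.

```python
def is_deterministic(relation, states):
    """True if every state at t1 is paired with exactly one state at t2 (no splitting)."""
    return all(sum(1 for (s1, _) in relation if s1 == s) == 1 for s in states)

def is_reversible(relation, states):
    """True if every state at t2 is reached from exactly one state at t1 (no merging)."""
    return all(sum(1 for (_, s2) in relation if s2 == s) == 1 for s in states)

states = {"a", "b", "c"}

# Hypothetical transition relations on the same three states.
permutation = {("a", "b"), ("b", "c"), ("c", "a")}            # never splits or merges
merging = {("a", "b"), ("b", "b"), ("c", "a")}                # a and b merge onto b
splitting = {("a", "b"), ("a", "c"), ("b", "a"), ("c", "a")}  # a splits to b and c

for name, rel in [("permutation", permutation), ("merging", merging), ("splitting", splitting)]:
    print(name, is_deterministic(rel, states), is_reversible(rel, states))
```

Only the permutation passes both checks, which is the finite-state version of saying that trajectories never intersect.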
Now, I'm going to show that reversibility can be seen to (sort of) imply the second law of thermodynamics.
Now, the second law of thermodynamics says that the entropy of a closed system cannot decrease over time. We saw earlier that entropy can be considered to be the log of the number of possible states of the system, and can be equated with the information content of the system.
It is a simple consequence of reversibility that entropy cannot decrease over time. Suppose you start out at time t1 with the system having some number N1 of possible states, and you let that system evolve to time t2, at which it has N2 states. Well, if N2 were less than N1, then that would mean that at some point two different states were merging onto one, which is forbidden by reversibility.
Of course the situation is symmetrical with respect to time, so there's a similar argument using the determinism of physics that says that entropy cannot increase over time either. Well, this is absolutely correct, if the "states" we're talking about are the different possible wavefunctions for the system, which is what evolves deterministically in quantum mechanics.
But, waves that are initially concentrated have a tendency to spread out, and so the number of states that have non-zero amplitude increases over time. This is especially true in quantum mechanics because the space of possible wavefunctions is many-dimensional and has many directions the wave can spread out in. So apparent entropy, if measured in terms of the number of discrete states with non-zero amplitude, tends to increase as you progress from an initial situation having only a few high-amplitude states. There's nothing in the differential equations to prevent the waves from converging, but it's just that there are so many ways they can spread out, that a system has to be very highly constrained in order to prevent this spreading and make the waves re-converge. This is what happens in well-isolated systems that exhibit interference effects, and it's what happens in quantum computing.
You could say entropy is decreasing in these situations, but I think it's preferable to define entropy, at this level, as just the log of the number of possible states whose amplitudes you can't re-converge, or in computational terms, as the amount of information that can't be (quote) "uncomputed." We'll see what "uncomputing" means in a little while. Anyway, if you define entropy in this way, as just being the information that you can't ever get rid of, then by definition it can only increase.
Anyway, I don't want to belabor this point too long... Suffice it to say that entropy is information which you can't get rid of; we expect it to exist at the wavefunction level because of the reversibility of quantum mechanics, and we expect it to exist at the level of discrete quantum states because waves tend to spread out in such a way that some parts of the wave spread beyond our reach or control, to cover increasingly many possible states, in such a way that we can never effectively bring them back together.
Now let's see the implications of all of this for computing.
We refer to a slide which is reproduced on p. 44 of my manuscript.
Consider a bit in a computer, in isolation, which holds some value, 0 or 1. In addition to this bit, the total system includes the rest of the computer and its environment, including the vibrational states of its atoms, etc. We presume that regardless of the value of the bit, there is some number N of possible states of the rest of the system, and that the given bit is not correlated with other computational bits. So anyway, the total number of states of the combined system is 2N.
Now, we want to perform a reliable "erasure" operation, which we define as an operation in which all but an astronomically tiny fraction of the possible system states are taken to states where the computational bit is a zero.
Well, unfortunately, because of the reversibility of physics, whatever mechanism the erasure operation uses, it must be one-to-one. So there must still be 2N possible states after the operation. Since there is only 1 possible state of the computational bit (namely 0), there must be twice as many possible states as before (namely 2N) for the rest of the system other than that bit.
Unless this is achieved by moving the bit to some other controlled part of the system, the additional size of the state space must be achieved by increasing the entropy of the uncontrolled, thermal part of the system. Since the number of states of the rest of the system has doubled, its entropy has therefore increased by ln 2 nats, or 1 bit.
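The counting argument can be checked on a toy model. In the Python sketch below, the value N = 4 and the particular erase map are assumptions chosen only to keep the example small; any one-to-one map that forces the bit to 0 would do. The point is that the bit's old value has to go somewhere, and here it is pushed into the state of the rest of the system.

```python
import math

N = 4  # hypothetical number of states of the rest of the system

def erase(bit, env):
    """Toy one-to-one erasure: the bit becomes 0, its old value is pushed into env."""
    return 0, 2 * env + bit

before = [(bit, env) for bit in (0, 1) for env in range(N)]   # 2N joint states
after = [erase(bit, env) for bit, env in before]

env_states_after = {env for _, env in after}
entropy_increase_nats = math.log(len(env_states_after)) - math.log(N)
entropy_increase_bits = entropy_increase_nats / math.log(2)

print("distinct joint states before:", len(set(before)))
print("distinct joint states after: ", len(set(after)))
print("states of the rest of the system after:", len(env_states_after))
print("entropy increase of the rest, in bits:", entropy_increase_bits)
```

The number of joint states stays at 2N, the bit always ends up 0, and the rest of the system goes from N to 2N states, an increase of ln 2 nats.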
Note, however, that this argument would not go through if the bit being erased were correlated with some other bit - perhaps one of the bits was produced by copying the other - because then the number of thermal states would not need to increase; the copy of the bit would still contain the variability that was lost when we erased our bit.
However, in counterpoint to that comment, note that physical locality means that we cannot avoid the entropy generation if the copy is far away in space (too far to access during the course of the erasure operation), because then there would be no way for the information in the copy to have any influence on the detailed mechanics of what happens when we erase the bit to produce the new entropy, so it is as if we had a single isolated bit again - we do not need to consider the remote copy of the bit as a relevant part of the system over the short time interval of the erasure operation.
But in any case, note that this two-bit example is merely suggestive that the entropy generation can be avoided; it does not give a particular mechanism for doing so. Later we will see some particular mechanisms and show that indeed all computing can be performed using such mechanisms, and may therefore in principle avoid all entropy generation that would otherwise be associated with information erasure. (But not without a cost which we will see later.)
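Just to make the counting in the two-bit case concrete, here is a Python sketch. The XOR-style uncopy operation is a standard textbook illustration that I am assuming here; it is not necessarily the mechanism presented later in the course, and it presumes the copy is close enough to take part in the operation. The map is one-to-one on all four joint states of the two bits, and on the correlated pairs it clears the second bit without needing any extra states elsewhere.

```python
def uncopy(a, b):
    """Reversible toy operation: XOR bit a into bit b (the map is its own inverse)."""
    return a, a ^ b

all_pairs = [(a, b) for a in (0, 1) for b in (0, 1)]
images = [uncopy(a, b) for a, b in all_pairs]

correlated = [(0, 0), (1, 1)]                      # second bit is a copy of the first
cleared = [uncopy(a, b) for a, b in correlated]    # second bit is now always 0

print("one-to-one on all joint states:", sorted(images) == sorted(all_pairs))
print("correlated pairs before:", correlated)
print("correlated pairs after: ", cleared)
```

There are two possible joint states before and two after, so the variability survives in the first bit and the thermal part of the system is left alone.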
So when we erase a bit and add at least 1 bit (k ln 2, where k = Boltzmann's constant = 1 natural log unit of entropy) of entropy to an environment at temperature T, that means we must dissipate at least energy kT ln 2. This was first pointed out by Rolf Landauer in one of your readings for this week.
Actually, not all this energy need be permanently dissipated since we can use our computer as the heat source for a heat engine, and recover some of this energy as work. The amount that is permanently dissipated depends only on the temperature of the heat sink.
The coolest large reservoir in the immediate surroundings of the planet Earth is the radiation field of the cosmic microwave background, at about 3 K. Plugging this in, we get about 3x10^-23 Joules of energy dissipation per bit erased. If we aren't beaming our heat directly into space with a microwave dish, then we are limited to the atmosphere at about 300 K, for about 3x10^-21 Joules per bit erased. This may sound small, but it's only about 5 orders of magnitude less than what happens in the smallest logic circuits today. If the steady progression of Moore's law continues, it will not be very many more decades before we hit this limit.
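If you want to check the arithmetic, here is the kT ln 2 calculation as a short Python sketch. The function name is hypothetical, and the two temperatures are the rounded values of 3 K and 300 K used above.

```python
import math

BOLTZMANN_K = 1.380649e-23  # Boltzmann's constant in joules per kelvin (SI value)

def landauer_limit_joules(temperature_kelvin):
    """Minimum energy dissipated per bit erased into a heat sink at the given temperature."""
    return BOLTZMANN_K * temperature_kelvin * math.log(2)

cmb = landauer_limit_joules(3.0)       # cosmic microwave background, about 3 K
room = landauer_limit_joules(300.0)    # atmosphere, about 300 K

print(f"sink at   3 K: {cmb:.2e} J per bit erased")
print(f"sink at 300 K: {room:.2e} J per bit erased")
```

Both results come out just under 3 in front of the power of ten, which is where the rounded figures come from.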
[One caveat to this: If we had a nearby black hole, we could do better. Black holes are *very* cold, colder than the cosmic background by many, many orders of magnitude, so only a very tiny energy is needed to increase their entropy by a large amount. You could theoretically lower your heat energy into the hole in such a way that you recover virtually all of the energy as work, leaving only an extremely tiny fraction of it to enter the hole and increase its surface area (and entropy) by a tiny amount, but enough to hold the entropy you are trying to get rid of. Even better, as you add energy to the hole, it becomes even colder, and a better heat sink! And anyway, we saw from Bekenstein that black holes have maximum entropy for any object of their size and energy. So they are in a sense the ideal "trash compactor" for getting unwanted, garbage entropy out of the way with minimum energy expense.]
Anyway, the upshot of all this is that there are some hard, fundamental limits to energy dissipation for bit erasure unless you can come up with a mechanism that erases the bits in a controlled way using other bits with which they are redundant (correlated). Later in the class we'll see how this can be done.
This concludes our first two weeks on fundamental physical limits. In Monday's lecture we'll discuss the more near-term concerns in scaling semiconductor technology.