CIS 4930.1194X/6930.1078X Spr.'00
Lecture 7 (Jan. 26) Notes:
Semiconductor Technology Scaling Laws

(This lecture actually started with a review of CMOS logic gates, but I have moved the text on that to the lecture 6 file, because it fits there better logically.)

In the previous lecture, we reviewed, for the benefit of those students without so much EE background, the structure and function of the dominant device used in computing today - the MOSFET transistor.  Today we'll carry through a rough first-order analysis of how the electrical characteristics and the performance of that device (and entire chips based on it) scale as you shrink its size.

Looking at the history of semiconductor technology, and at the ITRS roadmap projections for the next 14 years, we see that length scales for circuit elements (such as MOSFET channel length) have decreased by ~12%/year, that is, a shrink factor of ~1.14/year.  (Each year 1/length increases by ~14%.)  Moreover, as a secondary effect, die (chip) diameters have increased by ~2.3%/year.
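The relationship between the quoted percentages can be checked with a few lines of arithmetic. The following is only a rough sketch: the constant and function names are made up for illustration, and the 14-year horizon is simply the roadmap span mentioned above.

```python
# Approximate annual trends quoted in the notes.
LENGTH_SHRINK = 0.12   # feature lengths fall by ~12% per year
DIE_GROWTH = 0.023     # die diameter grows by ~2.3% per year


def inverse_length_factor(shrink=LENGTH_SHRINK):
    """Yearly growth factor of 1/length when length falls by `shrink`."""
    return 1.0 / (1.0 - shrink)


def remaining_length(years, shrink=LENGTH_SHRINK):
    """Fraction of the original feature length left after `years` years."""
    return (1.0 - shrink) ** years


def die_diameter_growth(years, growth=DIE_GROWTH):
    """Factor by which the die diameter has grown after `years` years."""
    return (1.0 + growth) ** years


print(f"1/length grows by a factor of {inverse_length_factor():.3f} per year")
print(f"length after 14 years: {remaining_length(14):.3f} of today's")
print(f"die diameter after 14 years: {die_diameter_growth(14):.3f} x today's")
```

A 12% shrink per year corresponds to 1/0.88, which is where the ~1.14 figure comes from.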

To make our analysis more concise, we will introduce the following abbreviated notation.  We will say x~y to mean that quantity x scales approximately (to first order) with quantity y.  We will use \/ (this would be a down-arrow if there were one in HTML) to denote a quantity that scales in proportion to channel length (that is, down by ~12%/year as long as present trends hold).  We will use /\ (up-arrow) to denote a quantity that scales as 1/length (up by ~14%/year).  Similarly ^ (small up-arrow) will mean an increase with die size (~2.3%/year), and v (small down-arrow) the corresponding decrease.  We will use 1 to denote a constant quantity that does not scale with technology shrinkage.
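One way to make this notation mechanical is to track, for each quantity, the net number of /\ arrows and the net number of ^ arrows, so that multiplying quantities adds the counts and dividing subtracts them. This is only an illustrative sketch; the class name `Scale` and its fields are invented here and are not part of the notes.

```python
from dataclasses import dataclass

INV_LENGTH_FACTOR = 1.14  # one /\ : up ~14% per year
DIE_FACTOR = 1.023        # one ^  : up ~2.3% per year


@dataclass(frozen=True)
class Scale:
    """Scaling behaviour written as arrow counts (hypothetical helper)."""
    inv_len: float = 0  # net number of /\ arrows (negative means \/)
    die: float = 0      # net number of ^ arrows (negative means v)

    def __mul__(self, other):
        return Scale(self.inv_len + other.inv_len, self.die + other.die)

    def __truediv__(self, other):
        return Scale(self.inv_len - other.inv_len, self.die - other.die)

    def per_year(self):
        """Approximate yearly growth factor implied by the arrow counts."""
        return INV_LENGTH_FACTOR ** self.inv_len * DIE_FACTOR ** self.die


DOWN = Scale(-1, 0)      # \/
UP = Scale(1, 0)         # /\
DIE_UP = Scale(0, 1)     # ^
DIE_DOWN = Scale(0, -1)  # v
ONE = Scale(0, 0)        # 1 (does not scale)

# Example: a field V/L with constant voltage and shrinking length.
print(ONE / DOWN == UP)
```

Note that `per_year()` treats \/ as 1/1.14 per year, which is about 0.877 and so only approximately the 12% shrink quoted above.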

The Need for Voltage Scaling

The first major lesson we'll study about scaling semiconductor technology is that one cannot scale lengths down alone while leaving voltages constant.  Recall that electric field strength is defined by E = V/L (voltage per unit length).  If lengths decrease ( L ~ \/ ) and voltages stay constant (V ~ 1), then E ~ 1 / \/ ~ /\; electric field strength goes up.  This means that the forces experienced by charged particles (F~E) within the device go up.  As forces increase, eventually something has to break down.  For example, atoms in insulators will become ionized and the insulators will no longer insulate (so circuits will short out).

In addition, in MOSFETs there is a phenomenon called "punch-through" that occurs if voltages are too high relative to channel length.  Suppose there is a high drain-to-source voltage.  If the E field strength in the channel exceeds the built-in E field of the pn junction between the source and the channel, the junction will no longer provide sufficient force to hold charge carriers back, and they will flow from source to drain regardless of whether the gate is high or not.  In other words, the transistor won't be able to turn off in the face of such high source-drain voltages.

For reasons such as this, we must have V ~ \/ (signal voltages decreasing in proportion to length).
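A small numerical illustration of the argument so far, using E = V/L. The starting channel length and supply voltage below are hypothetical round numbers chosen only to make the ratios visible; they are not taken from the notes.

```python
def field(voltage_v, length_m):
    """Average electric field E = V/L in volts per meter."""
    return voltage_v / length_m


# Hypothetical starting point (assumed values, for illustration only).
L0 = 0.25e-6   # channel length in meters
V0 = 2.5       # signal voltage in volts

shrink = 0.88 ** 5   # five years of ~12%/year length reduction

e_start = field(V0, L0)
e_constant_voltage = field(V0, L0 * shrink)          # V ~ 1, L ~ \/
e_scaled_voltage = field(V0 * shrink, L0 * shrink)   # V ~ \/, L ~ \/

print(f"constant V: field grows by x{e_constant_voltage / e_start:.2f}")
print(f"scaled V:   field changes by x{e_scaled_voltage / e_start:.2f}")
```

With the voltage held fixed the field nearly doubles in five years, while scaling the voltage along with the length keeps it unchanged.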

In order for a gate signal to be able to turn a transistor on, the threshold voltages Vt have to eventually decrease along with V, and go \/.

Decreasing threshold voltages raises a problem because as Vt approaches the thermal voltage (26 mV at room temperature), electrons will be increasingly able to jump over the threshold barrier thermally, and therefore transistors will be "leaky", unable to turn off thoroughly.  If leakage gets high enough, there is not enough discrepancy between "on" and "off" currents through the transistor to enable it to reliably distinguish between 1 and 0 inputs, and the logical functionality of the circuit will be impaired.
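The 26 mV figure is the thermal voltage kT/q. The sketch below computes it from the standard values of the Boltzmann constant and the elementary charge; the function name is made up for illustration, and "room temperature" is assumed to mean roughly 300 K.

```python
BOLTZMANN_J_PER_K = 1.380649e-23       # Boltzmann constant k
ELEMENTARY_CHARGE_C = 1.602176634e-19  # electron charge magnitude q


def thermal_voltage(temp_kelvin):
    """Thermal voltage kT/q in volts at the given absolute temperature."""
    return BOLTZMANN_J_PER_K * temp_kelvin / ELEMENTARY_CHARGE_C


for temp in (300.0, 150.0, 77.0):
    print(f"T = {temp:5.1f} K  ->  kT/q = {thermal_voltage(temp) * 1000:.1f} mV")
```

Because kT/q is directly proportional to T, lowering the temperature is the only way to lower this floor.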

So, eventually, we must have T ~ V ~ \/ as well.  (Decreasing temperature.)  This has a lot of effects on devices though - such as eventually leading to superconducting behavior in some materials - so this case must be analyzed separately.  For now, we will ignore it.

And because voltages are decreasing, we need to thin the gate oxides as well, in order to keep enough charge in the channel to keep performance improving as rapidly as possible.  In general, it's really not just channel lengths, but every length, width, and height of every element of the circuit that needs to decrease, in order to keep everything proportioned correctly.  (The width of the whole die is another matter.)

Scaling of Electrical Properties of Structures

Let's look at how various electrical properties of structures - such as their resistance and capacitance - scale as we decrease lengths.

Resistance R along a wire of length l, width w, and thickness t scales as R ~ l/(wt) ~ \/ / \/ \/ ~ /\.  That is, resistance scales up as lengths shrink. For a cross-chip wire, l does not go \/ but rather ^, so R ~ ^ / \/ \/ ~ /\ /\ ^. Of course, designers tend to use thicker wires to go cross-chip, to alleviate this problem, but note that you cannot have as many thick wires as thin ones. So most low-resistance wires have to stay short.

Capacitance C between parallel plates of width w, length l, and separation s scales as C ~ wl/s.  All the lengths go \/, so C ~ \/ \/ / \/ ~ \/.  Capacitance scales down with the L scale.  Capacitance per unit length for a long wire goes as 1 (constant). Capacitance for a cross-chip wire goes ^. Capacitance per unit area goes as /\.
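The resistance and capacitance rules above can be checked numerically by shrinking every dimension by the same factor. This is a sketch in arbitrary units: the function names are illustrative, resistivity and permittivity are set to 1, and the growth factor used for the cross-chip wire is an arbitrary assumed value.

```python
def wire_resistance(length, width, thickness, resistivity=1.0):
    """R = resistivity * l / (w * t), in arbitrary units."""
    return resistivity * length / (width * thickness)


def plate_capacitance(width, length, separation, permittivity=1.0):
    """C = permittivity * w * l / s, in arbitrary units."""
    return permittivity * width * length / separation


s = 0.5   # shrink every local dimension to half its size
g = 1.2   # assumed growth of a cross-chip wire's length

r_ratio = wire_resistance(s, s, s) / wire_resistance(1, 1, 1)
c_ratio = plate_capacitance(s, s, s) / plate_capacitance(1, 1, 1)
r_cross_ratio = wire_resistance(g, s, s) / wire_resistance(1, 1, 1)

print(f"local wire R:      x{r_ratio:.2f}")       # grows as 1/s
print(f"plate C:           x{c_ratio:.2f}")       # shrinks as s
print(f"cross-chip wire R: x{r_cross_ratio:.2f}")  # grows as g / s**2
```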

Charges and Currents

The change in charge associated with charging a given circuit element up by voltage V is given by Q ~ CV ~ \/ \/. It makes sense that it goes with the square of length because excess charge is a surface phenomenon so should scale as area A ~ L^2 ~ \/ \/ in order for the surface charge density to stay constant.

We already said that electric field strength E ~ V/L is roughly constant (in transistors), and current density J ~ E/p (where p is the constant resistivity of a given material), so current densities in conductors also remain constant. (This is good, because if they were increasing it would cause reliability problems; an intense flow of electrons would cause rapid "electromigration" of atoms out of their original positions and transistors would wear out over time.)

Since wires are getting narrower and thinner as \/, their cross-sectional area A goes as \/ \/, so total current through a given wire goes as \/ \/.

In turned-on transistors, current flow is restricted to the surface of the channel (just below the gate, in the region where charge carriers have been attracted by the field from the gate), and total current I ~ Q/t, the charge in the channel divided by the time for it to cross the channel. We already know that Q ~ \/ \/. As for t, it can be expressed as l/v where v is the velocity of the charge carriers, and l is the channel length. Unfortunately v in today's devices cannot increase any more, because the charge carriers are already moving at the maximum "saturation" velocity that is possible in silicon (at normal temperatures), namely about 0.1 micron per picosecond (200,000 mph, or about 1/3 of 1/1000th of the speed of light). So t ~ l/v ~ \/ / 1 ~ \/; transition times are going down (but not as quickly as they would without velocity saturation).
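The unit conversions for the saturation velocity are easy to verify. The sketch below starts from the 0.1 micron per picosecond figure given above; the 0.25 micron channel length in the usage line is a hypothetical value chosen only as an example.

```python
V_SAT_M_PER_S = 0.1e-6 / 1e-12     # 0.1 micron per picosecond, in m/s
METERS_PER_MILE = 1609.344
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

v_sat_mph = V_SAT_M_PER_S / METERS_PER_MILE * 3600.0
fraction_of_c = V_SAT_M_PER_S / SPEED_OF_LIGHT_M_PER_S


def transit_time_ps(channel_length_um, v_sat_um_per_ps=0.1):
    """Time t = l/v for a carrier to cross the channel, in picoseconds."""
    return channel_length_um / v_sat_um_per_ps


print(f"v_sat = {V_SAT_M_PER_S:.3g} m/s = {v_sat_mph:,.0f} mph")
print(f"v_sat / c = 1/{1.0 / fraction_of_c:,.0f}")
print(f"transit time for a 0.25 micron channel: {transit_time_ps(0.25):.2f} ps")
```

The result comes out a little over 200,000 mph and close to 1/3000 of the speed of light, consistent with the rounded figures in the text.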

So the upshot is that on-current I ~ Q/t ~ \/ \/ / \/ ~ \/. That is, the current per transistor is going down with the length scale.

The effective on-resistance R ~ V/I of transistors thus goes as \/ / \/ ~ 1; i.e. roughly stays constant. For your reference, the typical on-resistance of a minimum-size MOSFET transistor tends to hover in the neighborhood of 10 kohms.
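The chain of reasoning from charge to on-resistance can be followed with relative numbers. In this sketch every quantity is normalized to 1 before scaling, the function name is invented for illustration, and the carrier velocity is assumed to stay pinned at its saturation value.

```python
def transistor_figures(s):
    """Relative device figures after lengths and voltages are multiplied by s.

    Every quantity is normalized to 1.0 at s = 1.
    """
    capacitance = s                  # C ~ wl/s ~ \/
    voltage = s                      # V ~ \/
    charge = capacitance * voltage   # Q ~ CV ~ \/ \/
    transit_time = s / 1.0           # t ~ l/v with v fixed at saturation
    current = charge / transit_time  # I ~ Q/t ~ \/
    on_resistance = voltage / current  # R ~ V/I ~ 1
    return {
        "Q": charge,
        "t": transit_time,
        "I": current,
        "R_on": on_resistance,
    }


for name, value in transistor_figures(0.5).items():
    print(f"{name:5s} x{value:.2f}")
```

Halving all lengths and voltages quarters the charge, halves the transit time and the current, and leaves the on-resistance unchanged.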

Scaling of Performance Characteristics

Those of you who have had some electrical engineering know that there is a characteristic delay associated with an element having resistance R and capacitance C of t ~ RC.

(There is another kind of delay related to inductance and resistance, scaling as L/R, but since resistances in transistors are fairly high, this is not a significant source of delay in digital circuits.)

For a given piece of wire, C ~ \/, R ~ /\, so RC ~ 1. RC delays for wires therefore remain roughly constant over time.

Note, however, for a cross-chip minimum-width wire, that RC ~ /\ /\ ^ ^, up by about 36% per year. If you want a cross-chip wire's delay to stay constant, you can't keep making it thinner. So the number of cross-chip wires with some given delay that you can have across a chip essentially doesn't improve over time. (Except through non-scaling-related improvements, such as using better conductors such as Cu instead of Al for the wires.)

In transistors, the situation is different, as we saw above. Rather than increasing as it does in a wire, R stays roughly constant (though it is still much higher than in a short wire), and C ~ \/, so the RC delay for transistors is \/.

Another way to see the same thing is to look at the charge Q on the load the transistor is charging, and the current I through the transistor, and the delay is just t ~ Q/I. We have Q ~ \/ \/, I ~ \/, so t ~ \/. Gate delays are decreasing as \/.

For clocking a small, locally-connected circuit, clock frequencies f ~ 1/t ~ /\. Frequency goes up by about 14%/year.

Looking at the performance of an entire chip, the number of transistors per area goes as /\ /\, and the frequency goes as /\, and the area of the whole chip goes as ^ ^, so the total raw performance, in terms of transistor-operations per chip per unit time, goes as /\ /\ /\ ^ ^, or 3 factors of 1.14 times 2 factors of 1.023, or up by 55%/year, which is roughly a doubling every 18 to 19 months, in line with the Moore's Law prediction.
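The compounding behind the 55%/year figure, and the doubling time it implies, can be reproduced directly from the two yearly factors. This is a quick arithmetic sketch with made-up variable names, using the rounded 1.14 and 1.023 factors from the notes.

```python
import math

INV_LENGTH = 1.14   # yearly growth of 1/length
DIE = 1.023         # yearly growth of die diameter

density = INV_LENGTH ** 2   # transistors per unit area, /\ /\
frequency = INV_LENGTH      # local clock frequency, /\
die_area = DIE ** 2         # chip area, ^ ^

raw_performance = density * frequency * die_area
doubling_years = math.log(2) / math.log(raw_performance)

print(f"raw performance grows by x{raw_performance:.3f} per year")
print(f"doubling time: {doubling_years * 12:.1f} months")
```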

This is (within the margin of error of everything we have done) the same rate at which the performance ratings of microprocessors have been increasing in recent years (58%/year, see the intro to Hennessy & Patterson), which indicates that computer architects have in recent years been successfully finding ways to harness all of the new raw power that is becoming available. (Note, however, that processor performance is not improving significantly faster than is raw performance. The architects need to thank the semiconductor people for enabling the majority of their recent gains.)

Increasing Dominance of Interconnect Delay

Note that since the RC delay through the wires in a given structure is (to first order) not decreasing, we will find that, over time, a smaller and smaller fraction of the wires in a structure will be short enough to keep up with the increasing frequencies.

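In fact, if we let x represent the length of a wire whose RC delay scales with gate delay, expressed as a multiple of channel length, we find that x decreases as the square root of \/.  (Such a wire has R ~ x /\ and C ~ x \/, so RC ~ x^2, and this must track the gate delay, which goes \/.)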

So for a two-dimensional circuit, the total number of circuit elements reachable by wire within 1 gate delay scales down as \/. (In a 3-D circuit, it would scale as \/^(3/2), that is, faster than in the 2-D case, but the total number of reachable elements would start out higher.)
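The reach argument can be put into numbers. The sketch assumes, as in the first-order analysis above, that a wire x channel lengths long has an RC delay proportional to x^2 and that gate delay is proportional to channel length; the function names are illustrative only.

```python
import math


def relative_reach(length_scale):
    """Relative change in x, the wire length (in channel lengths) whose
    RC delay keeps pace with gate delay, when the channel length is
    multiplied by `length_scale`.  From RC ~ x**2 and gate delay ~ length.
    """
    return math.sqrt(length_scale)


def reachable_elements(length_scale, dims):
    """Relative number of elements within reach in a `dims`-dimensional layout."""
    return relative_reach(length_scale) ** dims


shrink = 0.25  # channel length reduced to a quarter
print(f"reach x:             x{relative_reach(shrink):.3f}")
print(f"2-D elements in reach: x{reachable_elements(shrink, 2):.3f}")
print(f"3-D elements in reach: x{reachable_elements(shrink, 3):.3f}")
```

When the channel length drops to a quarter, the reach halves, the 2-D count drops to a quarter, and the 3-D count drops to an eighth.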

This underscores the increasing importance that mesh-style architectures, which emphasize local interconnections, will come to have over time. Smaller and smaller numbers of transistors will be able to be tied together into a single structure whose design can ignore the communication delays due to the wires between the transistors. Communication time will become a critical factor in the design of all but the simplest structures.

You can try to argue that you can alleviate this trend by using lower-resistance "fat" wires to cover long distances. But in a detailed analysis, you can show that the number of "fat, fast, long" wires capable of reaching relatively long distances within 1 clock cycle also scales as \/, because they must be increasingly fat compared to their length to keep their RC decreasing in proportion to gate delays, so their cross-sectional area per length scales as /\, and even wiring outwards in 3 dimensions there isn't room for more than \/ of such wires to extend from a given core circuit.

Besides which, no matter how fat they are, there is also the speed-of-light delay to contend with. As clock cycles shrink in proportion to lengths, you can't reach more than a constant number of devices even with signals traveling at the speed of light. 

Energy and Power

As we'll see in class Friday (and see p. 150 of my manuscript), the energy stored in a circuit, and the energy dissipated per cycle of operation in (ordinary irreversible) circuits, scales as CV^2. Per unit area, C ~ /\, so the per-area CV^2 ~ /\ \/^2 ~ \/. Since layer thicknesses also go \/, energy density (per volume) is constant, a good sanity check on our scaling analysis. (Increasing energy densities would imply increasing pressures, temperatures, and therefore the melting of our machine.)

Clock frequencies, however, are going /\, and so per-area power P=Ef ~ \/ /\ ~ 1. So the power dissipation per unit area (heat flux) remains roughly constant as the technology develops. This is a good thing, because if it were increasing, this would increase the demands on power supplies and cooling systems significantly, and these technologies are already a significant limiting factor on computing performance. (And they pretty much always have been.)

Chip area is increasing as ^ ^, so total power per die increases as ^ ^ (about 5%/year) as well.
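The energy and power bookkeeping can be done by adding exponents, where an exponent k means the quantity goes as (length)^k, so \/ is +1 and /\ is -1. This is a sketch of that bookkeeping with invented variable names; it assumes layer thickness scales with length, as in the rest of the analysis.

```python
# Exponent k means "quantity ~ length**k":  \/ is +1, /\ is -1, constant is 0.
cap_per_area = -1   # C per unit area ~ /\
voltage = 1         # V ~ \/
thickness = 1       # layer thickness ~ \/
frequency = -1      # f ~ /\

energy_per_area = cap_per_area + 2 * voltage      # CV^2 per area
energy_per_volume = energy_per_area - thickness   # divide by thickness
power_per_area = energy_per_area + frequency      # P = E f

DIE = 1.023                    # yearly growth of die diameter
die_power_growth = DIE ** 2    # power per die follows chip area, ^ ^

print("energy per area   ~ length^%d" % energy_per_area)
print("energy per volume ~ length^%d" % energy_per_volume)
print("power per area    ~ length^%d" % power_per_area)
print(f"power per die grows by x{die_power_growth:.3f} per year")
```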

Note that if you want to build circuits in 3 dimensions, the number of layers you could fit within a surface of given area goes as /\, and so power requirements per unit (outer surface) area would go as /\. This is one big reason why we don't build in 3 dimensions much today.

Later in the course, we'll see that in order to really take advantage of the third dimension to a significant extent, we'll require some form of reversible computing.
 

Addendum: Interconnect problem even worse than described.

This discussion should be merged into the "Increasing dominance of interconnect delay" section above.

Earlier we noted that power per area P/A ~ 1 is roughly constant.  Yet, voltage is decreasing V ~ \/.  What then about current per area, or current density?  P=IV, so I=P/V, so I/A ~ /\.  Earlier, we said that current density was constant because of constant electric fields.  Why the discrepancy?

Earlier, we were talking about characteristics of current flow through a transistor structure as it is charging or discharging a circuit node.  But, keep in mind that as frequencies increase, any given circuit node will be charged & discharged more often.  So, given the constant surface charge density Q/A which we derived earlier, the more frequent charging and discharging means an increasing current per unit area.  We noted earlier that the current per transistor scales as \/, so since the area per transistor goes as \/ \/, the current per area or current density goes /\.

The reason this isn't a problem for transistors is that they are already an extreme case: charge is shoved up into a thin sheet near the gate electrode (to which it is attracted) and so moves at a high current density to begin with, though in a direction parallel to the surface.  The increase in average current density simply means giving this charge sheet less room to expand out when it finishes going through the channel.

It's also not much of a problem for short wires, because metal has a much lower resistivity than doped silicon, and therefore has no trouble carrying whatever current is passed through the transistor even when electric fields are much lower.

The real problem is in sending signals longer distances.  Since the resistivity of metal is a constant, increasing J ~ /\ means that the electric field E ~ /\ along a wire must increase.  But meanwhile, the maximum voltage differences that are encountered scale as \/, and for long wires (whose resistance we noted earlier scales /\), the wires will be carrying significant fractions of our voltage drops.  But if the voltages themselves are dropping as \/, this means that the length of a wire must scale as V/E or \/ / /\  ~ \/ \/.  The wire length as a multiple of channel length L must go \/!!  In other words, no long wires to our devices!  If the length of the wire does not scale down faster than transistor lengths, wire delays will come to dominate.
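The addendum's argument can be summarized by the same kind of exponent bookkeeping, where an exponent k means the quantity goes as (channel length)^k. This is only a sketch with illustrative variable names; it checks that the two routes to the current density agree and then follows the wire-length consequence.

```python
# Exponent k means "quantity ~ (channel length)**k":  \/ is +1, /\ is -1.
power_per_area = 0          # P/A ~ 1
voltage = 1                 # V ~ \/
current_per_transistor = 1  # I ~ \/
area_per_transistor = 2     # A ~ \/ \/

# Route 1: I/A = (P/A) / V
current_density_from_power = power_per_area - voltage
# Route 2: current per transistor divided by area per transistor
current_density_from_devices = current_per_transistor - area_per_transistor

# Constant resistivity: field along the wire follows current density.
wire_field = current_density_from_power

# Longest wire that can be driven: length ~ V / E
max_wire_length = voltage - wire_field
# The same length expressed as a multiple of channel length
max_wire_length_in_channel_lengths = max_wire_length - 1

print("current density    ~ length^%d" % current_density_from_power)
print("max wire length    ~ length^%d" % max_wire_length)
print("in channel lengths ~ length^%d" % max_wire_length_in_channel_lengths)
```

Both routes give a current density going as /\, and the usable wire length, counted in channel lengths, goes as \/.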

In other words, in the long run, the number of devices in a group of transistors that can communicate within 1 gate delay is going to shrink, and so architectures will need to become more and more highly localized.  This is one of the reasons why it looks like parallel mesh architectures and mesh programming models will be needed in the long run.