Lectures 29-31 Homework

Lec. 29 & 30 (Scaling) homework

[3 points] Simple analysis. From lecture 30 (and the corresponding memo, see readings), there is a formula for the advantage ratio between reversible and irreversible energy cost in the Frank02 algorithm. Use analysis and/or graphs of this function of n and k for various values of R_on/off to convince yourself (and the grader) that for given n and R_on/off, the advantage is a concave-down function of k (has only a single maximum), and that values of n greater than 10 are never optimal for values of R less than a trillion.
[5 points] Small Programming Project. Mimic the analysis from the memo analyzing and optimizing the "Frank02" algorithm (see reading list), but for the original (less spacetime-efficient) Bennett-89 algorithm instead. Present your results with a graph and power law, similar to the one in that memo.
[10 points] Small Programming Project. Write a program to calculate the maximum and average instantaneous space usage of the "Frank02" algorithm, as a function of n and k. Assume each triangle (adiabatic op) and each horizontal line (stored state) takes 1 unit of space. Present the results using tables and graphs to show how both quantities scale with respect to n and k. If you can produce a 3-D graph, that would be useful. If possible, use analysis or interpolation from the graph to give an approximate analytical formula for space usage as a function of n and k. Compare the results to the maximum space usage in the original Bennett89 algorithm (which was ~nk), and to the average space usage of Bennett89 (which you will have to calculate).
[20 points] Analysis. In this problem you are going to analyze how the performance advantage of reversible computing scales in area-limited, flux-limited scenarios for arbitrary computations, and determine whether Bennett-89 or Frank-02 is more appropriate for this purpose. This will use results from the previous two problems.

Set up your initial variables: system diameter, entropy flux per unit outer-surface area, entropy coefficient and spontaneous entropy generation rate per-device, entropy generated per irreversible op, minimum time per op (non-adiabatic limit), and volume per device. If you like, you may substitute heat for entropy throughout this problem.
Assume the system is cube-shaped with entropy removal through one face. Now you can express the volume and the max rate of entropy removal as functions of diameter and entropy flux.
For both the Bennett-89 and Frank-02 algorithms, do the following:

Express the average space usage, and the rate of entropy generation per irreversible op emulated, as functions of n, k, and t (time per adiabatic transition). (Use results from exercises 1 & 2 as needed.)
Knowing the average space usage per reversible device emulator, and the volume of the system, you can compute the maximum number of these "emulators" per system.
Now, show how to optimize t for maximum rate of computation within the entropy-rate limit.
Now, show how to optimize n and k given the initial parameters of the problem.

Express your results in terms of overall performance. Determine the regimes of parameter values in which Bennett-89 and Frank-02 are best performers. (Or, if one consistently dominates the other, say so.) If the optimal choice of t, k, n, and choice of algorithm is always made, how rapidly does reversible performance scale with diameter? (Fit your results to a power law.) What is the reversible advantage factor versus irreversible performance as a function of diameter? (Note irreversible performance scales as diameter squared.)
How do the results change if the maximum space usage, rather than the average space usage, is used throughout the problem?

[30 points] Analysis. Like the previous problem, but focusing on overall performance for 3-D parallel flux-limited algorithms instead (as discussed in lec. 30, but for arbitrary computations). You have to set up the whole problem yourself. The goal is to figure out which algorithm (Bennett-89 or Frank-02) scales better in this scenario. You should consider both maximum space usage and average space usage. You should take into account how the space usage impacts the minimum spacing between devices. You should consider both local and non-local communication scenarios. For the solution that minimizes run-time, you should note how much of an increase in hardware cost (mass, space usage) is required, compared to the optimal 3-D irreversible solution, and how the cost overhead scales with the speedup factor. Does reversibility ever confer an advantage in performance per unit hardware in this scenario? I.e., does the speedup from denser packing outweigh the extra cost from doing the emulation? The analysis in sec. 6.3.1 of the green book says yes, for Bennett89 with suitable parameter settings and a negligible leakage rate. But we'd like to see in more detail how this hardware-efficiency benefit scales with technology & algorithm parameters and with the choice of algorithm.

Lec. 31 (Reversible Architectures) homework

[10 points] Small programming project. Create a software simulator for either the original Billiard-Ball Model of reversible computing, or Margolus' BBMCA cellular automaton emulation of it. Include a real-time graphics display, or, at minimum, the ability to output graphical "snapshots" (in pixel or ascii graphics) of the machine state. Encode some interesting reversible gates such as Fredkin or Toffoli gates in terms of billiard-ball interactions, and show that they work. You should use references such as Margolus' Ph.D. thesis (see reading list) to help you with the details.
[5 points] Small circuit design project. Show how to modify the low-level logic gates of FlatTop to fix the problem of non-adiabatic transistor turn-off (described in sec. 7.6.4 of the green book). You may need to add additional signals between gates.
[points TBD] Write & test some reversible algorithms using the PisaHut development environment for the Pendulum ISA. Detiled assignment description to be provided by TA Jeff King.
[100 points] Hefty VLSI project. Fix the bug of non-adiabatic transistor turn-off in FlatTop. For reference, the original circuits can be found at http://www.cise.ufl.edu/research/revcomp/hw/MIT_circuits/ (or /cise/research/revcomp/hw/MIT_circuits on the CISE unix machines).
[10 points] Small Programming Project. Write a reversible implementation of the Quick Sort algorithm using the Pendulum Assembly Language (PAL).

Email your PAL code to jck@acm.cise.ufl.edu. Be sure to include your full name in the email.

Test your programs using pendvm (the Pendulum Virtual Machine), The pendvm binary is available at http://www.cise.ufl.edu/~mpf/pendvm/pendvm, and can be executed on CISE UNIX machines.

Information about the Pendulum Instruction Set Architecture (PISA) is available in Appendix B of Dr. Frank's Ph.D. thesis. Notes about the pendvm tool are available at http://www.cise.ufl.edu/~mpf/pendvm/pendvm-doc/pendvm.ps. Sample PAL files are located in http://www.cise.ufl.edu/~mpf/pendvm/