The first primary goal of this thesis will be to continue the lab work to the point where there are publishable results showing that CMM works and can reliably make a sequence of programmed changes to a DNA sequence, as predicted by the simulations discussed in the previous chapter. Related to this will be refining the techniques for isolating molecules that have reached a desired final state. Of course, we cannot guarantee that the experiments will be successful (otherwise there would be no reason to perform them), but even negative results will be useful because they will help to delineate what is not possible in DNA computation. Additionally, continued negative results would indicate that there is something seriously wrong with the DNA-binding models in the biochemistry literature; this in itself would be of scientific interest.
Assuming that CMM can be made to function, I will attempt in my thesis work to help run as many interesting machines as possible. These will include at least the 1-2 counter and a simple Turing machine such as the one described earlier; I will try to continually increase the complexity of the machines that we can execute.
The second primary goal of the thesis, to be pursued concurrently with the experiments, will be to further analyze and develop the theory of CMM. This will include searching for simpler universal computational systems based on CMM, possibly ones based on a sort of stochastic cellular automaton. I will also attempt to characterize analytically the computational efficiency of programming systems based on CMM. Another important part of the analytical work will be to consider how extensions to CMM might improve its computational efficiency. For example, I have given some preliminary thought to implementing a form of inter-processor communication using restriction enzymes, by allowing strands to temporarily join end-to-end to exchange information (over many cycles) and then separate again to join with other strands. Restriction enzymes might be useful in other ways as well. I will also search the literature to determine whether there are other DNA-manipulating enzymes or complexes that might be made to do something useful.
Another goal, of slightly lower priority, will be to consider the biological plausibility of CMM and variations of it. For example, the method we use to separate strands in preparation for replication is to raise the temperature, but this is not how it happens in vivo; instead, special enzymes open a "replication fork," a region where strands are separated so that new copies can be built. It is an open question whether mutagenic oligos can be made to bind to strands in the replication fork, and literature research is needed to gain clues as to whether this is feasible. It may turn out that we can do actual experiments in which we separate strands by some method other than heat; in fact, [Walker-et-al-92] shows how DNA can be repeatedly replicated in vitro at constant temperature. If such experiments turn out to be feasible, we will do them. It may even turn out to be feasible to do experiments in which mutagenic oligo mixtures are injected into living cells and are incorporated into cellular DNA during cell division, although special chemical modification of the oligos may be necessary to curtail their degradation by cellular enzymes [Fisher-et-al-93]. If we are lucky and in vivo application of CMM can be made to work, then we may spend some time thinking about possible medical applications of the technique.
Next, an important but relatively small part of the Ph.D. work will consist of an ongoing survey of the other efforts in the field of DNA-based computation and their relationship to ours. We are aware of a number of groups working in the area, and we will be keeping an eye on them. We also believe, as mentioned earlier, that there are serious flaws in most of the techniques that have been presented so far; these will be enumerated in the thesis, along with an analysis of their good points.
Finally, an exercise that I consider very important in this work is to maintain an awareness of the research being done on other forms of molecular or nanoscale computation, because opportunities may arise for useful interactions and cross-fertilization between those fields and DNA computation. For example, Heller and Tullis, studying molecular photonics, have discovered how to attach fluorescent dye molecules to DNA molecules in such a way that an excited electron state tunnels down a chain of annealed DNA strands from one dye molecule to another of a different color. This could potentially lead to a mechanism for reading output from a DNA-based computer. Seeman, meanwhile, is studying how to build large, complex 3-dimensional self-assembling structures out of DNA. Erik Winfree has suggested (at the recent DIMACS workshop on DNA-based computers) that the self-assembly of such a structure might be made to mirror the execution of a cellular automaton and thus perform computations. It is not yet clear what the result of combining such ideas with ours might be, but nanotechnology and molecular electronics are definitely areas to keep an eye on as a source of ideas. If there is room in the thesis, I will go over the pros and cons of some of the alternative approaches to molecular-scale computing and estimate their future prospects. It will be important, down the road, to be prepared to work on whatever form of molecular computation technology finally wins out and becomes the basis for future generations of computers.