Eventually, we wish to demonstrate that replication-clocked mixture mutagenesis is capable of universal computation. We are given a clue about how to do this by the fact that string-rewriting systems are universal. The proof that string rewriting systems are universal works by showing that any Turing machine can be simulated by an appropriate string-rewriting system, and since there exist universal Turing machines, it follows that universal string-rewriting systems exist as well. Thus, if we can simulate any Turing machine via a corresponding oligo mixture, then there exists a universal mixture. We start by showing how to simulate a very simple Turing machine, and we will extrapolate from there to show how to simulate a larger machine.
As a particularly trivial example of a Turing machine, consider one in
which the tape has only two symbols, 0 and 1, and the
tape head has only one state (call it q0). When reading a
0, the tape head moves right, and when reading a 1, the
machine halts. For example,
q0
0 0 0 0 1 Execution of a simple Turing machine.
q0 The q0 above the tape indicates the
0 0 0 0 1 position and state of the tape head.
q0
0 0 0 0 1
q0
0 0 0 0 1
Now, we cannot actually have the tape head be above the tape
symbols encoded on the DNA; rather we encode a special symbol for the
head onto the tape, located just before the symbol the head is
currently "reading;" this is a standard way to encode Turing machine
states in strings. The desired evolution of our system becomes:
0 q0 0 0 0 1 Sequence of strings encoding
0 0 q0 0 0 1 the execution of a Turing machine.
0 0 0 q0 0 1
0 0 0 0 q0 1
In other words, on each step, if there is a zero to the right of the
"tape head" symbol, the two symbols are swapped. This can be
achieved by the single rewrite rule:
q0 0
-----
0 q0
Unfortunately, as we noted at the end of the previous section, it is
difficult to design an oligonucleotide that encodes one and only one
rewrite rule. If the code for 0 is close enough to the code
for q0 so that an oligo encoding 0 q0 can bind to a site
encoding q0 0, then it will also bind to sites encoding
0 0, since these will have fewer mismatches; thus extra copies
of the tape head will be introduced all over the tape, anywhere there
are two zeroes in a row.
One way of approaching this problem is to drop our original
requirement of equal-sized symbols, and have the symbol for q0
be a different size than the symbol for 0, so that the
0 q0 oligo simply will not fit anywhere except overlapping the
old q0 symbol:
--------->
[0][ q0 ]
:X:::::X: : - region of matches
[0][ q0 ][0][0] X - region of mismatches
<---------------
Indeed, we have actually designed an implementation based on this
idea, and verified it in simulation (see this section). However, the need
for overlaps highly constrains the space of possibilities for symbol
encodings, and this technique is therefore difficult to apply to more
complex Turing machines, or to general string-rewriting systems.
Fortunately, there is a more general way of solving this problem, which allows us to design mixtures of oligos that achieve transformations between strings that are entirely specific to the desired source and destination strings, and involve no others.
The key is this: instead of performing the desired transition in a single step, which forces symbol encodings to be so close together that unwanted side reactions can occur, we allow it the transformation to proceed gradually over several steps, each of which changes only some of the bases in the encoding of the source substring to bring it closer to the result substring. In this picture, we can think of symbol encodings as being widely separated points in a high-dimensional metric space in which the distance between points is defined to be the number of mismatches (more or less). For each symbol in a desired rewrite rule, we pick some path through the space between the two points, and ensure that the path does not pass near any other points (symbols) in the space, or intersect the paths for any other rewrite rules. This ensures that our computation cannot wander off track. And, given sufficiently lengthy encodings for the symbols, the space is large enough that we should always be able to find such a path.
[An informal proof of this assertion goes as follows: Suppose there are n different symbols. Then there are n^4 possible different two-place rewrite rules. If the symbol length is m, then each rule will need on the order of m intermediate points along the path (changing a few bases at a time, and assuming we do not have to "back up" too often), for a total of m * n^4 points needed. But the number of available points is on the order of 4^m, since there are 4 bases at each of m positions. However, the effective size of the space is reduced by a factor of something like m * n^2, since we have to ensure that our points are not accidentally complementary to some misaligned pair of symbols (n^2 pairs, m alignments each). However, 4^m/(m*n^2), the effective number of available points, is still asymptotically greater than m * n^4, the number of points we need. Thus, for sufficiently large m, there will be plenty of room for the paths for all our rewrite rules.]
Now, although these paths ensure that the only transitions made between substrings are the desired ones, they do not, unfortunately, guarantee that the transition proceeds in the desired direction. This is because if we have oligos O1 and O2 representing consecutive steps along the path, then O2 will bind to the complement of O1, and so O1 will necessarily bind to the complement of O2. (Actually, this is not really true, because of the asymmetry of energy of complementary mismatches: a C:T mismatch is much more unstable than an A:G mismatch, for example, and in general, purines (A,G) pair with purines more stably than pyrimidines (C,T) pair with pyrimidines. So we can force forward progress as long as we are changing pyrimidines to purines. Eventually, however, we will run out of pyrimidines.)
One might suggest, why not just leave the oligo representing the source substring out of the mixture? Unfortunately, it is not that simple, because the symbols of the source substring might be in the results of other rewrite rules, and thus will be present in the oligo mixture anyway. In fact, this possibility means that whenever some of the symbols produced by a rewrite rule might be used as the source of another rule, then the result oligo that introduces those symbols will in general bind to the template for the first step of that other rule, and so that oligo must be engineered so as to convert all the symbols in the first step of that other rule back to their original form. For example, suppose we were to implement our machine with the following sequence of transitions:
State 1: 0 q0 0 0 Brackets represent oligos
State 2: 0 [L1 R1] 0 incorporated into a new
State 3: 0 [L2 R2] 0 strand. Ln and Rn represent
State 4: 0 [0 q0] 0 base sequences for the
State 5: 0 0 [L1 R1] left and right sides of intermediate
etc. points along the transition
path for q0 0 -> 0 q0.
This would not work, because the result oligo (of state 4) can bind to
the intermediate strand of state 5, taking the L1 back to a
q0 but leaving R1 behind as an artifact, which can lead
to bad effects later.
To solve this problem, the symbols being changed must always be flanked by enough symbols on either side so that if the result of one rule overlaps the source of another, the flanking symbols on the result oligo completely cover the source area for the second rule. For example, for this machine, we would have to use 4-place oligos:
0 q0 0 0 0
[0 L1 L2 0] 0
[0 L2 R2 0] 0
[0 0 q0 0] 0
0 [0 L1 L2 0]
This works because at no point can a bare intermediate symbol be taken
out of context. Unfortunately, this solution raises the complexity of
our rewrite rule implementation somewhat, because we are forced to
create a separate set of oligos for each possible combination of
flanking symbols. In this example, each flanking symbol can be either
0 or 1, so 4 separate sets of oligos are needed.
This machine does work, although it does not ensure forward progress.
Subsequent strands along a line of descendants of the initial strand
will in effect execute a random walk forwards and backwards among the
allowed rewrite-rule transitions. The effect is that the simulated
tape head, instead of moving steadily to the right, will itself
execute a random walk in the region of 0's flanked by 1
symbols on either side. We have succeeded in implementing our Turing
machine, in the sense that we can explore all states along its
computation path.
In the next section, we discuss how to simulate general Turing machines and how to bias transformations in the forwards direction.