COT6315 Formal Languages and Theory of Computation

R. E. Newman-Wolfe, University of Florida

Last modified 2/20/95.

Finite State Automata

Before we go further, you may wish to consult some helpful hints on FSAs, Regular Languages and Regular Expressions.

Definition 4.1 - FSA

A finite state automaton (FSA, or finite state machine, FSM, or FA) is a 5-tuple, M = (Q, S, q_0, d, A), where

Q = finite set of states
S = finite input alphabet
q_0 in Q = initial state
d: Q x S -> Q = transition function
A contained in Q = set of accepting states

M starts in state q_0 with its read-only head positioned on the first symbol of its input, w. When M is in state q, and a is the symbol under the read-only head, then M makes a transition to state p and moves its head one symbol to the right, provided that d(q,a) = p. If there is no symbol (i.e., the head has moved off of the right end of of the input string) then M halts, either accepting or rejecting the input string depending on whether the state p into which it moved is in A or not. If there is no transition defined for (q,a) then M rejects the input and halts.

It is useful to extend the transition function d:Q x S -> Q, to operate on strings, d*:Q x S* -> Q.

Definition - d* for FSAs

d*:Q x S* -> Q is recursively defined by:

d*(q,e) = q, for all q in Q,
d*(q,aw) = d*(d(q,a),w), for all q in Q, a in S, and w in S*.

Using this extended transition function, we may now formally define acceptance of a string by an FSA as well as the language it accepts.

Definition - FSA acceptance, language

An FSA M = (Q,S,q_0,d,A) accepts a string w in S* iff d*(q_0, w) is in A. L(M) = {w in S* | d*(q_0, w) is in A}.

Theorem - Relation of FSAs to RLs

A language L is Regular iff L = L(M) for some FSA, M.

Proof

This proof is made considerably easier if we have nondeterminism and e-transitions available, all of which we will develop first before we supply this proof.

In general, we will be proving things about the languages various machines accept, and that can be generated by various other means (such as grammars or regular expressions). To prove a theorem stating the equivalence (in the sense of language recognized or generated), we will first define the given machine (or grammar or expression) formally, then construct another machine (or grammar or expression) based upon the first one. Following this, we will show that the language recognized (or generated) by the first one is contained the that of the second, and vice versa, this demonstrating the equality of the language and the equivalence of the machines (etc.).

Definition - Distinguishing strings with respect to a language

Strings w1 and w2 are distinguishable with respect to language L iff there exists some z in S* such that either

w1z is in L but w2z is not, or
w1z is not in L but w2z is in L.

Otherwise, w1 and w2 are indistinguishable with respect to L.

Intuitively, indistinguishable strings behave the same, and need not be differentiated by any machine that recognizes L, although a particular machine may do so needlessly. This gives us a means of reasoning about FSAs.

Definition - Pairwise distinguishable sets

A subset X of S* is pairwise distinguishable with respect to L iff for any x, y in X, x and y are distinguishable with respect to L. Then X is a pairwise distinguishable set, or a PD set with respect to L.

Theorem - Number of states in any FSA recognizing L

Let X = {w1, w2, ..., wk | for all i,j = 1, 2, ..., k, i<>j => wi and wj are distinguishable with respect to L}, i.e., X is a PD set with respect to L. Then for any M = in FSA such that L(M) = L, |Q| >= |X| = k.

Indirect Proof

Given M = in FSA such that L(M) = L, and X as above, suppose |Q| < k. Since |Q| < |X|, there must be distinct i and j such that d*(q_0,w_i) = d*(q_0,w_j) by the pigeon-hole principle. However, since w_i and w_j are distinguishable with respect to L, there must be some z such that w_i.z is in L but w_j.z is not, or w_i.z is not in L but w_j.z is in L. Without loss of generality, assume the former. Then d*(q_0,w_i.z) is in A, but d*(q_0,w_j.z) is not in A. But since d*(q_0,w_i) = d*(q_0,w_j), d*(q_0,w_i.z) = d*( d*(q_0,w_i), z) = d*( d*(q_0,w_j), z) = d*(q_0,w_j.z) must both be in A and not be in A, a contradiction. Thus our premise that |Q| < |X| must be false.

QED

This result will allow us to show that a language L is not regular by showing that there is an infinite set of strings that are mutually distinguishable with respect to L.

Theorem - More Closure Properties of RLs

If L1 and L2 are in RL(S), then so are

L1 U L2
L1 ^ L2
L1 - L2
L1'
L1*

Proof

Let M1, M2 be in FSA, L(M1) = L1, L(M2) = L2, M1 = and M2 = . For the first three, we will construct M' = , with Q' = Q X P, q_0' = (q_0,p_0), d' as defined below, and A' depending on the operation of interest (union, intersection, or difference). Define d'((q,p), a) = (q',p') where q' = d1(q,a) and p' = d2(p,a).

For L1 U L2, define A' = (A1 X P) U (Q X A2), so that if either M1 or M2 would have accepted, then so will M' and vice versa.

For L1 ^ L2, define A' = (A1 X A2), so that if both M1 or M2 would have accepted, then so will M' and vice versa.

For L1 U L2, define A' = (A1 X P) - (Q X A2), so that if M1 would have accepted and M2 rejected, then M' will accept and vice versa.

For the last two, we need only M1 to construct M' = .

For L1', d' = d1, and A' = A1' (the complement of A1). Thus if M1 accepts, M' rejects, and if M1 rejects, then M' accepts.

For L1*, it will be easiest if we have e-transitions (so we can connect accepting states of M1 to q_0 without consuming any input to form M'). These haven't been discussed yet, so we defer.

QED the first 4.

Definition - NFSA

A nondeterministic finite state automaton (NFSA, or NFSM, or NFA) is also a 5-tuple, M = (Q, S, q_0, d, A), where

Q = finite set of states
S = finite input alphabet
q_0 in Q = initial state
d: Q x S -> P(Q) = nondeterministic transition function
A contained in Q = set of accepting states

P(Q) is the power set of Q, that is, the set of all subset of Q. In other words, from a given state q, reading an input symbol a, M may move into any of the states in the set d(q,a). M starts in state q_0 with its read-only head positioned on the first symbol of its input, w. When M is in state q, and a is the symbol under the read-only head, then M makes a transition to any state p in d(q,a) and moves its head one symbol to the right.

Intuitively, a nondeterministic machine M has a "choice" whenever |d(q,a)| > 1, and M always "chooses right." Otherwise, a nondeterministic machine works like a DFSA.

Again, it is useful to extend the transition function d:Q x S -> P(Q), to operate on strings, d*:Q x S* -> P(Q).

Definition - d* for NFSAs

d*:Q x S* -> P(Q) is recursively defined by:

d*(q,e) = {q}, for all q in Q,
d*(q,wa) = U {d(p,a) | p in d*(q,w)} for all q in Q, a in S, and w in S*.

Further, it is useful to extend the transition function d:Q x S -> P(Q), to operate on sets of states and strings, d~:P(Q) x S* -> P(Q). This will help us reason about the behavior of NFSAs, and NFSAs with e-transitions (see below).

Definition - d~ for NFSAs

d~:P(Q) x S* -> P(Q) is recursively defined by:

d~(B,e) = B, for all B contained in Q,
d~(B,a) = {p | there exists q in B such that p is in d(q,a)} for all B contained in Q, and a in S,
d~(B,wa) = d~(d~(B,w),a) for all B contained in Q, a in S, and w in S*.

Lemma

For NFSA or FSA M with transition function d,

d~({q},w) = d*(q,w) for all q in Q and w in S*.

Proof

We use induction on the length of w, |w| = k.

Base cases: If k = 0 then w = e and d*(q,w) = {q} = d~({q},w) by definition. If k = 1 then w = a for some a in S and d*(q,w) = d(q,a) = d~({q},a) by definition.

Inductive Hypothesis: Suppose for some k, |w|=k => d*(q,w) = d~({q},w).

Inductive Step: Show that for k+1, |w|=k+1 => d*(q,w) = d~({q},w). Let |w| = k+1, and let w = xa, so |x| = k. Then d*(q,w) = d*(q,xa) = U { d(p,a) | p is in d*(q,x)} by definition. Since d*(p,x) = d~({p},x) by the inductive hypothesis, d*(q,w) = U {d(p,a) | p in d*(q,x)} = U {d~({p},a) | p in d~(q,x)} = d~(d~({q},x),a) = d~({q},xa)

QED.

Definition - NFSA acceptance, language

An NFSA M = (Q,S,q_0,d,A) accepts a string w in S* iff d*(q_0, w) intersects A. L(M) = {w in S* | d*(q_0, w) ^ A <> 0}.

The intuition that M always "chooses right" reflects the criterion that d*(q_0, w) ^ A is non-empty. In other words, there is one or more sequence of states from the valid transitions of M on w that allows M to end up in an accepting state when the input w has been consumed.

Theorem - Equivalence of NFSA's and FSAs

Given M in NFSA with L = L(M), then there exists a machine M' in FSA such that L(M') = L.

Proof

The task is to handle the fact that M can reach a number of states on a given string from a given start state. Intuitively, the new machine M' must somehow keep track of all the states that M could be in after reading a prefix of its input, that is, M' should have as states all reachable subsets of states of M. The transition function must reflect this change in the state set as well, for which d~ will come in handy. The accepting states must be defined for M', and these will be those subsets of P(Q) that intersect A. Finally, the start state is simply the singleton set {q_0}, reflecting that M starts in the start state and that it can't reach any other states without consuming input.

Construction of M'

Formally, let M= and M'=, where Q'=P(Q), S'=S, q_0'={q_0}, and A'={X contained in Q | X ^ A <> 0}. The transition function d':P(Q) X S -> P(Q) must account for all transitions from all states in the subset of Q, so define

d'(X,a) = d~(X,a), where d~ is based on d, and X is in Q'=P(Q).

(Here, we treat a as symbol for d' and as a string for d~.) This definition of the new transition function d' accounts for all states of M that can be reached from any state in X on the input a, which is exactly what is desired. For completion of the proof, we should show that L(M) is contained in L(M') and that L(M') is contained in L(M). For this we will use the fact that

d'({q_0}, w) = d~({q_0},w) = d*(q_0,w).

Claim: L(M) = L(M')

The following statements are equivalent (with justifications interspersed).

w is a string in L(M).
(by definition of NFSA acceptance)
d*(q_0,w) ^ A <> 0
(by construction of A')
d*(q_0,w) is in A'.
(by the lemma above)
d~({q_0},w) = d*(q_0,w) is in A'.
(by construction of q_0')
d~(q_0',w) = d~({q_0},w) is in A'.
(by construction of d', and that d~ is its own closure)
d'*(q_0',w) = d~(q_0',w) is in A'.
(by definition of FSA acceptance)
w is in L(M').

QED.

One more generalization of FSAs is very useful: permitting an NFSA to change states without consuming any input symbol. This is called an epsilon-transition, or a lambda-transition, and such a machine is called an NFSA-e.

Definition - NFSA-e

A nondeterministic finite state automaton with epsilon-transitions (NFSA-e, etc., or NFSA-lambda, etc.) is a 5-tuple, M = (Q, S, q_0, d, A), where

Q = finite set of states
S = finite input alphabet
q_0 in Q = initial state
d: Q x (S U {e}) -> P(Q) = nondeterministic transition function
A contained in Q = set of accepting states

M starts in state q_0 with its read-only head positioned on the first symbol of its input, w. When M is in state q, and a is the symbol under the read-only head, then M makes a transition to any state p in d(q,a) and moves its head one symbol to the right, or M makes a transition to any state p' in d(q,e) and does not move its head at all.

Once again, we would like to be able to use d* and d~ to reason about the operation of NFSA-e machines. However, this is complicated by the ability of such machines to change states without consuming input. It is handy to be able to refer to all the states to which an NFSA-e can transition without moving its head from some state q or set of states X.

Definition - e-closure of X

Given an NFSA-e, M = (Q, S, q_0, d, A), and a set of states X contained in Q, the e-closure of X is e(X) = d~(X,e*)}.

Note that here we use d~ and treat e* as a string (abusing the notation for d~ slightly) so that the e-closure of X includes all states that can be reached by zero or more e-transitions from a state in X. We may also define the e-closure by defining a sequence of sets, X0 = X, X1 = X0 U d~(X0,e), ..., Xi+1 = Xi U d~(Xi,e), and observing that since the sequence is monotone nondecreasing (in the sense that for all i, Xi is contained in Xi+1) and bounded by Q, there must be a fixed point, i.e., for some k, Xk+j = Xk for all j > 0. The e-closure of X is this fixed point.

Now we can define d* more appropriately, which will allow us to define acceptance.

Definition - d* for NFSA-e's

The closure of the transition function, d*:Q x S* -> P(Q), is recursively defined for an FNSA-e by:

d*(q,e) = e({q}), for all q in Q,
d*(q,wa) = e[ d~( d*(q,w),a) ] for all q in Q, a in S, and w in S*.

The definition of d* accounts for any e-transitions that may take place before the last symbol is consumed, since d* includes the e-closure of any state reached after consumption of a symbol as well as the e-closure of the intial state. The d~ function operates on the set of states that can be reached this way, as defined above, and consumes the single symbol string a. The e-closure is then taken to account for any e-transitions taken after the last symbol is consumed.

Definition - NFSA-e acceptance, language

An NFSA-e M = (Q,S,q_0,d,A) accepts a string w in S* iff d*(q_0, w) intersects A. L(M) = {w in S* | d*(q_0, w) ^ A <> 0}.

Theorem - Equivalence of NFSA-e's and NFSAs

Given M in NFSA-e with L = L(M), then there exists a machine M' in NFSA such that L(M') = L.

Proof

The task is to eliminate the e-transitions, which can be done intuitively by extending the transition function to take a state on a given symbol to all the states the machine can reach through taking a combination of any number of e-transitions along with one transition consuming the input symbol. There is one difficulty, and that is the initial state. If it is not in A for M, but e(q_0) intersects A, then q_0' must be in A'. Otherwise, A and A' are identical.

Formally, let Q'=Q, S'=S, q_0'=q_0, and A'=A U {q_0} if e(q_0) ^ A <> 0, or A' = A otherwise. The transition function d' must account for any possible e-transitions, so define

d'(q,a) = d*(q,a).

In this way, all transitions are labeled by one or more symbols, and any state that could be reached by M taking e-transitions, then consuming a, then taking additional e-transitions, is included in d'(q,a).

For completion of the proof, we should show that L(M) is contained in L(M') and that L(M') is contained in L(M). This is very straight-forward, and follows a similar approach to that of the theorem of equivalence of NFSAs and FSAs.

QED

Now that we have NFSA-e's, and know that they recognize exactly the same class of languages that FSAs recognize, we can use them to show that a language (or a class of languages) is regular by constructing an NFSA-e that recognizes that language (class). Our first use of this is in part I of Kleene's Theorem, proving part of the theorem only stated earlier, that a language is regular iff there is an FSA that recognizes it.

Theorem - Kleene's Theorem, Part I

For any regular language, L, there is an FSA, M such that L = L(M).

Proof

As usual, we will prove this by induction based on the definition of regular expressions. To do this, we will build an NFSA-e recursively for each expression, and appeal to the fact that for any NFSA-e there is an FSA that recognizes the same language.

More formally, for any L in RL, there is a regular expression r such that L = L(r). We then show that for any regular expression r, there is an NFSA-e, M, such that L(M) = L(r). Further, this M = will have |A| = 1 (which makes the pictures easier to draw).

Base Cases

There are three base cases: r = 0, r = e, and r = a for some a in S.

For r = 0, let Q = {q_0, q_1}, A = {q_1}, d(q_0,a) = q_0 for all a in S. Then M starts in a non-accepting state, and on any input (or no input) M remains in a non-accepting state, so L(M) = {w | d*(q_0,w) is in A} = 0. Also, |A| = 1.
For r = e, let Q = {q_0, q_1}, A = {q_0}, d(q_0,a) = d(q_1,a) = q_1 for all a in S. Then M starts in an accepting state, and on any input M transitions to a non-accepting state, so L(M) = {w | d*(q_0,w) is in A} = e. Also, |A| = 1.
For r = a for some a in S, let Q = {q_0, q_1, q_2}, A = {q_1}, d(q_0,a) = q_1, d(q_0,b) for all b in S, b <> a, and d(q_1,s) = d(q_2,s) = q_2 for all s in S. Then M starts in a non-accepting state, and on any input other than a M transitions to a dead state. On the single character a, M transitions to the accepting state, but if any more characters are consumed, M transitions to the dead state q_2 and remains there on any further input. L(M) = {w | d*(q_0,w) is in A} = {a}, and |A| = 1.

QED base cases.

Inductive Hypothesis

For any regular expression r with |r| = the number of operators present in r < k, there exists an NFSA-e, M = with |A| = 1 such that L(M) = L(r).

Inductive Step

There are again three cases: r = s.t, r = s+t, and r = s*, where |s|, |t| < |r|, since there is at least one more operator in r than in either of the component regular expressions s or t. Let Ms = and Mt = . Then S = Ss U Sr, A = {q_a}, with the rest of M defined below, depending upon the case. A dead state may be added in all cases, to which M will transition if q_a is reached before all the input has been consumed. Instead of including the dead state in the figure or in the constructions below, we will allow the transition function to be partial, with no transitions defined out of q_a (hence, if there is input left when M enters q_a, M will reject the input string). When defining Q, the sets of states of the constituent machines, Ms and Mt, will be included, so a cross product with a label indicating the source of the state is used to ensure that all the subsets of states (those from Q_s, those from Q_t, and the new ones) are disjoint.

For r = s.t, let Q = Q_s X {s} U Q_t X {t} U {q_0, q_a}. Let d be defined by the following:
- d(q_0,e) = {(q_s,s)},
- d(q,a) = d_s(q,a) for all q in (Q_s-A_s) X {s} and a in S U {e},
- d(q,a) = d_t(q,a) for all q in (Q_t-A_t) X {t} and a in S U {e},
- d(q,a) = d_s(q,a) for q in A_s X {s} and a in S,
- d(q,a) = d_t(q,a) for q in A_t X {t} and a in S,
- d(q,e) = d_s(q,e) U {(q_t,t)} for q in A_s X {s},
- d(q,e) = d_t(q,e) U {q_a} for q in A_t X {t},
Let the input to M be w in S*. M must move to the conscripted start state of the captive Ms before consuming input, and may only leave the captive Ms when the prefix of the input that has been consumed is in L(Ms). Let w = xy, where x is in L(Ms) is the input consumed by M when it leaves the captive Ms. When M leaves the the captive Ms, it enters the start state of the captive Mt without consuming any input, and may only leave the captive Mt when the prefix of the remaining input, y, that has been consumed is in L(Mt). If M leaves the captive Mt without consuming all of y, then it must reject w since M cannot consume any input outside of the captive machines, Ms and Mt. Therefore, M will accept w iff there are x and y such that w = xy, x is in L(Ms) and y is in L(Mt), so L(M) = L(Ms).L(Mt). Since by the inductive hypothesis, L(Ms) = L(s) and L(Mt) = L(t), L(M) = L(r) = L(s.t) = L(s).L(t) and is the concatenation of two regular languages, which means that L(M) is also a regular language.
For r = s+t, let Q = Q_s X {s} U Q_t X {t} U {q_0, q_a}. Let d be defined by the following:
- d(q_0,e) = {(q_s,s),(q_t,t)},
- d(q,a) = d_s(q,a) for all q in (Q_s-A_s) X {s} and a in S U {e},
- d(q,a) = d_t(q,a) for all q in (Q_t-A_t) X {t} and a in S U {e},
- d(q,a) = d_s(q,a) for q in A_s X {s} and a in S,
- d(q,a) = d_t(q,a) for q in A_t X {t} and a in S,
- d(q,e) = d_s(q,e) U {q_a} for q in A_s X {s},
- d(q,e) = d_t(q,e) U {q_a} for q in A_t X {t},
Let the input to M be w in S*. M must move either to the start state of the captive Ms or to the start state of the captive Mt before consuming input. Once this move has been made, the only way to leave the captive machine is if the prefix of the input that has been consumed is in the language of the captive machine entered. If M leaves the captive machine without consuming all of w, then it must reject w since M cannot consume any input outside of the captive machines, Ms and Mt, and once M has left the captive machine's states, M can never reenter either captive machine. Therefore, M will accept w iff w is in L(Ms) or w is in L(Mt), so L(M) = L(Ms) U L(Mt). Since by the inductive hypothesis, L(Ms) = L(s) and L(Mt) = L(t), L(M) = L(r) = L(s+t) = L(s) U L(t) and is the union of two regular languages, which means that L(M) is also a regular language.
For r = s*, let Q = Q_s X {s} U {q_0, q_a}. Let d be defined by the following:
- d(q_0,e) = {(q_s,s)},
- d(q,a) = d_s(q,a) for all q in (Q_s-A_s) X {s} and a in S U {e},
- d(q,a) = d_s(q,a) for q in A_s X {s} and a in S,
- d(q,e) = d_s(q,e) U {(q_s,s), q_a} for q in A_s X {s},
Let the input to M be w in S*. M must first move either to the start state of the captive Ms or to the accepting state before consuming input. The latter move allows it to accept L^0 = {e}, while the former allows it to accept L^k for any k>0. Once this initial move has been made, the only transitions that may be taken that are not defined by captive machine are the two e-transitions from the accepting state of the captive Ms. The accepting state of the captive Ms is only reached the first time if the prefix of the input that has been consumed is in the language of the captive Ms. At this point, M may either e-transition to the accepting state and halt (rejecting there is any input left or accepting if it has all been consumed) or M may e-transition (call this the restart transition) to the start state of the captive Ms and continue to consume input. This first pass allows M to accept L, while k repetitions of the computation of Ms from its start state after leaving the final state of Ms by takgin the restart transition k-1 times allows M to accept L^k. On subsequent visits to the accepting state of the captive Ms, the prefix of the input that has been consumed must be in L(Ms)*. If M leaves the captive machine without consuming all of w, then it must reject w since M cannot consume any input outside of the captive machine, and once M has left the captive machine's states, M can never reenter the captive machine. Therefore, M will accept w iff w is in L(Ms)*, so L(M) = L(Ms)*. Since by the inductive hypothesis, L(Ms) = L(s), L(M) = L(r) = L(s*) = L(s)* and is the Kleene closure of a regular language, which means that L(M) is also a regular language.

QED

The following figures show the construction pictorially.

Theorem - Kleene's Theorem, Part II

For any FSA, M, L(M) is a regular language.

Proof

This proof is slightly tricky, since the natural inclination is to induct on the number of states in M, which turns out to be somewhat complicated (you have somehow to show that you can build a machine with N states out of machines with fewer states - think about it). Instead, it is easier to consider an arbitrary machine and consider its behavior.

Let M = be an arbitrary FSA (note that we number the states in Q starting with 1, so q_1 will serve as our start state). We are interested in L(M) = {w in S* | d*(q_1,w) is in A}. Since {w in S* | d*(q_1,w) is in A} = U {w | d*(q_1,w) = p} [p in A], and A is finite (Q is finite and A is contained in Q), we need only show that for all p in A, {w | d*(q_1,w) = p} is regular, then appeal to the theorem that a finite union of regular sets is also regular. (Here, we use the square bracket notation suggested by Knuth to indicate that this is a union over all p such that p is in A of the set described by the set former.) These sets are special cases of L_j(M)={w | d*(q_1,w) = q_j} for any q_j in Q. Clearly, the starting point can also be parameterized, to make these sets special cases of a more general set of languages, L_{i,j}(M) = {w in S*| d*(q_i,w) = q_j}, for q_i, q_j in Q. Then L_j(M) = L_{q_1,q_j}(M). Finally, an even more specific set of languages is defined by L_{i,j}^{k}(M) = {w in S*| d*(q_i,w) = q_j and for all x, y in S* such that |x|,|y|>0 and w = xy, d*(q_i,x)=q_j' => j'<=i}, for i,j,k <= |Q|. These can be considered languages that correspond to computational paths in M that take M from one state, q_i, to another, q_j, without going through any state numbered higher than the parameter k. Note that either or both of i and j may be greater than k, and that L_{1,j}{|Q|}(M) = L_{q_1,q_j}(M) = L_j(M).

Why in the world have we gone to such trouble? The approach is similar to that of Warshall's algorithm for finding the all pairs of shortest paths in a graph, using dynamic programming. Here, instead, we will induct on the bound on the maximum state index of the computational path taken by M on its input to show that the languages corresponding to these (restricted) computational paths are all regular.

Base Case

L_{i,j}^{0} = {a in S | d(q_i,a) = q_j} if i <> j, or L_{i,j}^{0} = {a in S | d(q_i,a) = q_j} U {e} if i = j. In either case, L_{i,j}^{0} is finite, and so is regular. (Notice that we do not have to deal with repeated characters, since there is no state with index 0, so the path must be degenerate, i.e., have no intermediate states at all).

Inductive Hypothesis

For some k, 0 <= k < |Q|, L_{i,j}^{k} is regular for all i,j, 0 < i,j <= |Q|.

Inductive Step

Consider L_{i,j}^{k+1}. We claim that L_{i,j}^{k+1} = L_{i,j}^{k} U L_{i,k+1}^{k}.L_{k+1,k+1}^{k}*.L_{k+1,j}^{k}. To see this, notice that the first term covers the case in which the computational path never passes through a state numbered higher than k-1 at all, and the second term covers the case in which the computational path does pass through state q_k+1 at least once. If the computational path does pass through state q_k+1 at least once, then the path must consist of an initial part that starts in state q_i and leaves M in state q_k+1 for the first time, a middle part where M leaves and returns to state q_k+1 zero or more times, and a final part where M leaves state q_k+1 for the last time and ends up in state q_j. Each of these parts corresponds to a part of the input string w, so w = xyz where x is in L_{i,k+1}^{k} and z is in L_{k+1,j}^{k}. Suppose y takes M from state q_k+1 back to state q_k+1 m times, m >= 0. Then y = y_1.y_2....y_m, where for each y_i, y_i is in L_{k+1,k+1}^{k}. Therefore y is in L_{k+1,k+1}^{k}*, and w is in L_{i,k+1}^{k}.L_{k+1,k+1}^{k}*.L_{k+1,j}^{k}. By the inductive hypothesis, each of the component languages is regular, so the Kleene closure of the middle one is regular, and the concatenation of the three parts is also regular.

QED

The following figures illustrate the computational paths graphically, the first two doing so in general, and the third giving a concrete example.

Definition - Indistinguishable with respect to L relation

For L in RL(S), let I_L = { (x,y) in S^2 | x and y are indistinguishable with respect to L}.

Lemma: I_L is an equivalence relation on S*

For L in RL(S), I_L is an equivalence relation on S*.

Proof

We need to show that I_L is reflexive, symmetric, and transitive. Clearly, for any x, y in S*, x I_L x and x I_L y implies y I_L x, so the first two properties are easily shown. For transitivity, let x, y, z in S* be such that x I_L y and y I_L z. Then for any w in S*, xw in L <=> yw in L, and yw in L <=> zw in L. Therefore, xw in L <=> zw in L, and we are done.

QED

This indistinguishable relation, as an equivalence relation, allows us to partition all the strings of S* into the equivalence classes I_L defines. The equivalence class for a string x will be denoted [x]_I_L. The strings in an equivalence class may be thought of as prefixes of words that may or may not be in L. For any two strings in an equivalence class, for any suffix, either the strings formed by concatenating each prefix with the suffix are both in L or both not in L, by the definition of I_L and indistinguishability. Thus, if there is an FSA that accepts L, it need have only one state for each equivalence class, since it must behave the same for every string that is in that equivalence class on all suffixes.

Definition: Maximal Pairwise Distinguishable Sets

A subset X contained in S* is a maximal PS set with respect to a language L iff for any y in S*, there is an x in X such that x and y are indistinguishable with respect to L.

Lemma: Maximal PD sets related to I_L

A set X is a maximal PD set with respect to L iff for each x in S*, X ^ [x]_I_L <> 0.

Proof

Easy.

Lemma: I_L is right invariant

I_L is right invariant with respect to concatenation, i.e., if [x]_I_L = [y]_I_L, then for any a in S, [xa]_I_L = [ya]_I_L.

Proof

Easy.

Theorem: Minimal FSA for L

Let L be a language with a finite maximal PD set with respect to L, X. Then there is an FSA, M = , such that |Q| = |X|, and L(M) = L. Further, no other FSA recognizing L has fewer states than M.

Proof

Construction of M. Let L be a language with a finite maximal PD set with respect to L, and let X be that maximal PD set. Define M = where Q = { [x] | x is in S*}, q_0 = [e], d([x],a) = [xa], and A = { [x] | x is in L }. For X to be maximal, for any w in S*, [w] ^ X <> 0, which implies that Q is finite and |Q| = |X| (by the lemma above). Further, from the theorem on the number of states required for any FSA recognizing L, M has the minimum number of states possible. Since I_L is right invariant with respect to concatenation, the transition function is consistent.

Claim: x in L(M) => x in L.

Let x be in L(M) => d*(q_0,x) is in A = { [w] | w is in L }. But d*(q_0,x) = d*([e],x) = d*([x],e) = [x]. Therefore [x] in A => x is in L, so L(M) is contained in L.

Claim: x in L => x in L(M).

Let x be in L. Then d*(q_0,x) = d*([e],x) = [x], as above. If x is in L then [x] is in A, so x is in L(M) and L is contained in L(M).

QED

This next theorem gives us both a nice closure to all of the preceding bits and pieces, and a way to determine whether a language is regular or not.

Theorem: Myhill-Nerode

A language L is in LR iff the number of equivalence classes in S* induced by I_L is finite. L is nonregular iff there is an infinite subset of S* such that any two elements in the subset are indistinguishable with respect to L.

Proof

Sketch: M in FSA need only have as many states as there are equivalence classes induced by I_L, as shown in the theorem above. The construction of the FSA shows by theorem ?.? that L is regular.

Conversely, if a maximal PD set for L is infinite, hence the number of equivalence classes in S* induced by I_L is infinite, then by theorem ?.? the minimum number of states for any FSA recognizing L is infinite, so there is no finite state machine that recognizes L. Thus L is not regular.

QED

We now have several ways to show that a language is regular: we can give a regular expression for it, give an FSA (or NFSA-e, etc.) for it, or we can show that the number of equivalence classes in S* induced by I_L is finite and appeal to the Myhill-Nerode Theorem. While the Myhill-Nerode Theorem also allows us to show that a language is not regular by demonstrating an infinite PD set for L, it is handy to have other methods to show that a language is not regular.

Theorem: Pumping Lemma

The Pumping Lemma for Regular Languages should NEVER be used to attempt to show that a language is regular. You can think of it as an adversarial arguement in which a demon claims a language to be regular, and you ask for N. The demon gives you N, and you provide a string w in L with |w| >= N, and demand that the demon break w into u, v and x as in the lemma. When the demon gives you satisfactory u, v, and x, you then show that there is an i for which u.v^i.x is not in L, so the language is not regular and the demon is vanquished, snarling, into the darkness. Of course, since you have no adversary, you will need to show that no matter what the hypothetical demon's choice of N is, you can pick a w such that no matter what the hypothetical demon's choice of u, v, and x are, you can always pick an i such that u.v^i.x is not in L.

Theorem: Ogden's Lemma

Last modified 1/30/95. Send comments to nemo@cis.ufl.edu