Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 [ 86 ] 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

INDUCTION; Suppose the derivation takes n > 1 steps. Then the derition looks like A X1X2 X => w. The first production used must come from

Gi Gi

a production A -> YiY2---Ym, where the Ys are the Xs, in order, with zero or more additional, nullable variables interspersed. Also, we can break w into 1012 - Wk, where Xi => Wi for г = l,2,...,fc. If Xi is a terminal, then

Wi - Xj, and if Xj is a variable, then the derivation Xj Wi takes fewer than n steps. By the inductive hypothesis, we can conclude X,- Wi.

Now, we construct a corresponding derivation in G as follows: A => YiY2---Yjn=> X1X2 Xfc W1W2 Wk -w

G G G

The first step is application of the production A -> У] У2 1, that we know exists in G. The next group of steps represents the deriration of e from each of the Yjs that is not one of the Xs. The final group of steps represents the derivations of the Wis from the Xjs, which we know exist by the inductive hypothesis.

(If) Suppose A w and w Ф We show by induction on the length n of the derivation, that A w.

BASIS: One step. Then A lu is a production of G. Since w ф c, this production is also a production of Gi, and A=> w.

INDUCTION: Suppose the derivation takes n > 1 steps. Then the derivation

looks like A Y\Y2- - Ym w. We can break w = w-wz-- -w., such that G G

Yi Wi ioT г - 1,2,...,m. Let Xi,X,...,Xjt be those of the 1s, in order,

such that Wj Ф e. We must have k>l, since w ф t. Thus, A X1X2 Xjt

is a production of Gi.

We claim that X1X2 X w, since the only У}з that are not present

among the Xs were used to derive e, and thus do not contribute to the derivation of w. Since each of the derivations Yj Wj takes fewer than n steps, we

may apply the inductive hypothesis and conclude that, if Wj ф e, then Yj Wj.

Thus, A X1X2 Xjfe w.

Gi Gi

Now, we complete the proof as follows. We know w is in L{Gi) if and only if S w. Letting A = 5 in the above, we know that w is in L{Gi) if and only

if S => WJ and w Ф e. That is, w is in L{G\) if and only if w is in L{G) and wфe. и

7.1.4 Eliminating Unit Productions

A unit production is a production of the form A B, where both A and В are variables. These productions can be useful. For instance, in Example 5,27, we



saw how using unit productions E T and T -> F allowed us to create an unambiguous grammar for simple arithmetic expressions;

/ a\b\Ia\Ib\m\I\ F I\{E)

T F I T*F

E T\E+T

However, unit productions can complicate certain proofs, and they also introduce extra steps into derivations that technically need not be there. For instance, we could expand the T in production E -> T in both possible ways, replacing it by the two productions E F \ T * F. That change still doesnt eliminate unit productions, because we have introduced unit production E -> F that was not previously part of the grammar. F\irther expanding E F by the two productions for F gives us F - / (F) T * F. We still have a unit production; it is F -> /. But if we further expand this / in all six possible ways, we get

F a I 6 I /a I /6 I /0 I Л I (F) I F * F

Now the unit production for E is gone. Note that £ -> я is not a unit production, since the lone symbol in the body is a terminal, rather than a variable as is required for unit productions.

The technique suggested above - expand unit productions until they disappear - often works. However, it can fail if there is a cycle of unit productions, such as A B, В C, and С A. The technique that is guaranteed to work involves first finding aU those pairs of variables A and В such that Л => F using a sequence of unit productions only. Note that it is possible for Л => В to be true even though no unit productions are involved. For Instance, we might have productions A -> ВС and С -> e.

Once we have determined all such pairs, we can replace any sequence of deriration steps in which .4 Fi F2 F q by a production that uses the nonunit production F -> a directly irom A; that is, A a. To begin, here is the inductive con.4truction of the pairs {A, B) such that A В using only unit productions. Call such a pair a unit pair.

BASIS: (Л, A) is a unit pair for any variable A. That is, Л A by zero steps.

INDUCTION: Suppose we have determined that (A,B) is a unit pair, and F -> С is a production, where С is a variable. Then {A,C) is a unit pair.

Example 7.10: Consider the expression grammar of Example 5.27, which we reproduced above. The basis gives us the unit pairs {E,E), (T, T), (F, F). and (/,/). For the inductive step, we can make the following inferences:

1. (F,F) and the production E T gives us unit pair {E,T).

2. (F, T) and the production T F gives us unit pair (F, F).

3. {F, F) and the production F I gives us unit pair {E,I).



4. (T,T) and the production T F gives us unit pair (T,F).

5. (T,F) and the production F I gives us unit pair (T,/).

6. (F,F) and the production F I gives us unit pair (F,/).

There are no more pairs that can be inferred, and in fact these ten pairs represent all the derivations that use nothing but unit productions. □

The pattern of development should by now be familiar. There is an easy proof that our proposed algorithm does get all the pairs we want. We then use the knowledge of those pairs to remove unit productions from a grammar and show that the language of the two grammars is the same.

Theorem 7.11: The algoritlrm above finds exactly the urut pairs for a CFG G.

PROOF: In one direction, it is an easy induction on the order in which the pairs

are discovered, that if (A,B) is found to be a unit pair, then Л В using

only unit productions. We leave this part of the proof to you.

In the other direction, suppose that A В using unit productions only.

We can show by induction on the length of the derivation that the pair (Л, B) will be found.

BASIS: Zero steps. Then A = B, and the pair {A, B) is added in the basis.

INDUCTION: Suppose A В using n steps, for some n > 0, each step being the application of a urut production. Then the derivation looks like

A CB

The derivation A С takes n - 1 steps, so by the inductive hypothesis, we discover the pair (Л,С). Then the inductive part of the algorithm combines the pair (A,C) with the production С В to infer the pair (Л, B). □

To eliminate unit productions, we proceed as foUows. Given a CFG G - (V,r,F,5), construct CFG Gi - (V,T,Pi,Sy.

1. Find all the unit pairs of G.

2, For each unit pair {A,B), add to Pi all the productions A a, where В a- is a nonunit production in P. Note that Л = В is possible; in that way, Pi contains all the nonunit productions in P.

Example 7.12: Let us continue with Example 7.10, which performed step (1) of the construction above for the expression grammar of Example 5.27. Figure 7.1 summarizes step (2) of the algorithm, where we create the new set of productions by using the first member of a pair as the head and all the nonunit bodies for the second member of the pair as the production bodies.

The final step is to eliminate the unit productions from the grammar of Fig. 7.1. The resulting grammar:



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 [ 86 ] 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175