Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 [ 93 ] 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175


Figure 7.8: A parse tree in G begins with a parse tree in G and finishes with many parse trees, each in one of the grammars Go

Now, we must prove that this construction works, in the sense that G generates the language s(L). Formally:

A string w is in L(G) if and only if w is in s{L).

(If) Suppose w is in s{L). Then there is some string a: - 102 п in L, and strings Xi in. s{ai) for г = l,2,...,n, such that w = x-iX2--Xn- Then the portion of G that comes from the productions of G with 5д substituted for each a wiU generate a string that looks Uke x, but with Sa in place of each a. This string is Sa Sa Sa This part of the derivation of w is suggested by the upper triangle in Fig. 7,8.

Since the productions of each Ga are also productions of G, the derivation of Xi from So; is also a derivation in G. The parse trees for these derivations are suggested by the lower triangles in Fig. 7.8. Since the yield of this parse tree of G is xiX2 --Xn = iv, we conclude that w is in X(G),

V is the union of V and all the Vas for a in S.

T is the union of all the Ts for a in S.

P consists of:

1. All productions in any Pa, for a in S.

2. The productions of P, but with each terminal a in their bodies replaced by Sa everywhere a occurs.

Thus, all parse trees in grammar G start out like paise trees in G, but instead of generating a yield in S*, there is a frontier in the tree where all nodes have labels that are 5a for some a in S. Then, dangling from each such node is a parse tree of Ga, whose yield is a terminal string that is in the language s(a). The typical parse tree is suggested in Fig. 7.8.



(0nly4f) Now suppose w is in L(G). We claim that the parse tree for w must look hke the tree of Fig. 7.8. The reason is that the variables of each of the grammars G and G for a in £ are disjoint. Thus, the top of the tree, starting from triable S, must use only productions of G until some symbol Sa is derived, and below that Sa only productions of grammar Ga may be used. As a result, wheneлer w has a parse tree T, we can identify a string aio On in L{G), and strings in langnage з{щ}, such that

1. w = X1X2 Xn, and

2. The string SaS * is the yield of a tree that is formed from T by deleting some subtrees (as suggested by Fig. 7.8).

But the string Х]Х2-- is in s(L), since it is formed by substituting strings Xi for each of the os. Thus, we conclude w is in s{L). □

7.3.2 Applications of the Substitution Theorem

There are several familiar closure properties, which we studied for regular languages, that we can show for CFLs using Theorem 7.23. We shall list them all in one theorem.

Theorem 7.24; The context-free languages are closed under the following operations:

1. Union.

2. Concatenation.

3. Closure (*), and positive closure (+).

4. Homomorphism.

PROOF: Each requires only that we set up the proper substitution. The proofs below each involve substitution of context-free languages into other context-free languages, and therefore produce CFLs by Theorem 7.23.

1. Union: Let Li and L2 be CFLs. Then Li U L-2 is the language s(L), where L is the language {1,2}, and s is the substitution defined by s(l) = Li and s(2) = L2-

2. Concatenation: Again let Li and L2 be CFLs. Then L1L2 is the language s{L), where L is the language {12}, and s is the same substitution as in case (1).

3. Closure and positive closure: If Li is a CFL, L is the language {1}*, and s is the substitution s(l) = Li, then L* = s(L). Similarly, if L is instead the language {1} , then L1[ - ${L).



4, Suppose L is a CFL over alphabet S, and ft is a homomorphism on S, Let s be the substitution that replaces each symbol a in S by the Icmguage consisting of the one string that is h(a). That is, s{a) = {h{a}}, for all a in E. Then h(L) = s{L).

7-3.3 Reversal

The CFLs are also closed under reversal. We cannot use the substitution theorem, but there is a simple construction using grammars.

Theorem T.25: If L is a CFL, then so is L.

PROOF: Let L = L{G) for some CFL G = {V,T,P,S). Construct G = {V,T,P,S), where P is the reverse of each production in P. That is, if A a is a production of G, then A -J- is a production of G. It is an easy induction on the lengths of derivations in G and G to show that L{G] = L. Essentially, all the sentential forms of G are reverses of sentential forms of G, and vice-versa. We leave the formal proof as an exercise. □

7.3.4 Intersection With a Regular Language

The CFLs are not closed under intersection. Here is a simple example that proves they are not.

Example 7.26: We learned in Example 7.19 that the language

L{0 r2 n>l}

is not a context-free language. However, the following two languages are context-free:

Li = {0 1 2 I п>1,г>1} La = {01 2 I n > l,i > 1}

A grammar for Li is:

A -+ OAl I 01

B2B \2

In this grammar, A generates all strings of the form 0 1 , and В generates all strings of 2s. A grammar for La is:

SAB Л -> 0Л I 0 5 152 1 12



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 [ 93 ] 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175