Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 [ 92 ] 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

1, then we note that t, which is of length at least 3n/2, mnst end in 1 , because uwy ends in 1 . However, there is no block of n Vs except the final block, so t cannot repeat in uv;y.

3. If vwx is contained in the first block of Is, then the argument that uwy is not in L is like the second part of case (2).

4. Suppose vwx straddles the first block of Is and the second block of Os. If vx actually has no Os, then the argument is the same as if vwx were contained in the first block of Is. If vx has at least one 0, then uwy starts with a block of n Os, and so does t if uwy = tt. However, there is no other block of n Os in uwy for the second copy of t. We conclude in this case too, that uwy is not in L.

5. In the other cases, where vwx is in the second half of z, the argument is symmetric to the cases where vwx is contained in the first half of z.

Thus, in no case is uwy in i, and we conclude that L is not context-free. □ 7.2.4 Exercises for Section 7.2

Exercise 7.2.1: Use the CFL pumping lemma to show each of these languages not to be context-free:

* a) {aVc* \ i<j<k}.

b) {a b c \i<n}.

c) {0 I p is a prime}. Hint: Adapt the same ideas used in Example 4.3, which showed this language not to be regular.

*! d) (OF I 3 = г}.

! e) {a b c n < г < 2n}.

! f) {www I is a string of Os and Is}. That is, the set of strings consisting of some string w followed by the same string in reverse, and then the string w again, such as 001100001.

! Exercise 7.2.2: When we try to apply the pumping lemma to a CFL, the adversary wins, and we cannot complete the prooL Show what goes wrong when we choose L to be one of the following languages:

a) {00,11}.

b) {Ol I n > 1}.

* c) The set of palindromes over alphabet {0,1}.



! Exercise 7.2.3: There is a stronger version of the CFL pumping lemma known aa Ogdens lemma. It differs from the pumping lemma we proved by allowing us to focus on any n distinguished positions of a string z and guaranteeing that the strings to be pumped have between 1 and n distinguished positions. The advantage of this ability is that a language may have strings consisting of two parts, one of which can be pumped without producing strings not in the language, while the other does produce strings outside the language when pumped. Without being able to insist that the pumping take place in the latter part, we cannot complete a proof of non-context-freeness. The forma! statement of Ogdens lemrna is: If L is a CFL, then there is a constant n, such that if z is any string of length at least n in L, in which we select at least n positions to be distinguished, then we can write z = uvwxy, such that:

1. vwx has at most n distinguished positions.

2. vx has at least one distinguished position.

3. For all i, uvwxy is in L.

Prove Ogdens lemma. Hint: The proof is really the same as that of the pumping lemma of Theorem 7.18 if we pretend that the nondistinguished positions of z are not present as we select a long path in the parse tree for z.

* Exercise 7.2.4: Use Ogdens lemma (Exercise 7.2.3) to simplify the proof in Example 7.21 that L {ww ш is in {0,1}*} is not a CFL. Hint: With z = 0 1 0 1 , make the two middle blocks distinguished.

Exercise 7.2.5: Use Ogdens lemma (Exercise 7.2.3) to show the following languages are not CFLs:

! a) {OPO I jmax(i,k)}.

!! b) { fe c* I i Ф n}. Hint: If n is the constant for Ogdens lemma, consider the string z - b c .

7.3 Closure Properties of Context-Free Languages

We shall now consider some of the operations on context-free languages that are guaranteed to produce a CFL. Many of these closure properties will parallel the theorems we had for regular languages in Section 4.2. However, there are some differences.

First, we introduce an operation called substitution, in which we replace each symbol in the strings of one language by an entire language. This operation, a generalization of the homomorphism that we studied in Section 4.2.3, is useful In proving some other closure properties of CFLs, such as the regular-expression operations: union, concatenation, and closure. We show that CFLs are closed



under honQOmorphisrns and inverse homomorphisms. UnHke the regular languages, the CFLs are not closed under intersection or difference. However, the intersection or difference of a CFL and a regular language is always a CFL.

7.3.1 Substitutions

Let S be an alphabet, and suppose that for every symbol a in S, we choose a language La- These chosen languages can be over any alphabets, not necessarily I! and not necessarily the same. This choice of languages defines a fimction s (a substitution) on S, and we shall refer to La as s{a) for each symbol a.

li w ~ tiifli is a string in S*, then s(w) is the language of all strings xiX2 --Xn such that string Xj is in the language (а;), for г = 1,2,.. Put another way, ,4{w) is the concatenation of the languages s{ai)s(a2) s(an)-We can further extend the definition of s to apply to languages: s{L) is the union of s{w) for all strings w m L.

Example 7.22: Suppose 5(0) = {a b n > 1} and s(l) = {aa,bb}. That is, s is a substitution on alphabet S = {0, !} Language .s(0) is the set of strings with one or more as followed by an equal number of ts, while .s(l) is the finite language consisting of the two strings aa aud bb.

Let w - 01. Then s{w) is the concatenation of the languages s(0)s(l). To be exact, s{w) consists of all strings of the forms a 6 ял and where

n > 1.

Now, suppose L = L{0*). that is, the set of all strings of Os. Then s{L) = (5(0))*. This language is the set of all strings of the form

for some к >0 and any sequence of choices of positive integers п ,П2,... <Пк. It includes strings such as e, aabbaaabbb, and abaabbabab. П

Theorem 7.23: If L is a context-ftee language over alphabet 2, and 5 is a substitution on S such that s{a) is a CFL for each a in S, then 5(L) is a CFL.

PROOF: The essential idea is that we may take a CFG for L and replace each terminal a by the start symbol of a CFG for language s{a). The result is a single CFG that generates s(L). However, there are a few details that must be gotten right to make this idea work.

More formally. Start with grammars for eacji of the relevant languages, say G = (V,5:,P,5) for L and G = (l;, Г ,Р , S ) for each a in S. Since we can choose any names we wish for variables, let us make sure that the sets of variables are disjoint; that is, there is no symbol A that is in two or more of V and any of the Vas. The purpose of this choice of names is to make sure that when we combine the productions of the various grammars into one set of productions, we cannot get accidental mixing of the productions from two grammars and thus have derivatitms that do not resemble the derivations in any of the given grammars.

We construct a new grammar G = {V,T, P\ S) for s{L), as follows:



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 [ 92 ] 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175