Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 [ 58 ] 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

Notation for CFG Derivations

There are a number of conventions in common use that help us remember the role of the symbols we use when discussing CPGs. Here are the conventions we shall use:

1. Lower-case letters near the beginning of the alphabet, o, b, and so on, are terminal symbols. We shall also assmne that digits and other characters such as + or parentheses are terminals.

2. Upper-ease letters near the beginning of the ali)habet, A, D, and so on, are variables.

3. Lower-case letters near the end of the alphabet, such as lu or z, arc strings of toruunals. This con\ention reminds us that the terminals are analogous to the input symbols of an automaton.

4. Upper-case letters neai- the end of the alphabet, such as Л or i, are either terminals or variables.

5. Lower-case Greek letters, such as q and .3, are strings consisting of terminals and/or variables.

There is no special notation for strings that consist of variables only, since this concept plays no important role. However, a string named a or another Greek letter might happen to have only variables.

the symbols and to indicate one or many rightmost derivation steps,

rm rni

respectively. Again, the name of the grammar may appear below these symbols if it is not clcai- which grammar is being used.

Example 5.6 : The derivation of Example 5.5 was actually a leftmost derivation. Thus, we can describe the same derivaticui by:

E E*E I* E a*E

(m hn ill! Im

a*{E) а*{Е-\-Е) a*{I + E) a*{a + E)

Im im i im

a*(a-\-1) a* [a+ 10) a* {a + /00) = a * (я + 600)

hn Im hn

Wc can also summarize th<; leftmost derivation by saying E a* (a-(-600), or

express several steps of the derivation by expressions such as E * £ a* (E).



There is a rightmost derivation that uses the same replacements for each variable, although it makes the replacements in different order. This rightmost derivation is:

E*E E*{E) E*(E + E)=>

rm rm rm rm

E*[E + I) E*(E + IQ) E*(E + m) E*{E + b()0)

rm rm rm rm

E * (/ + 600) E*ia + 600) 7 * (a + 600) a*{a + 600)

rm Tin rm

This derivation allows us to conclude E U a* {a + 600), □

rm

Any derivation has an equivalent leftmost and an equivalent rightmost derivation. That is, if w is a terminal string, and A a variable, then A = w if and only if ,4 w, and A w if and only if Л w. We shall also prove these

1ш rm

claims ui Section 5,2,

5.1.5 The Language of a Grammar

If G(y, T, P, S) is a CFG, the language of G, denoted L{G), is the set of termuial strings that have derivations from the start symbol. That is,

L(G) = {w in T*\S w)

If a language; L is the language of some context-free grammar, then L is said to be a context-frte language, or CFL. For instance, we asserted that the grammai-of Fig, 5.1 defined the language of palindromes over alphabet {0,1}. Thus, the set of palindromes is a context-free language. We can prove that statement, as follows.

Theorem 5.7: L{Gpai), where Gpat is the grammar of Example 5.1, is the set of palindromes over {U, 1}.

PROOF: Wc shall prove that a string w; in (0,1}* is in L{Gpai) if and only if it is a palindrome; i.e., w = ги.

(If) Suppose !/; is a palindrome. We show by induction on that w is in

L{Gpal}-

BASIS: We use lengths 0 and 1 as the basis. If \v)\ = 0 or \ir\ - 1, then w is t, 0, or 1. Since there are productions P -> f, Г 0, and P -> 1, we conclude that P w in any of these basis cases.

INDUCTION: Suppose > 2. Since w = w must begin and end with the same symbol That is, го = 0x0 or ги - 1:1. Moreover, x must be a palindrome; that is, X ~ x. Note that we; need the fact that \w\ > 2 to infer that there are two distinct Os or Гя, at either end of w.



5.1.6 Sentential Forms

Derirations from the start symbol produce strings that have a special role. We call these s(;ntential forms. That is, if G = {V, T, P, S) is a CFG, then any string Q ill (V и Г)* such that S => a is a sentential form. If 5 a, then

a is a left-sentential form, and if 5 => a, then a is a right-sentential form.

Note that the language L{G) is those sentential forms that are in T; i.e., they consist solely of terminals.

Example 5.8: Consider the grammar for expressions from Fig. 5.2. For example, E* (I + E) isd sentential form, since there is a derivation

E=>E*EE*iE)E*{E + E)E*{I + E)

However this deriration is neither leftmost nor rightmost, since at the last step, the middle E is replaced.

As an example of a loft-sentential form, consider a* E, with the leftmost derivation

If v! = Ox\), then we invoke the indnctive hypothesis to claim that P x. Then tliore is a derivation of w from P, namely P => QPO 0x0 = w. If w = 111, the argument is the same, but we use the production P IFl at tlie first step. In either ca.se, we conclude that w is in LiGpai) and complete the proof.

(Only-if) Now, we assume that w is in L{Gpai); that is, P w. We must conclude that is a palindrome. The proof is an induction on the number of steps in a derivation of w from P.

BASIS: If the derivation is one step, then it must use one of the three productions that do not have P in the body. That is, the d<>rivation is P e, P => 0, or P => 1. Since 6, 0, and 1 are all palindrom<!s, the biisis is proven.

INDUCTION: Now, suppose that the derivation takes 7i-\-l steps, where n > 1, and the statement is true for all derirations of n steps. That is, if P i in n steps, then x is a palindrome.

Consider an {n H- l)-step derivation of w. which must be of the form

POPO 0x0 = iv

or P => IPl 1x1 = w, since n + 1 steps is at least two steps, and the producticms P -> OPO and P IPl are the only productions whose use allows additional steps of a derivation. Note that in either case, P x mn steps.

By the inductive hypothesis, we know that x is a palindrome; that is, ar = x. But if so, then 0x0 and 1x1 are also palindromes. For instance, (0x0) = 0x*0 = 0x0, We conclude that w is a palindrome, which completes the proof. □



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 [ 58 ] 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175