Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 [ 88 ] 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

->

\TC2\LCi\a\b\ IA \IB\IZ\IO

->

TC2\

LCi\a\b\ IA\ IB \ IZ \ 10

LCz 1

a\b\IA\IB\IZ\IO

\ IA \ IB \ IZ \ 10

Figure 7.3: Making all bodies either a single terminal or two riables

Theorem 7.16: If G is a CFG whose language contains at least one string other than e, then there is a grammar G\ in Chomsky Normal Form, such that L(Gi)-L(G)-H-

PROOF: By Theorem 7.14, we can find CFG G2 such that L{G-2) L{G) - {e}, and such that G2 has no useless symbols, e-productions, or unit productions. The construction that converts G2 to CNF grammar Gi changes the productions in such a way that each production of G] can be simulated by one or more productions of G- Conversely, the introduced variables of G2 each have only one production, so they can only be used in the manner intended. More formally, we ргол-е that w is in L{G2) if and only if w is in L{G\).

Now, all productions are in Chomsky Normal Form except for those with the bodies of length 3; EFT, TMF, and LER. Some of these bodies appear in more than one production, but we can deal Mdth each body once, introducing one extra triable for each. For EFT, we introduce new rariable Ci, and replace the one production, E -> EFT, where it appears, by E -л EC\ and Ci FT.

For TMF we introduce new variable Сг- The two productions that use this body, E -) TMF and T -> TMF, are replaced by F -> TGi, T TC2, and C-2 -J- MF. Then, for LER we introduce new variable C3 and replace the three productions that use it, E LER, T -) LER, and F LER, by E LC-i, T LC-i, F LCz, and C3 - ER. The final grammar, which is in CNF, is shown in Fig. 7.3. □



7.1-6 Exercises for Section 7.1 Exercise 7.1,1: Find a grammar equivalent to

S AB\CA

В -> ВС \AB

С -+ аВ\Ь

with no useless symbols.

(Only-if) If w has a derivation in G3, it is easy to replace each production used, say A X\X2 - -X/, by a sequence of productions of Gi. That is, one step in the derivation in G2 becomes one or more steps in the derivation of w using the productions of Gi. First, if any X is a terminal, we know G] has a corresponding variable Bi and a production Bi Xi. Then, if A; > 2, Gi has productions A BCi, C\ -л B2C2, and so on, where Д- is either the introduced variable for terminal X or Xi itself, if Xi is a variable. These productions simulate in Gi one step of a derivation of G2 that uses A -) X1X2 Xk. We conclude that there is a derivation of w in Gi, so ш is in L(Gi).

(If) Suppose w is in L{Gi). Then there is a parse tree in Gi, with S at the root and yield w. We convert this tree to a parse tree of G2 that also has root S and yield w.

First, we undo part (b) of the CNF construction. That is, suppose there is a node labeled A, with two children labeled Bi and Ci, where Gi is one of the variables introduced in part (b). Then this portion of the parse tree must look like Fig. 7.4(a). That is, because these introduced variables each have only one production, there is only one way that they can appear, and all the variables introduced to handle the production A -Bjt must appear together,

as shown.

Any such cluster of nodes in the parse tree may be replaced by the production that they represent. The parse-tree transformation is suggested by Fig. 7.4(b).

The resulting parse tree is still not necessarily a parse tree of G2. The reason is that step (a) in the CNF construction introduced other variables that derive single terminals. However, we can identify these hi the current parse tree and replace a node labeled by such a variable A and its one cliild labeled a, by a single node labeled c. Now, every interior node of the parse tree forms a production of G2. Since w is the yield of a parse tree in G2, we conclude that w is in L(G2). П





1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 [ 88 ] 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175