Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 [ 57 ] 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

Compact Notation for Productions

It is convenient to think of a production as laelonging to tlie variable of its head. We shall often use remarks like the productions for A or Л-prodnctions to refer to the productions whose head is variable A. We may write the productions for a grammar by listing each variable once, and then listing all the bodies of the productions for that variable, separated by vertical bars. That is. the producticms A a], A a,..., A a can be replaced by the notation Л aj jal lOp. For instance, the grammar for palindromes from Fig. 5.1 can be written as P e 0 1 OFO IPl.

that if we take any expression and put matching parenth(;Kes around it, the result is also an expression.

Rules (5) through (10) describe identifiers /. The basis is rules (5) and (6); they say that n and b are identifiers. The remaining four rules are the inductive case. They say that if we have any identifier, we can follow it by a, b, 0, or 1, and the result will be another identifier, □

5.1.3 Derivations Using a Grammar

We apply the productions of a CFG to infer that certain strings are in the language of a certain variable. There are two approaches to this inference. The more conventional approach is to use the rules from body to head. That is, we take strings known to be in the language of each of the variables of the body, concatenate them, in the proper order, with any terminals ajpearing in the body, and infer that the resulting string is in the language of the \-ariable in the head. We shall refer to this procedure as recursive inference.

There is another approach to defining the language of a grarrnnar, in which we use the productions from head to body. We expand tlie start symbol using one of its productions (i,e using a production whose head is the start symbol). We further expand the resulting string by replacing one of the variables by tlie body of one of its productions, and so on, until we derive a string ctmsisting entirely of terminals. The language of the grammar is aU strings of terminals that we can obtain in this way. This use of grannnars is called derivation.

We sliall begin with an example of the first approach - recursive inference. However, it is often more natural to think of grammars эл used in derivations, and we shall next develop the notation for describing these derivations.

Example 5.4: Let us consider some of the inferences we can make using the grammar for expressions in Fig. 5.2. Figure 5.3 summarizes these inferences. For example, fine {i) says that we can infer string a is in the language for I by using production 5. Lines {ii) through {iv) say we can infer that feOQ



String

For lang-

Production

String(s)

Inferred

uage of

used

used

(ii)

{iii)

iiv)

{in)

{iv)

[vii]

a + bOO

{vl{vi)

{viii)

(a + 600)

(vii)

(;ix)

и * (a + 600)

{v), {viii)

Figure 5.3: Inferring strings using the grammar of Fig. 5.2

Lines (v) and {vi) exploit production 1 to infer that, since any identifier is an expression, the strings a and 600, which we inferred in lines (i) and {iv) to be identifiers, are also in the language of variable E. Line {vii) uses production 2 to infer that the sum of these identifiers is an expression; line (viii) uses production 4 to infer that the same string with i)arontheses around it is also an expression, and line {ix) uses production 3 to multiply the identifier a by the expression we had discovered in \mc {viii). □

The process of deriving strings by applying productions from bead to body requires the definition of a new relation symbol . Suppose G - (У, T, P, S) is a CFG. Let aA8 be a string of terminals and variables, with A a variable. That is, a and 8 are strings in (V U T), and .4 is in V. Let Л 7 be a production of G. Then we say a Ad => a-yp. If G is understood, we just say aA3 ay 8.

Notice that one derivation step replaces any variable anywhere in the string by the body of one of its productions.

We may extend the relationship to repres(;nt zero, one, or many derivation steps, much as the transition function 5 of a finite autonmton was extended to 6. For derivations, we use a * to denote zero or more steps, as follows:

BASIS: For any string a of terminals and variables, we say a 4> Q-. That is, any string derives itself.

INDUCTION: If a 8 and 7, then a 7. That is, if a can become /3

О G a

by zero or more steps, and one more step takes 8 to 7, then a can become 7.

Put another way, a => 8 means that there is a sequence of strings 71,73, -. , 7 ,

for some n> 1. such that

1, a = 71,

is an identifier by using production 6 once (to get the 6) and then applying production 9 twice (to attach the two Os).



5.1.4 Leftmost and Rightmost Derivations

In order to restrict the number of choices we have in d(niving a string, it is often useful to require that at each step we replace the leftmost variable; by one of its production bodies. Such a derivation is called a laftmost derivation, and we indicate that a derivation is leftmost by using the relations => and => , for

Jm Im

one or many steps, respectively. If the grammar G that is being used is not obvious, we can place the narne G beloM the arrow in either of these symbols.

Similarly, it is possible to require that at each step the rightmost variable is replaced by one of its bodies. If so, we call the derivation rightmost and use

2. .в ~ On- and

3. For i = 1, 2 .., тг - 1, we have 7,- 7i+i.

If grammar G is understood, then we use h- in place of .

Example 5-5 ; The inference that a * (a -I- 600) is in the language of variable E can be reflected in a derivation of that string, starting w-ith the string E. Here is one such derition:

EE*El*Eai<E

о * (E) * (£ + F) a * (/ + E) a * (o + E)

о * (a + 7) a * (o + /0) о * (a + 700) a * (a + 600)

At the first step, E is replaced by the body of production 3 (from Fig. 5.2). At the second step, production 1 is used to replace the first E by 7, and so on. Notice that we have systematicaUy adopted the policy of always replacing the leftmost variable in the string. Hoover, at each step we may choose which variable to replace, and we can use any of the productions for that rariable. For instance, at the second step, we could have replaced the second E by (£), using production 4. In that case, we would say E*E E*{E). We could also have chosen to make a replacement that would fail to lead to the same string of terminals. A simple example would be if we used production 2 at the first step, and said E E E. No replacements for the two Es could ever turn £ + Einto А*(н + №0).

We can use the => relationship to condense the derivation. We know E E by the basis. Repeated use of the inductive part gives us E => E*E, E I*E, and so on, until finally E u * (a + 600).

The two viewpoints - recursive inference and derivation are equivalent. That is, a string of terminals w is inferred to be in the language of some variable A if and only if A w. However, the proof of this fact requires some work, and we leave it to Section 5.2. □



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 [ 57 ] 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175