Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 [ 31 ] 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

k-Vi

path is in the language of r]j

2. The path goes throngh state к at least once. Ihen we can break tlie path into several pieces, as suggested by Fig, 3,3. The first goes from state i to state к without passing through k, the last piece goes from к to j without pa,Hsing through k, and all the pieces in the middle go from к to itself, without passing through k. Xote that if the path goes through state к only once, tlien thtire are no middle pieces, just a path from i to к and a patli from к to j. The set of labels for all paths of this type is repres(!nted by the regular expnission Л\1~ииГУRf That is, the first ( xpression represents the part of th( path that gets to state к the first time, the second represents the portion that goes from к to itsc;lf, zero thnes, once, or more than once, and the third expression represents the part of tlie path that leaves к for the last time and goes to state j.

Zero or more stnngs m Л

Figurt! 3,3: A path frcnn i to j can be broken into segments at each point where it goes through state- к

Wh(m we comljine tin; expressions for the paths of the two types above, we have the expression

for the labels (jf all paths from state / to state j that go through no state higher than k. If we construct these expressions in order of increasing superscript, then since each r\ depends only on expressions with a smaller superscript, then all expression,s are available when we n<;od tliem.

Eventually, we have for all i and j. We may assume that state 1 is the start state, although the accepting states could be any set of the states. The regular expression for the language of the automaton is then the sum (union) of all expressions Kj such that state j is an accepting state. □

Example 3.5: Let us convert the DFA of Fig. 3.4 to a regular expression. This DFA accepts nil strings that have at least one 0 in them. To see why, note that the automaton goes from the start state 1 to accepting state 2 as soon as it sees an input 0. The automaton then stays in state 2 on all input sequences. Below are the basis expressitms in the construction of Theorem 3.4.

1. The path does not, go throngh state к at all. In this case, the label of the



chapter 3. regular expressions and languages 1

Start

0 z?-


Figure 3.4: A DFA accepting all strings that have at least one 0

e + 1

d{0) 12

оЮ) 21

o(0) П-22

(e + 0 + 1)

For instance, i? has the term с because the beginning and ending states are the same, state I. It has the term 1 because there is an arc from state 1 to state 1 on input 1. As another example, в! because there is an arc labeled 0 from state 1 to state 2. There is no e term because the beginning and ending stattjs are different. For a third example, Д = 0, because there is no arc from state 2 to state 1.

Now, we must do the induction part, building more complex expressions that first take into account paths that go through state 1, and then paths that can go tlirough states 1 and 2, i.e., any path. The rule for computing the expressions are uistances of the general rule given in the in(uctiлe part of Theorem 3.4:

йУ = <Чд1Г )*< (3.1)

The table in Fig. 3.5 gives fir.st the expressions computed by direct substitution into the abo\e formula, and then a simplified (expression that we can show, by ad-hoc rca.soning, to represent the same language as the more complex expression.

By direct substitution

Simplified

d(1) - 12

еЧ-1-Ь(е-Ы)(е-1-1)*(е-}-1) 0 + (б-И)(е + 1)*0 0-b0(e-f l)-(e + l) e + O + l + 0(e + l)*O

1* 1*0

e + 0 + 1

Figure 3.5: Regular expressions for paths that can go through only state 1

For example, consider r\]} . Its expression is Л + r°j* {ry r°2, which we get from (3.1) by substituting i - 1 and j - 2.

To understand the simplification, note the general principle that if r is any regular expression, then {€ + r)* - r*. The justification is that both sides of



the equation describe the language consisting of any concatenation of zero or more strings from L{R). In our case, we have (e -f 1)* =1*; notice that both expressions denote any number of Is. Further, (e + l)l* = 1*- Again, it can be observed that l)Oth expressions denote any number of Is. Thus, the original expression R equivalent to 0 + 1*0. This expression denotes the language containing the string 0 and all strings consisting of a 0 preceded by any number of Is. This language is also expressed by tlie simpler expression 1*0.

The simplification of R[\ is similar to the simplification of л[У that we just considered. The simplification of i?2i and R2 depends on two rules about how 0 operates. For any regular expression R:

1. 0Й = Я0 - 0. That is, 0 is an annihilator for concatenation; it results in itself when concatenated, either on the left or right, with any expression. This rule makes sense, because for a string to be in the result of a concatenation, we must find strings from both arguments of the concatenation. Whenever one of the arguments is 0, it will be unpossible to find a string from that argument.

2. 0-1-Я-Л + 0 - Я. That is, 0 is the identity for union; it results in the other expression whenever it appears in a union.

As a result, an expression like 0{б + l)*(e + 1) can be replaced by 0. The la.st two simplifications should now be clear.

Now, let us compute the expressions r!\ The uiductive rule applied with к = 2 gives us:

HRTRy (3.2)

If we substitute the simplified expressions bom Fig. Ъ.Ъ into [Ъ!Т), we gftl fhe expressions of Fig. 3.6. That figure also shows simplifications following the same principles that we described for Fig. 3.5.

By direct substitution

Simplified

pt2)

1* + l*O(e + O--l)*0

1*0 + l*0(e-b 0 + 1)*(б + 0 4-1)

0 + (e + O + l)(e-t-O + l)*0

e-l-0 + l + (f + 0 + l)(e + 0-l- l)*(e + 0 + 1)

1*0(0 + 1)*

(o + D*

Figure 3.6: Regular- expressions for paths that can go through any state

The final regular expression equivalent to the automaton of Fig. 3.4 is constructed by taking the union of all the expressions where the first state is the start state and the second state is accepting. In this example, witli 1 as the start state and 2 as the only accepting state, we need only the expression я[.



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 [ 31 ] 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175