Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 [ 49 ] 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

II Exercise 4.2.11: Show that the regular languages are closed under the following operation: cyde{L) = {w \ we can write w as w = xy, such that yx is in L}. For example, if L = {01,011}, then cyde{L) {01,10,011,110,101}. Hint: Start with a DFA for L and construct an e-NFA for cycle(L).

!1 Exercise 4.2.12: Let Wi - aoOoOi, and wi ~ Wi-Wi-ai for all i > 1. For instance, - aoOoOioOoOiaaQoOoaiaoaoaia2a3. The shortest regular expression for the language Ln - {wn], i.e., the language consisting of the one string wn, is the string wn itself, and the length of this expression is 2 * - 1. However, if we allow the intersection operator, we can w-rite an expression for Ln whose length is Find such an expression. Hint: Find n languages,

each with regular expressions of length 0{n), whose intersection is L .

! Exercise 4.2.13: We can use closure properties to help prove certain lan-gnages are not regular. Start with the fact that the language

Lo in-{0 l n>0}

is not a regular set. Prove the following languages not to be regular by transforming them, using operations known to preserve regularitj, to Lonin:

* a) {OU I ii).

b) 04 2 - I n > m > 0}.

Exercise 4.2.14: In Theorem 4.8, we described the product construction that took two DFAs and constructed one DFA whose language is the intersection of the languages of the first two.

a) Show how to perform the product construction on NFAs (without e-transitions).

! b) Show how to perform the product construction on e-NFAs.

* c) Show how to modify the product construction so the rtsulting DF.A ac-

cepts the differenct; of the languages of the two given DFAs.

d) Show how to modify the product construction so the resulting DFA accepts the union of the languages of the two given DFAs.

Exercise 4.2,15: In the proof of Theorem 4.8 we claimed that it could be proved by induction on the length of w that

Give this inductive proof.

Exercise 4.2.16: Complete the proof of Theorem 4.14 by considering the cases where expression E is д. concatenation of two subexpressions and where E is the closure of an expre-ssion.

Exercise 4,2.17: In Theorem 4.16, we omitted a proof by induction on the length oiw that {q{i,w) - 6[q,h{w)). Prove this statement.



4.3 Decision Properties of Regular Languages

In this section we consider how one answers important questions about regular languages. First, we must consider what it means to ask a question about a language. The typical language is infinite, so you cannot present the strings of the language to someone and ask a question that requires them to inspect the infinite set of strings. Rather, we present a language by giving one of the finite representations for it that we have developed: a DFA, an NFA, an e-NFA, or a regular expression.

Of course the language so described will be regular, and in fact there is no way at all to represent completely arbitrary languages. In later c:hapters wc shall see finite ways to represent more than the regular languages, so we can consider questions about languages in these more general classes. However, for many of the questions we ask. algoritlnns exist only for the class of regular languages. The same questions become undecidable (no algorithm to answer them exists) when posed using more expressive notations (i.e.. notations that can be used to express a larger set of languages) than the representations we have developed for the regular languages.

We begin our study of algorithms for questions about regular languages by reviewing the ways we can convert one representation into another for the same language. In particular, we want to observe the time complexity of the algorithms that perform the conversions. We then consider some of the fundamental questions about languages:

1. Is the langiiage described empty?

2. Is a particular string w in the described language?

3. Do two descriptions of a language actually describe the same language? This question is often called equivalence of languages.

4.3.1 Converting Among Representations

We know that we can convert any o£ the four representations for regular languages to any of the other three representations. Fignre 3.1 gave paths from any representation to any of the others. While there are algorithms for any of the conversions, sometimes we are interested not only in the possibility of making a conversion, but in the amount of time it takes. In particular, it is important to distinguish between algorithms that take exponential time (as a fimction of the size of their input), and therefore can be performed only for relatiлely small instances, from those that take time that is a linear, quadratic, or some small-degree polynomial of their input size. The latter algorithms are realistic, in the sense that we expect them to be executable for large instances of the problem. We shall consider the time complexity of each of the conversions we discussed.



Vnr h Llisciission of traiisil,iv( rlosiirp algorithms, ясг A, V, Aho, .1, E. Hopcroft, я.ы\ ,1. D, Uilniftii, Dnla Structures and Alqariihrns, Addison-VVealny, 1П81.

Converting NFAs to DFAs

When we start with either an NFA or and e-NFA and convert it to a DFA, the tune can l)e exponential in the number of states of the NFA. First, computing the e-closure of n states takes 0{n) time. We must search from each of the n states along all arcs labeled e. If there are n states, there can be no more than n arcs. Judicious bookkeeping and well-designed data structures will make sure that we can explore from each state in 0{n) time. In fact, a transitive closure algorithm such as Warshalls algorithm can be used to compute the entire e-closure at once.

Once the e-closure is computed, we can compute the equialent DFA by the subset construction. The dominant cost is, in principle, the number of states of the DFA, which can be 2 . For each state, we can compute the transitions in 0(n) time by consulting the e-closure information and the NFAs transition table for each of the input symbols. That is, suppose we want to compute ({91 tQ-2t ,Qk},a) for the DFA. There may be as many as n states reachable from each (/., along e-labeled paths, and each of those states may have up to n arcs labeled a. By creating an array indexed by states, we can compute the union of up to n sets of up to n states in time proportional to n.

In this way, we can compute, for each qi, the set of states reachable from qi along a path labeled a (possibly including es). Since к < ri, there are at most n states to deal with. We compute the reachable states for each in 0{n) time. Thus, the total time spent computing reachable states is 0{n). The union of the sets of reaciiable states requires only 0{n) additional time, and we conclude that the computation of one DFA transition takes O(n) time.

Note that the number of input symbols is assumed constant, and does not depend on n. Thus, in this and other estimates of running time, we do not consider the number of input symbols as a factor. The size of the input alphabet influences the constant factor that is hidden in the big-oh notation, but nothing more.

Our conclusion is that the running time of NFA-to-DFA conversion, including the ea-se where the NFA has e-transitions, is 0(n2 ). Of course in practice it is common that the munber of states created is much less than 2 , often only n states. We could state the hound on the running time as 0{пя], where s is the number of states the DF.A actually has,

DFA-to-NFA Conversion

This conversion is simple, and takes 0(n) time on an n-state DFA, All that we need to do is modify the transition table for the DFA by putting set-brackets around states and, if the output is an e-NFA, adding a column for e. Siucv we treat the number of input symbols (i.e., the width of the transition tabk;) as a constant, copying and processing the table takes 0{n) time.



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 [ 49 ] 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175