Строительный блокнот  Automata methods and madness 

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 [ 46 ] 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

4.2.3 Homomorphisms

A string homomorphism is a function on strings that works by substituting a particular string for each symbol.

Example 4.13: The function h defined by h{0) = ab and /i(l) - e is a homomorphism. Given any string of Os and Is, it replaces all Os by the string ab and replaces all Is by the empty string. For example, h applied to the string 0011 is abab. □

Formally, if /i is a homomorphism on alphabet L, and w - я1Я2-- п is a string of symbols in S, then h{w) - h{ai)h{a2) Нпп) That is, we apply h to each symbol of w and concatenate the results, in order. For instance, if h is the homomorphism in Example 4.13, and w = 0011, then hiw) = h{0)h{0)k{l)hil) = (аЬ)(аЬ)(е)(б) = abab, as we claimed in that example.

Further, we can apply a homomorphism to a language by applying it to each of the strings in the language. That is, if L is a language over alphabet S, and h is a homomorphism on E, then h{L) = {h{w) \ id is in L}. For instance, if L is the language of regular expression 10*1, i.e., any rmniber of Os surrounded by single Is, then h{L) is the language (ab)*. The reason is that h of Example 4.13 effectively drops the Is, since they are replaced by e, and turns each 0 into ab. The same idea, applying the homomorphism directly to the regular expression, can be used to prove that the regular languages are closed under homomorphisms.

Theorem 4.14: If L is a regular langnage over alphabet S, and /i is a homomorphism on E, then h{L) is also regular.

PROOF: Let L ~ L{R) for some regular expression R. In general, if F is a regular expression with symbols in E, let h{E) be the expression we obtain by replacing each symbol a of E in E by h{a). We claini that h{R) defines the language h{L).

The proof is an easy structural induction that says whenever we take a subexpression E oi R and apply h to it to get / (£), the langnage of h{E) is the same language we get if we apply h to the language L{E). Formally, L{h{E))h{L{E)).

BASIS: If £ is e or 0, then h{E) is the same as E, since h does not affect the string £ or the language 0. Thus, L[h{E)) - L{E). However, if F is 0 or б, then L{E) contains either no strings or a string with no symbols, respectively. Thus h{L[E)) = L{E) in either case. We conclude L{h{E)) = L{E) - h{L{E)).

The only other basis case is if £ = a for some symbol о in S. bi this case, L{E) = [a], so h[L{E)) - {h{a)]. Also, h{E) is the regular expression that is the string of symbols h{a). Thus, L{h{E)) is also {Л(а)}, and we conclude L{h{E))h[L{E)).



That. 7 should be thought of as a Greek capital tau, the letter following sigma.

INDUCTION: Tliere are three cases, each of them simple. We shall prove only the union case, where E = F + G. The way we apply homomorphisms to regular expressions assures us that h(E) = h{F + G) = h{F) + h{G). Wo also know that L(E) = L[F) U L{C) and

L{h(E)) = L{h{F) + hiG)) = L{h{F)) U L{hiG]) (4.2)

by the definition of what means in regular expressions. Finally,

h{LiE)) = h{L{F) и L(G)) = /i(L(F)) U h{L{G)) (4.3)

because h is applied to a language by application to each of its strings individually. Now we may invoke the inductive hypothesis to assert that L{h(F)) - h{L{F)) and L{h{G)] = h{HG)). Thus, the final expressions in (4.2) and (4.3) are equivalent, and therefore so are their respective first terms; that is, L{h{E))=h{L{E)).

We shall not prove the caes where expression £J is a concatenation or closure: the ideas are similar to the above in both c:ases. The conclusion is that L{h(R)) is indeed h[L{R)); i.e., applying the lionioinorphism h to the regular expression for language L results in a regular expression that defines the language h{L). □

4.2.4 Inverse Homomorphisms

Homomorphisms may also be apphed backwards, and in this mode they also preserve regular languages. That is, suppose h is a homomorphism from some alphabet S to strings in another (possibly the same) alphabet T. Let L be a language over alphabet T. Then h~{L), read h inverse of L, Is the set of strings w in S such that h{w) is in L. Figure 4.5 suggests the effect of a homomorphism on a language L in part (a), mid the effect of an inverse homomorphism in part (b).

Example 4.15 : Let L be the language of regular expression (00 -h 1)*. That is, L consists of all strings of Os and Is such that all the Os occur in adjacent pairs. Thus, 0010011 and 10000111 are in L, but ООО and 10100 are not.

Let h be the homomorphism defined by h{a) = 01 and Ji{b) - 10. We claim that h~{L) is the language of regular expression (ba), that is, all strings of repeating ba pairs. We shall prove that h{-w) is in L if and only if w is of the form baba --ba.

(If) Suppose w is n repetitions of ba for some n > 0. Note that Н{Ьа) = lOOl, so h{w) is 77, repetitions of 1001. Since 1001 is composed of two Is and a pair of Os, we know that 1001 is in L. Therefore any repetition of 1001 is also formed from 1 and 00 segments and is in L. Thus, h{w) is in L.





Figure 4.5: A homomorphism applied in the forward aud inverse direction

(Only-if) Now, we must assume that h{w) is in L and show that u; is of the form baba - ba. There are four conditions under which a string is not of that form, and we shall show that if any of them hold then h{w) is not in L. That is, we prove the contrapositive of the statement we set out to prove.

1. If id begins with a, then h{vj) begins with 01. It therefore has an isolated 0, and is not in L.

2. If w ends in then h{w) ends in 10, and again there is an isolated 0 in hiw).

3. If w has two consecutive as, then /г( ;) has a substring 0101. Here too, there is an isolated 0 in w.

4. Likewise, if w has two consecutive bs, then h{ii)) has substring 1010 and has an isolated 0,

Thus, whenever one of the above cases hold, h{w) is not in L. However, unless at least one of items (1) through (4) hold, then w is of the form baba --ba.



1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 [ 46 ] 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175