Строительный блокнот  Automata methods and madness 

1 [ 2 ] 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175

Start


Figure 1.2: Л finite automaton modeling recogniti<m of then

In Fig. 1.2, the five states are named by the prefix of then seen so far. Inputs correspond to letters. We may imagine that the lexical analyzer examines one charact(;r of the program that it is c<jmpiling at a time, and the next character to be examined is the input to the automaton. The stait state corresponds to the enii)ty string, and each state lias a transition on the next letter of then to the state tliat correKp{jnds to the iioxt-larger prefix. The; state named then is entered when the input has spelled the word then. Since it is the job of this automaton to recognize when then has been seen, we could consider that state the lone aicepting state. □

1.1.2 Structural Representations

There are two important notations that are not automaton-like, but plaj- an important role in the study of automata and their applications.

1. Grammar.4 are useful models when designing software tliat processes data with a recursive structure. The best-known example is a parser, the component of a compiler that deals with the recursively nested features of thii typical programming language, snc:h as expressions - arithmetic, conditional, and so on. For instance, a grammatical rule like E E + E states that an expression can be formed by taking any two expressions and connecting them by a plus sign; this rnie is typical of how expressions of real programming languages are formed. We introduce context-free grammars, as they are usually called, in Chapter 5.

2, Rdfjular Expressions also denote the structure of data, especially text strings. As we shall see in Chapter 3, the patterns of strings they describe are exactly tlie same as what can be described by finite automata. The style of these expressions differs significantly from that of grammars, and we shall content ourselves with a simple example here. The UNIX-style regular expression [A-Z] [a-z]*[ ] [A-Z] [A-Z] represents capitalized words followctl by a space and two capital letter. ,. This expression represents patterns in text that could be a city and state, e.g., Ithaca NY. It misses multiword city names, such as Palo Alto CA, which could be captured by the more complex expression

the keyword then. It thus needs five states, each of which represents a different position in the word then that has been reached so far. These positions correspond to the prefixes of the word, ranging from the empty string (i.e., nothing of the word has hern seen so far) to the complete word.



1.2 Introduction to Formal Proof

If you studied plane geometry in high school any time before the 1990s, you most likely had to do some detailed deductive jjroofs, where you showed the truth of a statement by a detailed sequence of steps and reasons. While geometry has its practical side {e.g., you need to know the rule for computing the area of a rectangle if you need to buy the correct amount of carpet for a room), the study of formal proof methodologies wa.s at least as important a reason for covering this branch of mathematics in high school.

In the USA of the 1990s it became popular to teach proof as a matter of personal feelings about the statement. While it is good to feel the truth of a statement you need to use, important techniques of proof are no longer mastered in high school. Yet proof is something that every computer scientist needs to understand. Some computer scientists take the extreme view that a formal proof of the correctness of a program should go hand-in-hand with the writing of the program itself. We doubt that doing so is productiл e. On the other hand, there are those who say that proof has no place in the discipline of programming. The slogan if yon are not sure your program is correct, run it and see is commonly offered by this camp.

(CA-Z][a-z]*[])*[][A-Z] [A-Z]

When interpreting sucli expressions, we only need to know that [A-z] represents a range of cliaracters from capital A to capital Z (i.e., any capital letter), and [ ] is used to represent the Ыалк character alone. Also, the symbol * represents any number of the preceding expression. Parentheses are used to group components of the expression; they do not represent characters of the text described.

1Д.З Automata and Complexity

Automata are essential for the study of the limits of computation. As we mentioned in the introduction to the chapter, there are two important issues:

1. What can a computer do at all? This study is called decidability, and the problems that can be solved by computer are called decidable, This topic is addressed in Chapter 9.

2. WTiat can a computer do efficiently? This study is called intractability, and the problems that can be solved by a computer using no more time than some slowly growing function of the size of the input are called tractable. Often, wo take all polynomial functions to be slowly growing, while functions that grow faster than any polynomial are deemed to grow too fast. The subject is studied in Chapter 10.



Our position is between these two extremes. Testing programs is surely essential. However, testing goes only so far, since you cannot try your program on every input. More importantly, if your program is complex - say a tricky recursion or iteration - then if you dont understand what is going on as you go {ц-ound a loop or call a function recursively, h is unhkely that you will write the code correctly. When your testing tells you the code is incorrect, you still need to get it right.

To make your iteration or recursion correct, you need to sot up an inductive hypothesis, and it is helpful to niason, formally or informally, that the hypothesis is consistent with the iteration or recursion. This process of understanding the workings of a correct program is essentially the same as the process of proving theorems by induction. Thus, in addition to giving you models that are useful for certain types of software, it has become traditional for a course on automata theory to cover nuithodologies of formal proof. Perhaps more than other core subjects of computer science, automata theory lends itself to natural and uiteresting proofs, both of the deductive kind (a sequence of justified steps) an<l the inductive kind (recursive proofs of a parameterized statement that use the stattmeiit itself with lower values of the parameter).

1.2.1 Deductive Proofs

As mentioned above, a deductive proof consists of a sequence of statements whos(> tiuth leads us from some initial statement, called the hypothesis or the given state7nent($), to a conclusion statement. Each step in the proof must follow, by some accepted logical principle, from either the given facts, or some of the previous statements in the deductive proof, or a combination of these.

The hypothesis may be true or false, typically depending on values of its paramcKTs. Often, the hypothesis consists of several independent statements connected by a logical AND. In those ca,ses, we talk of each of these statements as a hypothesis, or as a giveri statement.

The theorem that is provtd when we go from a hypothesis Я to a conclusion С is the statement if Я then C. We say that С is deduced from Я, An example theorem of the form if Я then С will illustrate these points.

Theorem 1.3: If x > 4, then 2* > x. □

It is not hard to convince ourselves informally that Theorem 1.3 is true, although a formal proof requires induction and will be left for Example 1.17. First, notice that the hypothesis Я is ж > 4. This liypothesis has a parameter, J, and thus is neither true nor false. Rather, its truth depends on the ralue of the parameter x; e.g., Я is true for a; = 6 and false for x = 2.

Likewise, the conclusion С is 2 > , This statement also uses parameter j; and is true for certain values of x and not others. For example, С is fitlse for I - 3, since 2* = 8, which is not as large as 3 = 9. On the other hand, С is true for .T = 4, since 2 - 4 = 16, For x - 5, the statement is also true, since 2 = 32 i,4 at least as large as 5 = 25,



1 [ 2 ] 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175