Строительный блокнот Automata methods and madness 5.5. SUMMARY OF CHAPTER 5 215 S AlB A -> OA 1 e В OB I IB I e a) Show that this grammar is unambiguous. b) Find a grammar for the same language that is ambiguous, and demonstrate its ambiguity. *! Exercise 5.4.6: Is your grammar from Exercise 5.1.5 unambiguous? If not, redesign it to be unambiguous. Exercise 5.4.7: The following grammar generates prefix expressions with operands x and у and binary operators -, and *: E +EE \ *EE\ -EE\x\y a) Find leftmost and rightmost derivations, and a derivation tree for the string +*-xyxy. ! b) Prove that this grammar is unambiguous. 5.5 Summary of Chapter 5 Context-Free Grammars: A CFG is a way of describing languages by recursive rules called productions. A CFG consists of a set of variables, a set of terminal symbols, and a start variable, as well as the productions. Each production consists of a head variable and a body consisting of a string of zero or more variables and/or terminals. 4- Derivations and Languages: Beginning with the start symbol, we derive terminal strings by repeatedly replacing a variable by the body of some production with that тапаЬ1е in the head. The language of the CFG is the set of terminal strings we can so derive; it is called a context-free language. ♦ Leftmost and Rightmost Derivations: If we always replace the leftmost (resp. rightmost) variable in a string, then the resulting derivation is a leftmost (resp. rightmost) deri-vation. Every string in the language of a CFG has at least one leftmost and at least one rightmost derivation. ♦ Sentential Forms: Any step in a derivation is a string of variables and/or torminai.4. We call such a string a sentential form. If the derivation is leftmost (resp. rightmost), then the string is a left- (resp. right-) sentential form. -f Parse Trees: A parse tree is a tree tliat siiows the essentials of a derivation. Interion nodes are labeled by variables, and leaves are labeled by terminals or e. For each internal node, there must be a production such that the head of the production is the label of the node, and the labels of its children, read from left to right, form the body of that production, -♦ Equivalence of Parse Ti-ees and Derivations: A terminal string is in the language of a grammar if and only if it is the yield of at least one parse tree. Thus, the existence of leftmost derivations, rightmost derivations, and parse trees are equivalent conditions that each define exacth the strings in the language of a CFG, f Ambiguous Grammars: For some CFGs, it is possible to find a terminal string with more than one parse tree, or equivalently, more than one leftmost derivation or more than one rightmost derivation. Such a grammar is called ambiguous. Eliminating Ambiguity: For many useful grammars, such as those that describe the structure of programs in a typical programming language, it is possible to find an unambiguous grammar that generates the same language. Unfortunately, the unambiguous grammar is frequently more complex than the simplest ambiguous grammar for the language. There are also some context-free languages, usually quite contrived, that are inherently ambiguous, meaning that every grammar for that language is ambiguous. f Parsers: The context-free grammar is an essential concept for the implementation of compilers and other programming-language processors. Tools such as YACC take a CFG as input and produce a parser, the component of a compiler that deduces the structure of the program being compiled. Document Type Definitions: The emerging XML standard for sharing information through Web documents has a notation, called the DTD, for describing the structure of such documents, through the nesting of semantic tags within the document. The DTD is in essence a context-free grammar whose language is a class of related documents, 5.6 References for Chapter 5 The context-free grammar was first proposed as a description method for natural languages by Chomsky [4]. A sinular idea was used shortly thereafter to describe computer languages - Fortran by Backus [2] and Algol by Naur [7]. As a result, CFGs are sometimes referred to as Backus-Naur form grammars. Ambiguity in grammars was identified as a problem by Cantor [3] and Floyd [5] at about the same time. Inherent ambiguity was first addressed by Gross [6]. 5.6. REFERENCES FOR CHAPTER 5 217 For applications of CFGs in compilers, see [1]. DTDs are defined in the standards document for XML [8]. 1. A. V. Aho, R. Sethi, and J. D. Ullman, Compilers: Principles, Tedmiques, and Tools, Addison-Weslcy, Reading MA, 1986. 2. J. W. Backus, The syntax and semantics of the proposed international algebraic language of the Zurich ACM-GAMM conference, Proc. Intl. Conf. on Information Processing (1959), UNESCO, pp. 125-132. 3. D. C. Cantor, On the ambiguity problem of Backus systems, J. ACM 9:4 (1962), pp. 477-479, 4. N. Chomsky, Three models for the description of language, IRE Trans, on Information Theory 2:3 (1956), pp, 113-124, 5. R. W. Floyd, On ambiguity in phrase-structure languages. Comm. ACM 5:10 (1962), pp. 526-534, 6. M, Gross, Inherent ambiguity of minimal linear grammars, Information and Control 7:3 (1964), pp. 366-368. 7. P. Naur et al., Report on the algorithmic language .ALGOL 60, Comm. ACMS:5 (1960), pp. 299-314, See also Comm. АСМ6Л (1963), pp. 1-17. 8. World-Wide-Web Consortium, http: www.w3.org/TR/REC-xml (1998),
|