Explore the fundamentals of grammars and parsing in computer science, from syntax analysis to parsing strategies. Learn about derivations, syntax trees, and parsing algorithms. Dive into different types of grammars and their applications.
991022 Main Tasks • Review the last lecture • Motivation and approach • Textbook Chapter 3 Department of Computer Science, National Chengchi University
Chapter 3 Grammars and Parsing
Trees • Nodes • Links • Root • Leaves • Parent • Child • Ancestor • Dominated • Example tree for "John ate the cat": (S (NP (NAME John)) (VP (V ate) (NP (ART the) (N cat))))
A Simple Grammar • S :- NP VP • VP :- V NP • NP :- NAME • NP :- ART N • NAME :- John • V :- ate • ART :- the • N :- cat • Tree for "John ate the cat": (S (NP (NAME John)) (VP (V ate) (NP (ART the) (N cat))))
Grammars • G = (V, T, P, S) • A set of rewrite rules • Mother: left-hand side (lhs) • Terminal and nonterminal symbols • Lexical symbols • Derivation • Rules: 1. S :- NP VP 2. VP :- V NP 3. NP :- NAME 4. NP :- ART N 5. NAME :- John 6. V :- ate 7. ART :- the 8. N :- cat
Left-Most Derivation • S => NP VP => NAME VP => John VP => John V NP => John ate NP => John ate ART N => John ate the N => John ate the cat • This derivation is "left-most" because each step expands the left-most nonterminal symbol of the current string.
Parsing • Generation (derivation) • From the start symbol (S) to sentences • Need not follow a particular order • Parsing (reduction) • Identify structures of sentences, given a grammar • Top-down strategy • Bottom-up strategy
A Bottom-Up Parsing Strategy • S <= NP VP <= NP V NP <= NP V ART N <= NAME V ART N <= NAME V ART cat <= NAME V the cat <= NAME ate the cat <= John ate the cat • S <= NP VP <= NAME VP <= John VP <= John V NP <= John ate NP <= John ate ART N <= John ate the N <= John ate the cat
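One way to carry out such reductions automatically is a shift-reduce loop. The sketch below is a simplification under stated assumptions: it reduces greedily whenever the top of the stack matches a right-hand side and never backtracks, which happens to be enough for this small grammar but is not a general parsing method. The names `RULES` and `shift_reduce` are hypothetical.

```python
# Rules as (lhs, rhs) pairs; this list layout is an assumption of the sketch.
RULES = [
    ("S", ["NP", "VP"]),
    ("VP", ["V", "NP"]),
    ("NP", ["NAME"]),
    ("NP", ["ART", "N"]),
    ("NAME", ["John"]),
    ("V", ["ate"]),
    ("ART", ["the"]),
    ("N", ["cat"]),
]


def shift_reduce(words):
    """Greedy shift-reduce recognizer: shift one word, then reduce the top
    of the stack for as long as some rule's right-hand side matches it."""
    stack, trace = [], []
    for word in words:
        stack.append(word)  # shift
        reduced = True
        while reduced:
            reduced = False
            for lhs, rhs in RULES:
                if stack[-len(rhs):] == rhs:
                    stack[-len(rhs):] = [lhs]  # reduce rhs to lhs
                    trace.append(" ".join(stack))
                    reduced = True
                    break
    return stack, trace


stack, trace = shift_reduce("John ate the cat".split())
for line in trace:
    print(line)
```

The trace records only the stack after each reduction, so it differs in layout from the reduction sequences on the slide, which show the whole string including unread words.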
Good Grammars • Generality • Selectivity • Understandability • Others?
Understandability • Grammar 3.1: E -> E + T | T; T -> T * F | F; F -> id • Grammar 3.2: E -> T X; X -> + T X | e; T -> F Y; Y -> * F Y | e; F -> id (here e denotes the empty string)
Grammar 3.1 • E -> E + T | T • T -> T * F | F • F -> id • Parse tree for a + b * c: (E (E (T (F a))) + (T (T (F b)) * (F c)))
Grammar 3.2 • E -> T X • X -> + T X | e • T -> F Y • Y -> * F Y | e • F -> id • Parse tree for a + b * c: (E (T (F a) (Y e)) (X + (T (F b) (Y * (F c) (Y e))) (X e)))
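Grammar 3.2 has no left recursion, so each nonterminal can be turned into one function of a recursive-descent parser. The sketch below is an illustration under assumptions: tokens are already split, any Python identifier counts as `id`, and trees are nested tuples with "e" marking the empty expansion. The function name `parse_expression` is hypothetical.

```python
def parse_expression(tokens):
    """Recursive-descent parser for Grammar 3.2; returns a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(expected=None):
        nonlocal pos
        tok = peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected token {tok!r} at position {pos}")
        pos += 1
        return tok

    def E():  # E -> T X
        return ("E", T(), X())

    def X():  # X -> + T X | e
        if peek() == "+":
            eat("+")
            return ("X", "+", T(), X())
        return ("X", "e")

    def T():  # T -> F Y
        return ("T", F(), Y())

    def Y():  # Y -> * F Y | e
        if peek() == "*":
            eat("*")
            return ("Y", "*", F(), Y())
        return ("Y", "e")

    def F():  # F -> id
        tok = eat()
        if not tok.isidentifier():
            raise SyntaxError(f"expected an identifier, got {tok!r}")
        return ("F", tok)

    tree = E()
    if peek() is not None:
        raise SyntaxError(f"trailing input starting at {peek()!r}")
    return tree


tree = parse_expression(["a", "+", "b", "*", "c"])
print(tree)
```

Applying the same one-function-per-nonterminal scheme to Grammar 3.1 would recurse forever in `E`, because E -> E + T calls itself before consuming any input.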
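Comparison • Tree for a + b * c under Grammar 3.1: (E (E (T (F a))) + (T (T (F b)) * (F c))) • Tree for a + b * c under Grammar 3.2: (E (T (F a) (Y e)) (X + (T (F b) (Y * (F c) (Y e))) (X e))) • The parse tree depends on both the sentence and the grammar!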
A Better Grammar? • Grammar 3.3: E -> E + E | E * E | id • More understandable • BUT… • Ambiguity • Trees for a + b * c
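The ambiguity of Grammar 3.3 can be checked by enumerating every tree it assigns to a token sequence. The sketch below is a small illustration, not a general parser: the function name `parse_trees` is hypothetical, any token other than + or * is assumed to be an `id`, and trees are printed as fully parenthesized strings.

```python
from functools import lru_cache

OPERATORS = ("+", "*")


def parse_trees(tokens):
    """Enumerate all parses of tokens under E -> E + E | E * E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(i, j):
        """All trees that derive tokens[i:j] from E."""
        results = []
        if j - i == 1 and tokens[i] not in OPERATORS:
            results.append(tokens[i])  # E -> id
        for k in range(i + 1, j - 1):
            if tokens[k] in OPERATORS:  # E -> E op E, split at operator k
                for left in trees(i, k):
                    for right in trees(k + 1, j):
                        results.append(f"({left} {tokens[k]} {right})")
        return results

    return trees(0, len(tokens))


for t in parse_trees(["a", "+", "b", "*", "c"]):
    print(t)
```

For a + b * c this yields two trees, one grouping b * c first and one grouping a + b first, which is exactly the ambiguity the slide points to.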
Examining Your Grammar • Constituents • Methods • Conjunction • Substitution • Examples • Modifying an existing grammar is unavoidable! • Interaction among rules
Chomsky Hierarchy • Regular grammar • Context-free grammar • Context-sensitive grammar • Type 0
Regular Grammars • All productions are of the form A -> w B or A -> w, where w is a (possibly empty) string of terminals. (Hopcroft & Ullman 1979) • Notice that this definition is not the same as the one on page 46 of the textbook. • Examples • Why are regular grammars not sufficiently expressive?
Context-Free Grammars • Each production is of the form A -> a, where A is a variable and a is a string of symbols from (V ∪ T)* • Examples • Is CFG sufficiently expressive?
Context-Sensitive Grammars • aAb -> aYb, where Y is a nonempty sequence of symbols • a -> b, where b is at least as long as a • Examples
A Top-Down Parser • Try top-down parsing manually! • Share your methods. • A brute-force method in the textbook • A better method in the textbook • Grammar + lexicon • Parse states: ((N VP) 2) and ((VP) 3) • Backtracking
The Algorithm • Possibilities list • The current state • Backup states • Algorithm on page 49 • Examples on pages 49 and 51
Parsing as a Search Procedure • Depth-first search • May go a long way before finding a blind alley • Tends to minimize bookkeeping • Breadth-first search • Example: Figure 3.7 • Which is better? • Left recursion
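The possibilities-list idea and the two search orders can be sketched together. This is an illustration under assumptions and does not reproduce the textbook's algorithm or its examples: it reuses the simple grammar from the earlier slides with a small lexicon, represents a parse state as (symbols, position) with 1-based positions, and the names `GRAMMAR`, `LEXICON`, and `top_down_parse` are hypothetical.

```python
# Phrase-structure rules and lexicon, taken from the simple grammar;
# the split into two dicts is an assumption of this sketch.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["NAME"], ["ART", "N"]],
    "VP": [["V", "NP"]],
}
LEXICON = {"John": {"NAME"}, "ate": {"V"}, "the": {"ART"}, "cat": {"N"}}


def top_down_parse(words, depth_first=True):
    """Top-down recognizer driven by a possibilities list.
    Returns (accepted, visited_states)."""
    possibilities = [(("S",), 1)]  # the current state comes first, backups follow
    visited = []
    while possibilities:
        symbols, pos = possibilities.pop(0)
        visited.append((symbols, pos))
        if not symbols:
            if pos == len(words) + 1:
                return True, visited  # all symbols matched, all words consumed
            continue  # words left over: dead end, fall back to a backup state
        first, rest = symbols[0], symbols[1:]
        new_states = []
        if first in GRAMMAR:
            # Nonterminal: one new state per rule that rewrites it.
            new_states = [(tuple(rhs) + rest, pos) for rhs in GRAMMAR[first]]
        elif pos <= len(words) and first in LEXICON.get(words[pos - 1], set()):
            # Lexical category matching the next word: consume the word.
            new_states = [(rest, pos + 1)]
        if depth_first:
            possibilities = new_states + possibilities  # stack discipline
        else:
            possibilities = possibilities + new_states  # queue discipline
    return False, visited


accepted, visited = top_down_parse("the cat ate John".split())
for state in visited:
    print(state)
```

On "the cat ate John" the depth-first run first tries NP :- NAME, fails on "the", backs up to NP :- ART N, and then passes through the states ((N VP) 2) and ((VP) 3). With a left-recursive rule such as NP :- NP PP, the depth-first version of this sketch would keep expanding the same symbol without consuming input and never terminate.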
A Bottom-Up Chart Parser • Interested in carrying out bottom-up parsing manually? • A simple-minded method in the textbook (page 53) • The Algorithm • Dot notation • Active arcs • Agenda • Chart • Follow the trace
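The roles of the agenda, the chart, and the active arcs can be seen in a compact sketch. It is written under assumptions and is not the textbook's algorithm verbatim: positions are 0-based gaps between words, an active arc is stored as (lhs, rhs, dot, start, end) where `dot` counts the symbols already found, and the names `GRAMMAR`, `LEXICON`, and `chart_parse` are hypothetical.

```python
from collections import deque

GRAMMAR = [
    ("S", ("NP", "VP")),
    ("VP", ("V", "NP")),
    ("NP", ("NAME",)),
    ("NP", ("ART", "N")),
]
LEXICON = {"John": {"NAME"}, "ate": {"V"}, "the": {"ART"}, "cat": {"N"}}


def chart_parse(words):
    """Bottom-up chart recognizer; returns the set of completed
    constituents as (category, start, end) triples."""
    chart = set()     # completed constituents
    arcs = set()      # active arcs: rules with the dot before a missing symbol
    agenda = deque()  # constituents waiting to be entered into the chart

    def add_arc(lhs, rhs, dot, start, end):
        if dot == len(rhs):
            agenda.append((lhs, start, end))  # dot at the end: arc is complete
        else:
            arcs.add((lhs, rhs, dot, start, end))

    for i, word in enumerate(words):
        # Read the next word only once the agenda has been emptied.
        for cat in LEXICON.get(word, ()):
            agenda.append((cat, i, i + 1))
        while agenda:
            item = agenda.popleft()
            if item in chart:
                continue
            chart.add(item)
            cat, start, end = item
            # Start a new arc for every rule whose first symbol is cat.
            for lhs, rhs in GRAMMAR:
                if rhs[0] == cat:
                    add_arc(lhs, rhs, 1, start, end)
            # Extend every active arc that ends at `start` and expects cat next.
            for lhs, rhs, dot, a_start, a_end in list(arcs):
                if a_end == start and rhs[dot] == cat:
                    add_arc(lhs, rhs, dot + 1, a_start, end)
    return chart


chart = chart_parse("John ate the cat".split())
print(sorted(chart, key=lambda c: (c[1], c[2], c[0])))
```

The sentence is accepted when the chart contains an S spanning the whole input, here ("S", 0, 4). Unlike the backtracking top-down search, each constituent is built once and then reused by every arc that needs it.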
Transition Networks • Finite state machines (FSMs) • FSMs and regular grammars • Recursive transition networks (RTNs) and CFGs • Arc labels for RTNs • The algorithm • Data structure • Examples
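The correspondence between RTNs and CFGs can be illustrated by encoding the simple grammar from the earlier slides as three networks. This is a sketch under assumptions, not the textbook's data structure: the `NETWORKS` layout, the node names, and the functions `traverse` and `accepts` are hypothetical. An arc labelled with a network name is a push arc, an arc labelled with a lexical category consumes one word, and reaching a final node pops back to the calling network.

```python
# One network per nonterminal; node names such as "NP1" are arbitrary labels.
NETWORKS = {
    "S": {"start": "S0", "final": {"S2"},
          "arcs": {"S0": [("NP", "S1")], "S1": [("VP", "S2")]}},
    "NP": {"start": "NP0", "final": {"NP2"},
           "arcs": {"NP0": [("NAME", "NP2"), ("ART", "NP1")],
                    "NP1": [("N", "NP2")]}},
    "VP": {"start": "VP0", "final": {"VP2"},
           "arcs": {"VP0": [("V", "VP1")], "VP1": [("NP", "VP2")]}},
}
LEXICON = {"John": {"NAME"}, "ate": {"V"}, "the": {"ART"}, "cat": {"N"}}


def traverse(name, node, pos, words):
    """Yield every word position at which network `name` can be exited
    (popped) when it is entered at `node` with `pos` words consumed."""
    net = NETWORKS[name]
    if node in net["final"]:
        yield pos  # pop arc
    for label, target in net["arcs"].get(node, []):
        if label in NETWORKS:
            # Push arc: run the sub-network, then continue from its exit points.
            for mid in traverse(label, NETWORKS[label]["start"], pos, words):
                yield from traverse(name, target, mid, words)
        elif pos < len(words) and label in LEXICON.get(words[pos], ()):
            # Category arc: consume one word of that category.
            yield from traverse(name, target, pos + 1, words)


def accepts(words):
    """True if the S network can be traversed while consuming all words."""
    start = NETWORKS["S"]["start"]
    return any(end == len(words) for end in traverse("S", start, 0, words))


print(accepts("John ate the cat".split()))
```

Without push arcs each network would be an ordinary finite state machine; the push arcs are what let RTNs cover context-free grammars. A network whose first arc pushes to itself would make this sketch recurse without end, the RTN counterpart of left recursion.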