Week 5
• Questions / Concerns
• What’s due:
  • Lab2 part a due on Sunday
  • Lab1 check-off by appointment
  • Test#1 HW#5 next Thursday
• Coming up:
  • Lab3 posted
• Discuss grading and homework
• Homework#4
• Recursive Descent Parser
• Compute First and Follow
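Revised & Expanded Grammar Example
    S -> id = E ;
    E -> E + T | E - T | T
    T -> T * F | T / F | F
    F -> ( E ) | id
Input: i = a + b * c;
[Parse tree: S is the root with children id(i), =, E and ;. That E expands to E + T. The left E derives id(a) through T and F. The right T expands to T * F, where T derives id(b) through F, and F derives id(c).]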
Sample Language
• The following language consists of two types of statements:
  • Function definitions
  • Function calls
• Functions do not need an explicit return value, because every function evaluates to a result value.
• Every function can have up to 2 parameters.
    (defun add (x y) (+ x y) )
    (add 10 20)
    (add 20 20)
    (set x 10)
    (negate x)
    (+ x 10)
    (print x)
    (input x)
Can you define a grammar for this language?
Sample Language Grammar
    Program      -> Statement Statements
    Statements   -> Statement Statements | λ
    Statement    -> FunctionDef | FunctionCall
    FunctionCall -> ( ID Param Param ) | ( OP Param Param ) | ( BUILTIN Param Param )
    Param        -> ID | Num | λ
    BUILTIN      -> set | negate | print
    OP           -> + | - | * | /
    FunctionDef  -> ( defun ID ( FParam FParam ) Statements )
    FParam       -> ID | λ
HW#4
• Unit Production
      S -> aX | Yb
      X -> S          (remove X -> S; don’t change anything else)
      Y -> ….
  After removal:
      X -> aX | Yb
• Left Factoring
      S -> a… | a… | a…   =>   S -> aX
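The unit-production step above can be sketched in Python. This is only an illustration, not the required homework solution. The function name `remove_unit_productions` and the dictionary-of-lists grammar representation are assumptions made for this example, and Y's rules are left out because the slide elides them.

```python
def remove_unit_productions(grammar):
    """Replace each unit alternative X -> N with N's own alternatives.

    grammar maps a non-terminal to a list of alternatives; each alternative
    is a list of symbols. Any symbol that is not a key is treated as a token
    or as a non-terminal whose rules are not shown.
    """
    result = {}
    for head, alternatives in grammar.items():
        expanded = []
        seen = {head}                 # non-terminals already expanded for this head
        work = list(alternatives)
        while work:
            alt = work.pop(0)
            is_unit = len(alt) == 1 and alt[0] in grammar
            if not is_unit:
                if alt not in expanded:
                    expanded.append(alt)
            elif alt[0] not in seen:
                seen.add(alt[0])
                work.extend(grammar[alt[0]])
            # a unit alternative that leads back to a seen non-terminal is dropped
        result[head] = expanded
    return result


# S -> aX | Yb and X -> S from the slide (Y's rules are elided there).
grammar = {
    "S": [["a", "X"], ["Y", "b"]],
    "X": [["S"]],
}
new_grammar = remove_unit_productions(grammar)
print(new_grammar["X"])
```

Only the unit alternative X -> S is rewritten. The rules for S are returned unchanged, matching the instruction not to change anything else.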
HW#4
• Left Recursion & Unit Production
  • Remove left recursion first
  • If there are any unit productions left, take care of them afterwards
      S -> A | B | Sb | aSb | c
  After removal:
      X -> bX | λ
      S -> AX | BX | aSbX | cX

      A -> Ab | Bc | Sb | λ
  After removal:
      Y -> bY | λ
      A -> BcY | SbY | Y           (Y is a unit production)
      A -> BcY | SbY | bY | λ      (after replacing the unit production)
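The removal of immediate left recursion shown above can be sketched as a small Python function. The name `remove_left_recursion`, the list-of-symbols representation, and the use of an empty list for lambda are assumptions made for this example. It handles only immediate left recursion (a rule whose first symbol is its own left-hand side).

```python
def remove_left_recursion(head, alternatives, new_name):
    """Rewrite  head -> head r | o  as  head -> o new_name,  new_name -> r new_name | lambda.

    Each alternative is a list of symbols; the empty list stands for lambda.
    """
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == head]
    others = [alt for alt in alternatives if not alt or alt[0] != head]
    if not recursive:
        return {head: alternatives}
    tail_rules = [rest + [new_name] for rest in recursive] + [[]]
    head_rules = [alt + [new_name] for alt in others]
    return {new_name: tail_rules, head: head_rules}


# S -> A | B | Sb | aSb | c
s_rules = remove_left_recursion(
    "S", [["A"], ["B"], ["S", "b"], ["a", "S", "b"], ["c"]], "X")

# A -> Ab | Bc | Sb | lambda
a_rules = remove_left_recursion(
    "A", [["A", "b"], ["B", "c"], ["S", "b"], []], "Y")

print(s_rules)
print(a_rules)
```

For the second grammar the lambda alternative of A becomes the unit production A -> Y, which is why unit productions are handled after left recursion.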
Recursive Descent Parser
    A -> aAbb | bB | Bcd

    bool A() {
        if (curToken == a) {                  // match aAbb
            if (A() && nextToken == b && nextToken == b)
                return true;
        } else if (curToken == b) {           // match bB
            if (B())
                return true;
        } else if (B()) {                     // trying to match Bcd
            if (nextToken == c && nextToken == d)
                return true;
        }
        return false;
    }
• What if B starts with a or b?
• You don’t have to remove B, because A -> Bcd is not a unit production, but you may need to backtrack here if B also starts with a or b.
Recursive Descent Parser
• If possible, substitute away non-terminals that appear as the first symbol of a rule, even when the rule is not a unit production. This reduces backtracking.
• Example:
      E -> TX            =>   E -> idYX | (E)YX
      T -> FY            =>   T -> idY | (E)Y
      F -> id | ( E )
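The substitution in the example above can be sketched in Python. The helper name `expand_leading` is hypothetical. X and Y are treated as opaque symbols because their rules are not shown on the slide, and the sketch assumes left recursion has already been removed (otherwise the recursion would not terminate).

```python
def expand_leading(grammar, head):
    """Return head's alternatives with any leading non-terminal substituted away.

    grammar maps a non-terminal to a list of alternatives (lists of symbols).
    Symbols that are not keys of grammar are left untouched.
    Assumes the grammar has no left recursion.
    """
    result = []
    for alt in grammar[head]:
        if alt and alt[0] in grammar and alt[0] != head:
            # Replace the leading non-terminal by each of its expanded alternatives.
            for replacement in expand_leading(grammar, alt[0]):
                result.append(replacement + alt[1:])
        else:
            result.append(alt)
    return result


grammar = {
    "E": [["T", "X"]],
    "T": [["F", "Y"]],
    "F": [["id"], ["(", "E", ")"]],
}
t_rules = expand_leading(grammar, "T")
e_rules = expand_leading(grammar, "E")
print(t_rules)
print(e_rules)
```

After the substitution every alternative of E and T starts with a token, so the parser can pick an alternative by looking at the current token.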
Recursive Descent Parser (rules without lambdas)
• The rule returns false if:
  • None of the first tokens of the rule matches the current token.
        S -> aSb | bS | cd
    The current token is not a, b or c.
  • The first token matches but the rest of the rule fails.
    Example: the input string for the grammar above is “cc”.
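Both failure cases can be seen in a small Python recognizer for S -> aSb | bS | cd. This is a sketch under assumptions: tokens are single characters, and the names `parse_S` and `accepts` are made up for this example. Instead of returning true or false, `parse_S` returns the position after the match, or None on failure, which makes the caller's job explicit.

```python
def parse_S(tokens, pos):
    """Return the position after a match of S starting at pos, or None on failure."""
    current = tokens[pos] if pos < len(tokens) else None
    if current == "a":                        # S -> a S b
        after_s = parse_S(tokens, pos + 1)
        if after_s is not None and tokens[after_s:after_s + 1] == ["b"]:
            return after_s + 1
        return None                           # first token matched, rest failed
    if current == "b":                        # S -> b S
        return parse_S(tokens, pos + 1)
    if current == "c":                        # S -> c d
        if tokens[pos + 1:pos + 2] == ["d"]:
            return pos + 2
        return None                           # first token matched, rest failed
    return None                               # current token is not a, b or c


def accepts(text):
    """True if the whole input is one S."""
    tokens = list(text)
    return parse_S(tokens, 0) == len(tokens)


print(accepts("acdb"), accepts("cc"), accepts("x"))
```

The input “cc” fails in the second way: c matches the first token of S -> cd, but the next token is not d.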
Recursive Descent Parser (rules with lambdas)
• What if:
  • None of the first tokens of the rule matches the current token.
        S -> aSb | bS | cd | λ
    The current token is not a, b or c. The lambda means that S does not have to match any token, so just take lambda and return true. This leaves the current token for the next rule to handle. What if the current token is wrong for the next rule?
  • The first token matches but the rest of the rule fails.
    The input string for the grammar above is “cc”. Lambda could still mean that “cc” is intended for the next rule and that S should just be lambda. We would have to see what the next rule needs to decide whether “cc” is wrong here.
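One way to make the lambda decision explicit is to let the rule report every position where it could end and leave the choice to the caller. The Python sketch below does this for S -> aSb | bS | cd | λ. It is an illustration with full backtracking, not the parser from the slides; single-character tokens and the names `ends_S` and `accepts` are assumptions.

```python
def ends_S(tokens, pos):
    """Return every position at which a match of S starting at pos can end."""
    ends = {pos}                              # S -> lambda consumes nothing
    current = tokens[pos:pos + 1]
    if current == ["a"]:                      # S -> a S b
        for mid in ends_S(tokens, pos + 1):
            if tokens[mid:mid + 1] == ["b"]:
                ends.add(mid + 1)
    elif current == ["b"]:                    # S -> b S
        ends |= ends_S(tokens, pos + 1)
    elif current == ["c"]:                    # S -> c d
        if tokens[pos + 1:pos + 2] == ["d"]:
            ends.add(pos + 2)
    return ends


def accepts(text):
    """True if S can match the whole input (nothing is left for a next rule)."""
    tokens = list(text)
    return len(tokens) in ends_S(tokens, 0)


print(ends_S(list("cc"), 0), accepts("cc"), accepts("ab"))
```

For “cc” the only possible end position is 0: S takes lambda and both tokens are left for whatever comes next. Since this sketch has no next rule, the input is rejected. The input “ab” shows why the choice matters: the inner S must take lambda rather than consume the b.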
Recursive Descent Parser
• The parser would be much more accurate and efficient if we knew all the first tokens of a rule, regardless of whether the rule starts with a token or a non-terminal (the First set).
      S -> a S b | b B | C d
• The parser would handle lambdas more accurately if it knew which tokens should come after the rule when lambda is taken (the Follow set).
      S -> a S b
      S -> λ
      Input: aaac…
• You only need the Follow set if you have lambdas.
LL(1)
• Another top-down parser
• It’s a table-driven parser.
• LL(1):
  • L: the first L means the input is read from left to right
  • L: the second L means leftmost derivation (top-down)
  • 1: one token of lookahead
• Grammar prerequisites:
  • No left recursion
  • Unit productions are okay, but should be minimized
  • MUST left factor to ensure one-token lookahead
• Procedure:
  • Compute First and Follow sets from the grammar
First set
• Grammar with no lambda
      S -> A B
      S -> c
      A -> a bA
      A -> b
      B -> d B
      B -> e
• The First set is just the first token. If the first symbol is a non-terminal, then include all the first tokens from that non-terminal.
First set
      S -> A B
      S -> c
      A -> a bA
      A -> b
      B -> d B
      B -> e
• First(A) = { a, b }
• First(B) = { d, e }
• First(S) = First(A) plus c = { a, b, c }
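First set
• What about λ?
      S -> A B
      S -> c
      A -> a bA
      A -> b
      A -> λ
      B -> d B
      B -> e
• Only add λ to First(S) if S itself can also derive λ as the result of using that rule. Here A can be λ, so First(S) also picks up the first tokens of B, but B cannot be λ, so λ is not added to First(S).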
First Set
      S -> ABc
      A -> a
      A -> λ
      B -> b
      B -> λ
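The First sets for this grammar can be computed with a short fixed-point loop. The Python sketch below is one possible way to do it. The function name `first_sets`, the dictionary-of-lists grammar representation, and the use of an empty list for a lambda rule are assumptions made for this example.

```python
LAMBDA = "λ"


def first_sets(grammar):
    """Compute First for every non-terminal; repeat until nothing changes.

    grammar maps a non-terminal to its alternatives (lists of symbols).
    An empty list is a lambda rule. Symbols that are not keys are tokens.
    """
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                before = len(first[head])
                all_nullable = True
                for symbol in alt:
                    if symbol not in grammar:          # a token: it is the first token
                        first[head].add(symbol)
                        all_nullable = False
                        break
                    first[head] |= first[symbol] - {LAMBDA}
                    if LAMBDA not in first[symbol]:    # this symbol cannot vanish
                        all_nullable = False
                        break
                if all_nullable:                       # whole alternative can be lambda
                    first[head].add(LAMBDA)
                if len(first[head]) != before:
                    changed = True
    return first


grammar = {
    "S": [["A", "B", "c"]],
    "A": [["a"], []],
    "B": [["b"], []],
}
first = first_sets(grammar)
print(first)
```

First(S) contains a, b and c because both A and B can vanish. It does not contain λ, because the token c always remains.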
Follow set
• If any rule in your grammar is a lambda production, then you need to compute the Follow set for the entire grammar.
• The Follow set is just the set of tokens that can follow a particular non-terminal.
• There are 4 different cases to consider in computing the Follow set. Not all 4 cases apply to every grammar.
Follow Set
• Case 1: If S is the start symbol, then add $ (end of input) to the Follow set of S. Nothing should follow S after S is done parsing.
• Case 2: If a token follows a non-terminal on the right-hand side of a rule, then add that token to the Follow set of the non-terminal.
      … -> … A b       b follows A. Add b to the Follow set of A.
  Note: We don’t care what’s on the left-hand side of the rule, just that b follows the non-terminal A.
Follow Set
• Case 3: If a non-terminal follows a non-terminal on the right-hand side of a rule, then add the First set of the second non-terminal to the Follow set of the first non-terminal.
      … -> … A B       B follows A. Add First(B) - {λ} to the Follow set of A.
  Note: We don’t care what’s on the left-hand side of the rule, just that B follows the non-terminal A. We never put λ in a Follow set.
Follow Set
• Case 4: If you have the following pattern:
      A -> ….. B
  This could be a unit production or just any rule that ends with a non-terminal. In this case, anything that follows A also follows B.
  Whatever follows A also follows B, because there is nothing else after B on the lower level of the parse tree (B is the last child of A).
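Follow set: Cases 1 to 4
      S -> A B c
      S -> c
      A -> a B
      A -> b
      B -> d B
      B -> e
• Case 1: S is the start symbol, so add $ to Follow(S).
• Case 2: In S -> A B c, the token c follows B, so add c to Follow(B).
• Case 3: In S -> A B c, B follows A, so add First(B) = { d, e } to Follow(A).
• Case 4: A -> a B ends with B, so add Follow(A) to Follow(B). B -> d B ends with B as well, but it adds nothing new.
• Result: Follow(S) = { $ }, Follow(A) = { d, e }, Follow(B) = { c, d, e }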
Follow set: Cases 1 to 4
      S -> ABc
      A -> a
      A -> λ
      B -> b
      B -> λ
• Case 1: S is the start symbol, so add $ to Follow(S).
• Case 2: In S -> ABc, the token c follows B, so add c to Follow(B). BUT, B could be a lambda, and then c directly follows A, so add c to Follow(A) as well.
• Case 3: In S -> ABc, B follows A, so add First(B) - {λ} = { b } to Follow(A). No λ in the Follow set.
• Case 4: No case 4 for this grammar.
• Result: Follow(S) = { $ }, Follow(A) = { b, c }, Follow(B) = { c }
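The four cases can be combined into one fixed-point computation. The Python sketch below is one possible arrangement, not the only one. The names `first_of_sequence`, `first_sets` and `follow_sets` and the grammar representation (dictionary of alternatives, empty list for lambda) are assumptions made for this example. Looking at the First set of everything after a non-terminal covers cases 2 and 3 together, including the situation where the symbol in between can be lambda.

```python
LAMBDA, END = "λ", "$"


def first_of_sequence(symbols, grammar, first):
    """First set of a string of symbols; contains LAMBDA if the whole string can vanish."""
    result = set()
    for symbol in symbols:
        if symbol not in grammar:                  # a token
            result.add(symbol)
            return result
        result |= first[symbol] - {LAMBDA}
        if LAMBDA not in first[symbol]:
            return result
    result.add(LAMBDA)
    return result


def first_sets(grammar):
    first = {head: set() for head in grammar}
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                new = first_of_sequence(alt, grammar, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first


def follow_sets(grammar, start):
    first = first_sets(grammar)
    follow = {head: set() for head in grammar}
    follow[start].add(END)                         # case 1
    changed = True
    while changed:
        changed = False
        for head, alternatives in grammar.items():
            for alt in alternatives:
                for i, symbol in enumerate(alt):
                    if symbol not in grammar:      # tokens have no Follow set
                        continue
                    rest = first_of_sequence(alt[i + 1:], grammar, first)
                    new = rest - {LAMBDA}          # cases 2 and 3, never lambda
                    if LAMBDA in rest:             # case 4: nothing must come after
                        new |= follow[head]
                    if not new <= follow[symbol]:
                        follow[symbol] |= new
                        changed = True
    return follow


no_lambda = {
    "S": [["A", "B", "c"], ["c"]],
    "A": [["a", "B"], ["b"]],
    "B": [["d", "B"], ["e"]],
}
with_lambda = {
    "S": [["A", "B", "c"]],
    "A": [["a"], []],
    "B": [["b"], []],
}
follow_a = follow_sets(no_lambda, "S")
follow_b = follow_sets(with_lambda, "S")
print(follow_a)
print(follow_b)
```

The two grammars are the ones used in the case-by-case slides, so the printed sets can be compared against a hand computation.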
In-Class Exercise #7
• Please compute the First and Follow sets for the following grammar:
      S -> ABCd
      A -> e | f | λ
      B -> g | h | λ
      C -> p | q