Com2010 - Functional Programming
Demos: LexPrs & PrsRes & Software Engineering Design and Coding
Marian Gheorghe
Lecture 18
Module homepage: Mole & http://www.dcs.shef.ac.uk/~marian
com2010 © University of Sheffield
What have we learned?
• Computational models used in lexical analysis and parsing (regular expressions, finite state machines, EBNF, syntax diagrams, etc.)
• Haskell key concepts:
  • Complex data structures: algebraic types (simple, compound, polymorphic, recursive)
  • Recursive functions: recursive descent parser
  • Higher-order functions: three basic parsing diagrams
  • List comprehension
• Software engineering approach to design, coding and maintenance
Compiling
[Diagram: Source file → Lexical analysis → Token units → Parsing → AST → Compiler (semantics; code, evaluation…)]
• Source file: "a = a + 1 " ++ "/*comment*/"
• Token units: [(1,"a"),(5,"="),(1,"a"),(3,"+"),(2,"1"),(0,"Eop")]
• Spaces and comments appear only in the source
• AST: Abstract Syntax Tree
• Each token unit is a pair: key (code) and lexical unit (string)
LexPrs: Lexical analysis
• Note how lex_aut represents the various transitions (Move constructor)
• It recognizes identifiers, numbers, comments, delimiters (+, -, =, ;) and spaces
• Invoked as lex_an lex_aut inLexN, for:
  • inLex1 = "ident 23 + - = ; /*comment*/ anotherIdentThenNumber12 124"
    ~~> OK
  • inLex2 = "something 1 /*wrong_identifier*/ ident_y ok"
    ~~> ident_y is not identifier
  • inLex3 = "something 2 /*wrongNumber*/ 1.24 ok"
    ~~> 1.24 is not number
  • inLex4 = "something 3 /*wrongDelimiter*/ < ok"
    ~~> < is not delimiter
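The behaviour described above can be approximated with a small hand-written lexer. This is only a sketch: the course code drives a finite state machine (lex_aut), whereas the names lexSketch and skipComment below are hypothetical and the FSM is replaced by direct recursion. The key codes (0 to 6) are taken from the token lists shown in the slides.

```haskell
import Data.Char (isAlpha, isAlphaNum, isDigit, isSpace)
import Data.List (isPrefixOf)

-- A token pairs a key code with its lexical unit, as in the slides.
type TokenUnit = (Int, String)

-- Hypothetical lexer: returns the token list, or the first error message.
lexSketch :: String -> Either String [TokenUnit]
lexSketch [] = Right [(0, "Eop")]
lexSketch s@(c:cs)
  | isSpace c           = lexSketch cs                 -- spaces produce no token
  | "/*" `isPrefixOf` s = skipComment (drop 2 s)       -- comments produce no token
  | isAlpha c           = word isAlphaNum 1 "identifier"
  | isDigit c           = word isDigit 2 "number"
  | otherwise           = case lookup c delims of
      Just code -> ((code, [c]) :) <$> lexSketch cs
      Nothing   -> Left ([c] ++ " is not delimiter")
  where
    delims = [('+', 3), ('-', 4), ('=', 5), (';', 6)]
    -- Take everything up to the next space or delimiter, then validate it.
    word ok code kind =
      let (unit, rest) = break endsUnit s
      in if all ok unit
           then ((code, unit) :) <$> lexSketch rest
           else Left (unit ++ " is not " ++ kind)
    endsUnit x = isSpace x || x `elem` "+-=;/"

-- Drop characters until the closing "*/", then resume lexing.
skipComment :: String -> Either String [TokenUnit]
skipComment t
  | "*/" `isPrefixOf` t = lexSketch (drop 2 t)
  | null t              = Left "unterminated comment"
  | otherwise           = skipComment (drop 1 t)
```

With these assumptions, lexSketch applied to inLex1 yields a token list ending in (0,"Eop"), while inLex2, inLex3 and inLex4 each stop at the first offending unit.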
LexPrs: Parsing
• Note how the basic diagrams are implemented
• The parser is invoked as fProgram (lex_an lex_aut inPr1)
• Tests:
  • inPr1 = "a = 0 ; a = a + 1 ; b = a + a - 2"
    ~~> OK
  • inPr2 = "a = 0 a = a + 1 ; b = a"
    ~~> missing ;
  • inPr3 = "a = 0 ; a = a + ; b = a"
    ~~> missing operand
  • inPr4 = "a = 0 ; a = a 1 ; b = a"
    ~~> missing operator (+)
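One way to picture the three basic diagrams (sequence, alternation, iteration) is as higher-order functions over token recognisers. The sketch below is not the LexPrs code: the names terminal, sequenceD, alternationD, iterationD and accepts are hypothetical, and the recogniser only answers yes or no instead of reporting which element is missing.

```haskell
type TokenUnit = (Int, String)

-- A recogniser consumes a prefix of the tokens, or fails with Nothing.
type Recogniser = [TokenUnit] -> Maybe [TokenUnit]

-- Terminal diagram: accept one token with the given key code.
terminal :: Int -> Recogniser
terminal code ((c, _) : rest) | c == code = Just rest
terminal _ _ = Nothing

-- Sequence diagram: run the recognisers one after another.
sequenceD :: [Recogniser] -> Recogniser
sequenceD ps ts = foldl (>>=) (Just ts) ps

-- Alternation diagram: take the first alternative that succeeds.
alternationD :: [Recogniser] -> Recogniser
alternationD ps ts = case [r | Just r <- map ($ ts) ps] of
  (r : _) -> Just r
  []      -> Nothing

-- Iteration diagram: item {separator item}.
iterationD :: Recogniser -> Recogniser -> Recogniser
iterationD item sep ts = item ts >>= more
  where
    more rest = case sep rest of
      Nothing    -> Just rest
      Just rest' -> item rest' >>= more

-- The grammar, written with the key codes used in the slides.
program, stmtList, assign, expr, trm, operator :: Recogniser
program  = sequenceD [stmtList, terminal 0]
stmtList = iterationD assign (terminal 6)
assign   = sequenceD [terminal 1, terminal 5, expr]
expr     = iterationD trm operator
trm      = alternationD [terminal 1, terminal 2]
operator = alternationD [terminal 3, terminal 4]

-- A token list is accepted when the whole input is consumed.
accepts :: [TokenUnit] -> Bool
accepts ts = program ts == Just []
```

Under these assumptions, accepts holds for the token list of inPr1 and fails for those of inPr2, inPr3 and inPr4.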
PrsRes: Postfix notation eval
• SetOf TokenUnit is replaced by ParserUnit
• Changes in the parser code (affecting the terminal rules)
• Extract postfix notation
• Invoked as snd(snd(fProgram (lex_an lex_aut inp,([],[])))), or as extractOut inp
• Tests:
  • inp1 = "a = 0 ; a = a + 1 ; b = a + a - 2"
    ~~> [(1,"a"),(2,"0"),(5,"="),(6,";"),(1,"a"),(1,"a"),(2,"1"),(3,"+"),(5,"="),(6,";"),(1,"b"),(1,"a"),(1,"a"),(3,"+"),(2,"2"),(4,"-"),(5,"="),(0,"Eop")]
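The shape of that output can be reproduced with a separate pass over the token list. This is a sketch under assumptions: PrsRes builds the postfix output inside the parser's terminal rules, whereas the hypothetical functions below (exprToPostfix, stmtToPostfix, programToPostfix) rearrange an already valid token list and do no error checking.

```haskell
type TokenUnit = (Int, String)

-- Exp: t1 op t2 op t3 ...  becomes  t1 t2 op t3 op ...
exprToPostfix :: [TokenUnit] -> [TokenUnit]
exprToPostfix (t : rest) = t : go rest
  where
    go (op : u : more) = u : op : go more
    go leftover        = leftover
exprToPostfix [] = []

-- Assign: lhs = exp  becomes  lhs exp' =  (exp' is the postfix form of exp)
stmtToPostfix :: [TokenUnit] -> [TokenUnit]
stmtToPostfix (lhs : assg : expr) = lhs : exprToPostfix expr ++ [assg]
stmtToPostfix ts = ts

-- Program: convert each statement, keeping the ";" delimiters and Eop in place.
programToPostfix :: [TokenUnit] -> [TokenUnit]
programToPostfix ts = case break isEnd ts of
    (stmt, d@(6, _) : rest) -> stmtToPostfix stmt ++ d : programToPostfix rest
    (stmt, rest)            -> stmtToPostfix stmt ++ rest
  where
    isEnd (c, _) = c == 6 || c == 0
```

Applied to the tokens of inp1, this pass gives the same list as the one shown in the test above.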
PrsRes: Postfix notation eval (2)
• Extract expressions
  • Invoked as extractExp(extractOut inp ,[])
  • Tests:
    • inp2 = "a = 6 + 2 ; b = 8 ; c = 2 - 1 + 3"
      ~~> ([],[("a",["6","2","+"]),("b",["8"]),("c",["2","1","-","3","+"])])
• Expression evaluation
  • Invoked as evalSA(snd(extractExp(extractOut inp2 ,[])))
  • Tests:
    • inp2 (as above)
      ~~> [("a",8),("b",8),("c",4)]
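Evaluating a postfix expression is a classic stack exercise. The sketch below is an assumption about how such an evaluation could look, not the evalSA code itself: evalPostfix and evalAll are hypothetical names, and only integer literals with + and - are handled (no variables on the right-hand side).

```haskell
import Data.Char (isDigit)

-- Evaluate one postfix expression with an explicit stack.
-- Returns Nothing for malformed input (unknown symbol, wrong operand count).
evalPostfix :: [String] -> Maybe Int
evalPostfix = go []
  where
    go [v] []                = Just v
    go _   []                = Nothing
    go (y : x : st) ("+" : ts) = go (x + y : st) ts
    go (y : x : st) ("-" : ts) = go (x - y : st) ts
    go st (t : ts)
      | not (null t) && all isDigit t = go (read t : st) ts
      | otherwise                     = Nothing

-- Evaluate every (variable, postfix expression) pair.
evalAll :: [(String, [String])] -> Maybe [(String, Int)]
evalAll = traverse (\(v, e) -> (,) v <$> evalPostfix e)
```

For the expression list extracted from inp2, evalAll gives Just [("a",8),("b",8),("c",4)], matching the result on the slide.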
Software Engineering Haskell Projects
• The lexical analyser requires changes in the associated model (FSM), lex_aut (see the code)
• The parser consists of a set of recursive functions, so it requires a stepwise process with short iterations
• Implementing all 15 SA functions in one step simply does not work
  • Why? You get a complex set of recursive invocations that is impossible to manage and debug
• A strategy is needed, starting with an initial step
Instead of the Entire SA
Diagram kinds: (S) sequence, (I) iteration, (A) alternation, (T) terminal
1. Program ::= (S) StmtList Eop
2. StmtList ::= (I) Assign Delim
3. Assign ::= (S) LHandS RestAss
4. RestAss ::= (S) AssSymb Exp
5. Exp ::= (I) Trm Operator
6. Eop ::= eop (T)
7. Trm ::= (A) Identifier Number
8. Operator ::= (A) AddOp MinOp
9. LHandS ::= ident (T)
10. AssSymb ::= assg (T)
11. Identifier ::= ident (T)
12. Number ::= no (T)
13. Delim ::= sc (T)
14. AddOp ::= pls (T)
15. MinOp ::= mns (T)
Create an Initial Step
• Use only diagrams 1 and 6, and change 2 to a simpler form, i.e.:
  2. StmtList ::= ident
• So you have only 3 simple diagrams: a sequence (1) and two terminals (2, 6)
• These are simple to test
• Test program (see the code...): "a /*just_an_identifier*/"
Next Step
• Revert diagram 2 to its initial form, i.e.:
  2. StmtList ::= Assign {Delim Assign}
• Here Assign is simply a terminal and Delim keeps its definition, i.e.:
  3. Assign ::= ident
  13. Delim ::= sc
• You have: a sequence (1), an iteration (2) and three terminals (3, 6, 13)
• These will be tested by (see the code...): "a ; a1 ; b /*just_identifiers_and_sc*/"
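The first two iterations are small enough to write out directly. The following is a sketch, assuming the key codes from the earlier slides (1 identifier, 6 semicolon, 0 Eop); programStep1 and programStep2 are hypothetical names and work on an already lexed token list.

```haskell
type TokenUnit = (Int, String)

-- Initial step: Program ::= StmtList Eop, with StmtList cut down to one ident.
programStep1 :: [TokenUnit] -> Bool
programStep1 [(1, _), (0, _)] = True
programStep1 _                = False

-- Next step: StmtList ::= Assign {Delim Assign}, with Assign still a bare ident.
programStep2 :: [TokenUnit] -> Bool
programStep2 ((1, _) : rest) = afterAssign rest
  where
    afterAssign [(0, _)]                 = True             -- Eop closes the program
    afterAssign ((6, _) : (1, _) : more) = afterAssign more -- ";" then another ident
    afterAssign _                        = False
programStep2 _ = False
```

Each step keeps its own tiny test input, so a failure points at the one or two diagrams that were just added.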
Generic Step
• Starting from the previous step, add one or two new diagrams
• The components they refer to are made terminals
• Alternation and/or iteration sets, if any, might need adjusting
• Write a test set for this new iteration
• This looks like an XP (Extreme Programming) approach!
Finally, the Back-end
• To implement the back-end, the following transformations are made to the parser:
  • The generic SetOf TokenUnit is replaced
  • Adequate output is implemented by adding certain functions to the terminal rules (usually fTerm transformed into seqOf)
  • Functions to extract the output
  • Functions implementing the requested transformations (postfix notation etc.)
Where to find information
• Lecture notes: Mole
• Lecture slides + LexPrs & PrsRes code on my page: http://www.dcs.shef.ac.uk/~marian