
Refactoring Functional Programs

Refactoring Functional Programs. Simon Thompson with Huiqing Li Claus Reinke www.cs.kent.ac.uk/projects/refactor-fp. Session 2. Overview. Review mini-project. Implementation of HaRe. Larger-scale examples. Case study. Mini-project feedback. Refactorings performed.


Presentation Transcript


  1. Refactoring Functional Programs Simon Thompson with Huiqing Li Claus Reinke www.cs.kent.ac.uk/projects/refactor-fp

  2. Session 2

  3. Overview • Review mini-project. • Implementation of HaRe. • Larger-scale examples. • Case study.

  4. Mini-project feedback • Refactorings performed. • Refactorings and language features? • Machine support feasible? Useful? • ‘Not-quite’ refactorings? Support possible here?

  5. Examples • Argument permutations (NB partial application). • (Un)group arguments. • Slice function for a component of its result. • Error handling / exception handling.
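A minimal sketch of the "(un)group arguments" example and of why partial application needs care. The functions `scale`, `scaleG`, `doubleAll` and `doubleAllG` are hypothetical names chosen for illustration, not taken from the mini-project.

```haskell
-- Before: curried, so clients can partially apply it.
scale :: Int -> Int -> Int
scale factor x = factor * x

doubleAll :: [Int] -> [Int]
doubleAll = map (scale 2)              -- partial application of scale

-- After grouping the two arguments into a pair.
scaleG :: (Int, Int) -> Int
scaleG (factor, x) = factor * x

-- The partial application has to be rewritten as an explicit lambda.
doubleAllG :: [Int] -> [Int]
doubleAllG = map (\x -> scaleG (2, x))

main :: IO ()
main = print (doubleAll [1, 2, 3] == doubleAllG [1, 2, 3])
```

A tool performing this refactoring would have to find every partially applied use of the function, not just the fully applied calls.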

  6. More examples • Introduce type synonym, selectively. • Introduce ‘branded’ type. • Modify the return type of a function from T to Maybe T, Either T S, [T]. • Ditto for input types … and modify variable names correspondingly.
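As an illustration of changing a return type from T to Maybe T, here is a small sketch. The names `largest`, `largestM` and `report` are assumptions made for this example only.

```haskell
-- Before: the failure case is signalled with a run-time error.
largest :: [Int] -> Int
largest [] = error "largest: empty list"
largest xs = maximum xs

-- After: the return type Int becomes Maybe Int.
largestM :: [Int] -> Maybe Int
largestM [] = Nothing
largestM xs = Just (maximum xs)

-- A client must now handle the Nothing case explicitly.
report :: [Int] -> String
report xs = maybe "no data" (\m -> "max = " ++ show m) (largestM xs)

main :: IO ()
main = mapM_ (putStrLn . report) [[], [3, 1, 2]]
```

The change is not local: every call site of the function has to be adapted, which is what makes tool support attractive here.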

  7. Implementing HaRe

  8. Proof of concept … • To show proof of concept it is enough to: • build a stand-alone tool, • work with a subset of the language, • pretty print the results of refactorings.

  9. … or a useful tool? • Integrate with existing program development tools: stand-alone program links to editors emacs and vim, any other IDEs also possible. • Work with the complete language: Haskell 98? • Preserve the formatting and comments in the refactored source code. • Allow users to extend and script the system.

  10. The refactorings in HaRe • Rename • Delete • Lift / Demote • Introduce definition • Remove definition • Unfold • Generalise • Add / remove params • Move def between modules • Delete / add to exports • Clean imports • Make imports explicit • Data type to ADT • All these refactorings are module aware.

  11. The Implementation of HaRe • Information gathering • Pre-condition checking • Program transformation • Program rendering

  12. Information needed • Syntax: replace the function called sq, not the variable sq …… parse tree. • Static semantics: replace this function sq, not all the sq functions …… scope information. • Module information: what is the traffic between this module and its clients …… call graph. • Type information: replace this identifier when it is used at this type …… type annotations.

  13. Infrastructure: decisions • Build a tool that can interoperate with emacs, vim, … yet act separately. • Leverage existing libraries for processing Haskell 98, for tree transformation … as few modifications as possible. • Be as portable as possible, in the Haskell space. • Abstract interface to compiler internals?

  14. Haskell landscape (end 2002) • Parser: many • Type checker: few • Tree transformations: few • Difficulties • Haskell 98 vs. Haskell extensions. • Libraries: proof of concept vs. distributable. • Source code regeneration. • Real project

  15. Programatica • Project at OGI to build a Haskell system … • … with integral support for verification at various levels: assertion, testing, proof etc. • The Programatica project has built a Haskell front end in Haskell, supporting syntax, static, type and module analysis … • … freely available under BSD licence.

  16. The Implementation of HaRe • Information gathering • Pre-condition checking • Program transformation • Program rendering

  17. First steps … lifting and friends • Use the Haddock parser … full Haskell given in 500 lines of data type definitions. • Work by hand over the Haskell syntax: 27 cases for expressions … • Code for finding free variables, for instance …

  18. Finding free variables ‘by hand’ • instance FreeVbls HsExp where • freeVbls (HsVar v) = [v] • freeVbls (HsApp f e) • = freeVbls f ++ freeVbls e • freeVbls (HsLambda ps e) • = freeVbls e \\ concatMap paramNames ps • freeVbls (HsCase exp cases) • = freeVbls exp ++ concatMap freeVbls cases • freeVbls (HsTuple _ es) • = concatMap freeVbls es • … etc.
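The hand-written traversal above can be made runnable on a cut-down expression type. This is a sketch only: `Expr` is a hypothetical four-constructor stand-in for the much larger `HsExp`, and parameters are plain strings rather than patterns. It adds `nub`, because `\\` removes only the first occurrence of each bound name.

```haskell
import Data.List (nub, (\\))

-- A tiny stand-in for the Haskell expression syntax (illustrative only).
data Expr
  = Var String
  | App Expr Expr
  | Lam [String] Expr
  | Tuple [Expr]
  deriving Show

-- One equation per constructor: this is the boilerplate that grows with the syntax.
-- Every case returns a duplicate-free list, so (\\) removes all bound occurrences.
freeVbls :: Expr -> [String]
freeVbls (Var v)    = [v]
freeVbls (App f e)  = nub (freeVbls f ++ freeVbls e)
freeVbls (Lam ps e) = freeVbls e \\ ps
freeVbls (Tuple es) = nub (concatMap freeVbls es)

main :: IO ()
main = print (freeVbls (Lam ["x"] (App (Var "f") (App (Var "x") (Var "x")))))
```

Only the `Var` and `Lam` cases say anything specific to free variables; the other cases merely pass results upwards, which is the noise a generic traversal library removes.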

  19. This approach • Boilerplate code … 1000 lines for 100 lines of significant code. • Error prone: significant code lost in the noise. • Want to generate the boilerplate and the tree traversals … • … DrIFT: Winstanley, Wallace • … Strafunski: Lämmel and Visser

  20. Strafunski • Strafunski allows a user to write general (read generic), type safe, tree traversing programs, with ad hoc behaviour at particular points. • Top-down / bottom-up, type preserving / unifying, full / stop / one.

  21. Strafunski in use • Traverse the tree accumulating free variables from components, except in the case of lambda abstraction, local scopes, … • Strafunski allows us to work within Haskell … • Other options? Generic Haskell, Template Haskell, AG, …

  22. Rename an identifier • rename :: (Term t) => PName -> HsName -> t -> Maybe t • rename oldName newName = applyTP worker • where • worker = full_tdTP (idTP `adhocTP` idSite) • idSite :: PName -> Maybe PName • idSite v@(PN name orig) • | v == oldName • = return (PN newName orig) • idSite pn = return pn
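The same idea, a generic traversal that is the identity everywhere except at names, can be sketched without Strafunski. The version below is an assumption-laden stand-in: it uses only `Data.Data` and `Data.Typeable` from base, defines its own `everywhere` and `mkT` helpers in the "scrap your boilerplate" style, and works on a hypothetical `Expr` type rather than the Programatica syntax tree. It renames blindly and ignores scope, so it is not a meaning-preserving refactoring.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, gmapT)
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

newtype Name = Name String deriving (Eq, Show, Data)

data Expr = Var Name | App Expr Expr | Lam Name Expr
  deriving (Eq, Show, Data)

-- Apply a generic transformation to every node, children first.
everywhere :: (forall a. Data a => a -> a) -> forall a. Data a => a -> a
everywhere f x = f (gmapT (everywhere f) x)

-- Lift a type-specific function to a generic one: identity at other types.
mkT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
mkT f = fromMaybe id (cast f)

-- Rename every occurrence of old to new, whatever the shape of the tree.
rename :: Data t => Name -> Name -> t -> t
rename old new = everywhere (mkT idSite)
  where
    idSite n
      | n == old  = new
      | otherwise = n

main :: IO ()
main = print (rename (Name "sq") (Name "square")
                (App (Var (Name "sq")) (Var (Name "x"))))
```

Adding a constructor to `Expr` needs no change to `rename`, which is the property that made generic traversals attractive for a syntax as large as Haskell's.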

  23. The coding effort • Transformations: straightforward in Strafunski … • … the chore is implementing conditions that the transformation preserves meaning. • This is where much of our code lies.

  24. Move f from module A to B • Is f defined at the top-level of B? • Are the free variables in f accessible within module B? • Will the move require recursive modules? • Remove the definition of f from module A. • Add the definition to module B. • Modify the import/export lists in module A, B and the client modules of A and B if necessary. • Change uses of A.f to B.f or f in all affected modules. • Resolve ambiguity.
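The first two pre-conditions in the list above can be sketched as a pure check over a toy description of the target module. Everything here (`ModInfo`, `Problem`, `checkMove`) is hypothetical and far simpler than what HaRe needs; the recursive-module check is deliberately left out because it requires the whole import graph.

```haskell
-- A toy summary of the target module B (illustrative names only).
data ModInfo = ModInfo
  { topLevel :: [String]   -- names defined at the top level of B
  , inScope  :: [String]   -- names visible inside B
  }

data Problem
  = NameClash                  -- f is already defined at the top level of B
  | FreeVarNotVisible String   -- a free variable of f cannot be seen in B
  deriving (Eq, Show)

-- Check whether f, with the given free variables, may be moved into B.
-- An empty result means these two pre-conditions hold.
checkMove :: String -> [String] -> ModInfo -> [Problem]
checkMove f fvs b =
     [NameClash | f `elem` topLevel b]
  ++ [FreeVarNotVisible v | v <- fvs, v `notElem` inScope b]

main :: IO ()
main = print (checkMove "f" ["g", "h"] (ModInfo ["f"] ["g"]))
```

Returning a list of problems rather than a Boolean lets a tool report every violated condition to the user at once.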

  25. The Implementation of HaRe • Information gathering • Pre-condition checking • Program transformation • Program rendering

  26. Program rendering example • -- This is an example • module Main where • sumSquares x y = sq x + sq y • where sq :: Int->Int • sq x = x ^ pow • pow = 2 :: Int • main = sumSquares 10 20 • Promote the definition of sq to top level

  27. Program rendering example • module Main where • sumSquares x y • = sq pow x + sq pow y where pow = 2 :: Int • sq :: Int->Int->Int • sq pow x = x ^ pow • main = sumSquares 10 20 • Using a pretty printer: comments lost and layout quite different.

  28. Program rendering example • -- This is an example • module Main where • sumSquares x y = sq x + sq y • where sq :: Int->Int • sq x = x ^ pow • pow = 2 :: Int • main = sumSquares 10 20 • Promote the definition of sq to top level

  29. Program rendering example • -- This is an example • module Main where • sumSquares x y = sq pow x + sq pow y • where pow = 2 :: Int • sq :: Int->Int->Int • sq pow x = x ^ pow • main = sumSquares 10 20 • Layout and comments preserved.
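The promotion itself can be checked in a single file by keeping both versions side by side. The names `sumSquaresBefore`, `sumSquaresAfter` and `sqTop` are introduced here only so that the two versions can coexist, and `main` prints the results so that it has an IO type.

```haskell
-- Before: sq is local and refers to the local definition pow.
sumSquaresBefore :: Int -> Int -> Int
sumSquaresBefore x y = sq x + sq y
  where
    sq :: Int -> Int
    sq n = n ^ pow
    pow = 2 :: Int

-- After: sq is promoted to the top level. Its free variable pow is no
-- longer in scope there, so it becomes an extra parameter.
sumSquaresAfter :: Int -> Int -> Int
sumSquaresAfter x y = sqTop pow x + sqTop pow y
  where
    pow = 2 :: Int

sqTop :: Int -> Int -> Int
sqTop pow n = n ^ pow

main :: IO ()
main = print (sumSquaresBefore 10 20, sumSquaresAfter 10 20)
```

Both components of the printed pair are equal, which is the behaviour-preservation property the refactoring has to guarantee.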

  30. Token stream and AST • White space and comments in the token stream. • Modification of the AST guides the modification of the token stream. • After a refactoring, the program source is extracted from the token stream not the AST. • Heuristics associate comments with program entities.
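A toy illustration of why rendering from the token stream preserves layout and comments. This is not HaRe's lexer: `Token`, `tokenize`, `render` and `renameIdent` are hypothetical, the lexer knows only identifiers, white space, line comments and single-character symbols, and the rename is driven by a name rather than by an AST.

```haskell
import Data.Char (isAlphaNum, isSpace)

-- White space and comments are kept as tokens, so nothing is lost.
data Token
  = Ident String
  | Space String
  | Comment String
  | Sym String
  deriving (Eq, Show)

tokenize :: String -> [Token]
tokenize [] = []
tokenize s@(c : cs)
  | take 2 s == "--" = let (cm, rest) = break (== '\n') s in Comment cm : tokenize rest
  | isSpace c        = let (ws, rest) = span isSpace s in Space ws : tokenize rest
  | isIdentChar c    = let (i, rest) = span isIdentChar s in Ident i : tokenize rest
  | otherwise        = Sym [c] : tokenize cs
  where
    isIdentChar x = isAlphaNum x || x == '_' || x == '\''

-- The source text is recovered by concatenating the token texts.
render :: [Token] -> String
render = concatMap text
  where
    text (Ident s)   = s
    text (Space s)   = s
    text (Comment s) = s
    text (Sym s)     = s

-- Only identifier tokens change; comments and spacing are untouched.
renameIdent :: String -> String -> [Token] -> [Token]
renameIdent old new = map step
  where
    step (Ident s) | s == old = Ident new
    step t = t

main :: IO ()
main = putStrLn (render (renameIdent "sq" "square"
                   (tokenize "-- sq here\nf x = sq x  + 1")))
```

Because `render . tokenize` is the identity on the source text, any difference in the output comes from the edit alone, in contrast with a pretty printer working from the AST.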

  31. Production tool • Programatica parser and type checker • Refactor using a Strafunski engine • Render code from the token stream and syntax tree.

  32. Production tool (optimised) • Programatica parser and type checker • Refactor using a Strafunski engine • Render code from the token stream and syntax tree. • Pass lexical information to update the syntax tree and so avoid reparsing.

  33. What have we learned? • Emerging Haskell libraries make it practical(?) • Efficiency and robustness • type checking large systems, • linking, • editor script languages (vim, emacs). • Limitations of editor interactions. • Reflections on Haskell itself.

  34. Reflections on Haskell • Cannot hide items in an export list (cf import). • Field names for prelude types? • Scoped class instances not supported. • ‘Ambiguity’ vs. name clash. • ‘Tab’ is a nightmare! • Correspondence principle fails …

  35. Correspondence • Operations on definitions and operations on expressions can be placed in one to one correspondence • (R.D.Tennent, 1980)

  36. Correspondence • Definitions: where; f x y = e; f x | g1 = e1 | g2 = e2 • Expressions: let; \x y -> e; f x = if g1 then e1 else if g2 … …

  37. Function clauses • f x | g1 = e1 • f x | g2 = e2 • Can ‘fall through’ a function clause … no direct correspondence in the expression language. • f x = if g1 then e1 else if g2 … • No clauses for anonymous functions … no reason to omit them.
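A small runnable illustration of the fall-through point, with `classify` and `classifyLam` as hypothetical examples: the clause form lets a failed guard fall through to the next clause, while the anonymous function has a single clause and must spell the alternatives out.

```haskell
-- Clause form: if the guard of one clause fails, matching falls
-- through to the next clause.
classify :: Int -> String
classify n | n < 0  = "negative"
classify n | n == 0 = "zero"
classify _          = "positive"

-- Expression form: a lambda has only one clause, so the alternatives
-- are written as nested conditionals.
classifyLam :: Int -> String
classifyLam = \n ->
  if n < 0 then "negative" else if n == 0 then "zero" else "positive"

main :: IO ()
main = print (map classify [-1, 0, 1] == map classifyLam [-1, 0, 1])
```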

  38. Work in progress • ‘Fold’ against definitions … find duplicate code. • All, some or one? Effect on the interface … • f x = … e … e … • Traditional program transformations • Short-cut fusion • Warm fusion

  39. Where next? • Opening up to users: API or little language? • Link with other IDEs (and front ends?). • Detecting ‘bad smells’. • More useful refactorings supported by us. • Working without source code.

  40. API • Refactorings • Refactoring utilities • Strafunski • Haskell

  41. DSL • Combining forms • Refactorings • Refactoring utilities • Strafunski • Haskell

  42. Larger-scale examples • More complex examples in the functional domain; often link with data types. • Dawning realisation that some refactorings can be pretty powerful. • Bidirectional … no right answer.

  43. Algebraic or abstract type? • data Tr a • = Leaf a | • Node a (Tr a) (Tr a) • flatten :: Tr a -> [a] • flatten (Leaf x) = [x] • flatten (Node _ s t) • = flatten s ++ • flatten t • Tr • Leaf • Node

  44. Algebraic or abstract type? • Tr • isLeaf • isNode • leaf • left • right • mkLeaf • mkNode • data Tr a • = Leaf a | • Node a (Tr a) (Tr a) • isLeaf = … • isNode = … • … • flatten :: Tr a -> [a] • flatten t • | isLeaf t = [leaf t] • | isNode t • = flatten (left t) • ++ flatten (right t)

  45. Algebraic or abstract type? • Pattern matching syntax is more direct … • … but can achieve a considerable amount with field names. • Other reasons? Simplicity (due to other refactoring steps?). • Allows changes in the implementation type without affecting the client: e.g. might memoise. • Problematic with a primitive type as carrier. • Allows an invariant to be preserved.

  46. Outside or inside? • Tr • isLeaf • isNode • leaf • left • right • mkLeaf • mkNode • data Tr a • = Leaf a | • Node a (Tr a) (Tr a) • isLeaf = … • isNode = … • … • flatten :: Tr a -> [a] • flatten t • | isLeaf t = [leaf t] • | isNode t • = flatten (left t) • ++ flatten (right t)

  47. Outside or inside? • Tr • isLeaf • isNode • leaf • left • right • mkLeaf • mkNode • flatten • data Tr a • = Leaf a | • Node a (Tr a) (Tr a) • isLeaf = … • isNode = … • flatten t = …

  48. Outside or inside? • If inside and the type is reimplemented, need to reimplement everything in the signature, including flatten. • The more outside the better, therefore. • If inside, can modify the implementation to memoise values of flatten, or to give a better implementation using the concrete type. • Layered types possible: put the utilities in a privileged zone.

  49. Memoise flatten :: Tr a->[a] • Before: • data Tree a • = Leaf { val::a } | • Node { val::a, • left,right::(Tree a) } • leaf = Leaf • node = Node • flatten (Leaf x) = [x] • flatten (Node x l r) = • (x : (flatten l ++ flatten r)) • After: • data Tree a • = Leaf { val::a, • flatten:: [a] } | • Node { val::a, • left,right::(Tree a), • flatten::[a] } • leaf x = Leaf x [x] • node x l r • = Node x l r (x : (flatten l ++ • flatten r))

  50. Memoise flatten • Invisible outside the implementation module, if tree type is already an ADT. • Field names in Haskell make it particularly straightforward.
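The memoised version can be run as it stands once type signatures are added. The sketch below follows the slide's `Tree` definition; the signatures and the `main` function are additions for illustration.

```haskell
-- flatten is now a field, filled in once by the smart constructors.
data Tree a
  = Leaf { val :: a, flatten :: [a] }
  | Node { val :: a, left, right :: Tree a, flatten :: [a] }

-- Clients build trees only through leaf and node, so the invariant
-- "flatten holds the flattened tree" cannot be broken from outside.
leaf :: a -> Tree a
leaf x = Leaf x [x]

node :: a -> Tree a -> Tree a -> Tree a
node x l r = Node x l r (x : (flatten l ++ flatten r))

main :: IO ()
main = print (flatten (node 1 (leaf 2) (node 3 (leaf 4) (leaf 5))))
```

A client that called `flatten`, `leaf` and `node` before the change compiles unchanged afterwards, provided the constructors `Leaf` and `Node` are not exported.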
