Tools for Refactoring Functional Programs • Simon Thompson, with Huiqing Li and Claus Reinke • www.cs.kent.ac.uk/projects/refactor-fp
Design • Models • Prototypes • Design documents • Visible artifacts
All in the code …
• Functional programs embody their design in their code.
• This is enabled by their high-level nature: constructs, types …
    data Message  = Message Head Body
    data Head     = Head Metadata Title
    data Metadata = Metadata [Tags]
    type Title    = String
    …
Evolution • Successful systems are long lived … • … and evolve continuously. • Supporting evolution of code and design?
Soft-Ware • There’s no single correct design … • … different options for different situations. • Maintain flexibility as the system evolves.
Refactoring • Refactoring means changing the design or structure of a program … without changing its behaviour. • [Diagram: Modify / Refactor]
Not just programming • Paper or presentation • moving sections about; amalgamate sections; move inline code to a figure; animation; … • Proof • add lemma; remove, amalgamate hypotheses, … • Program • the topic of the lecture
Splitting a function
    module Split where

    f :: [String] -> String
    f ys = foldr (++) [] [ y++"\n" | y <- ys ]
Splitting a function
    module Split where

    f :: [String] -> String
    f ys = join [y ++ "\n" | y <- ys]
      where
        join = foldr (++) []
Splitting a function
    module Split where

    f :: [String] -> String
    f ys = join addNL
      where
        join zs = foldr (++) [] zs
        addNL   = [y ++ "\n" | y <- ys]
Splitting a function
    module Split where

    f :: [String] -> String
    f ys = join (addNL ys)
      where
        join zs  = foldr (++) [] zs
        addNL ys = [y ++ "\n" | y <- ys]
Splitting a function
    module Split where

    f :: [String] -> String
    f ys = join (addNL ys)

    join zs  = foldr (++) [] zs

    addNL ys = [y ++ "\n" | y <- ys]
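The sequence above should leave the behaviour of f unchanged. A minimal sketch of how one might check that by hand, assuming the first and last versions are placed side by side under the hypothetical names fBefore and fAfter (the helper agreeOn is also an illustration, not part of the talk):

```haskell
-- Before the refactoring: a single monolithic definition.
fBefore :: [String] -> String
fBefore ys = foldr (++) [] [ y ++ "\n" | y <- ys ]

-- After the refactoring: the two helpers live at the top level.
fAfter :: [String] -> String
fAfter ys = join (addNL ys)

join :: [[a]] -> [a]
join zs = foldr (++) [] zs

addNL :: [String] -> [String]
addNL ys = [ y ++ "\n" | y <- ys ]

-- Hypothetical helper: do both versions agree on every sample input?
agreeOn :: [[String]] -> Bool
agreeOn = all (\ys -> fBefore ys == fAfter ys)
```

Testing on samples only gives evidence, not proof; a tool has to guarantee the equivalence by checking conditions on the program text.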
Overview • Example refactorings: what they involve. • Building the HaRe tool. • Design rationale. • Infrastructure. • Haskell and Erlang. • The Wrangler tool. • Conclusions.
Haskell 98
• Standard, lazy, strongly typed, functional programming language.
• Layout is significant … “offside rule” … and idiosyncratic.
    doSwap pnt = applyTP (full_buTP (idTP `adhocTP` inMatch
                                          `adhocTP` inExp
                                          `adhocTP` inDecl))
      where
        inMatch ((HsMatch loc fun pats rhs ds)::HsMatchP)
          | fun == pnt
          = case pats of
              (p1:p2:ps) -> do pats' <- swap p1 p2 pats
                               return (HsMatch loc fun pats' rhs ds)
              _          -> error "Insufficient arguments to swap."
        inMatch m = return m

        inExp exp@((Exp (HsApp (Exp (HsApp e e1)) e2))::HsExpP)
          | expToPNT e == pnt
          = swap e1 e2 exp
        inExp e = return e
Why refactor Haskell? • The only design artefact is (in) the code. • Semantics of functional languages support large-scale transformations (?) • Building real tools to support functional programming … heavy lifting. • Platform for research and experimentation.
Lift / demote
    f x y = … h …
      where
        h = …
• Hide a function which is clearly subsidiary to f; clear up the namespace.
    f x y = … (h y) …

    h y = …
• Makes h accessible to the other functions in the module and beyond.
• Free variables: which parameters of f are used in h?
• Need h not to be defined at the top level, …
• Type of h will generally change …
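A concrete instance of lifting may help. The sketch below uses made-up arithmetic (the bodies and the names fLocal, fLifted and hLifted are assumptions for illustration): the local h uses the parameter y of its enclosing function as a free variable, so lifting turns y into an explicit argument and changes the type of h from Int to Int -> Int.

```haskell
-- Demoted form: h is local and refers to f's parameter y directly.
fLocal :: Int -> Int -> Int
fLocal x y = x + h
  where
    h = y * y          -- y is free in h

-- Lifted form: the free variable y has become a parameter of h.
fLifted :: Int -> Int -> Int
fLifted x y = x + hLifted y

-- The lifted function is now visible to the rest of the module.
hLifted :: Int -> Int
hLifted y = y * y
```

Demoting goes the other way and is only possible when every use of the helper sits inside the one function it is moved into.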
Algebraic or abstract type?
    data Tr a
      = Leaf a |
        Node a (Tr a) (Tr a)

    flatten :: Tr a -> [a]
    flatten (Leaf x) = [x]
    flatten (Node _ s t)
      = flatten s ++
        flatten t
• Visible names: Tr, Leaf, Node
Algebraic or abstract type?
• Visible names: Tr, isLeaf, isNode, leaf, left, right, mkLeaf, mkNode
    data Tr a
      = Leaf a |
        Node a (Tr a) (Tr a)

    isLeaf = …
    isNode = …
    …

    flatten :: Tr a -> [a]
    flatten t
      | isLeaf t = [leaf t]
      | isNode t
        = flatten (left t)
          ++ flatten (right t)
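The slide elides the bodies of the discriminators, selectors and constructor functions. One plausible completion is sketched below; every definition other than the data type and the shape of flatten is an assumption, including the argument order of mkNode.

```haskell
data Tr a = Leaf a | Node a (Tr a) (Tr a)

-- Discriminators.
isLeaf, isNode :: Tr a -> Bool
isLeaf (Leaf _) = True
isLeaf _        = False
isNode = not . isLeaf

-- Selectors (partial: each fails on the wrong constructor).
leaf :: Tr a -> a
leaf (Leaf x) = x
leaf _        = error "leaf: not a Leaf"

left, right :: Tr a -> Tr a
left (Node _ l _) = l
left _            = error "left: not a Node"
right (Node _ _ r) = r
right _            = error "right: not a Node"

-- Constructor functions, so clients never name Leaf or Node.
mkLeaf :: a -> Tr a
mkLeaf = Leaf

mkNode :: a -> Tr a -> Tr a -> Tr a
mkNode = Node

-- Client code written against the abstract interface only.
flatten :: Tr a -> [a]
flatten t
  | isLeaf t  = [leaf t]
  | otherwise = flatten (left t) ++ flatten (right t)
```

With this interface the representation of Tr can change without touching flatten, at the cost of losing pattern matching in client code.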
Information required • Lexical structure of programs, • abstract syntax, • binding structure, • type system and • module system.
Program transformations
• Program optimisation: source-to-source transformations to get more efficient code.
• Program derivation: calculating efficient code from obviously correct specifications.
• Refactoring: transforming code structure; usually bidirectional and conditional.
• Refactoring = Transformation + Condition
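The slogan "Refactoring = Transformation + Condition" can be read as a data type. The sketch below is an illustration only, not HaRe's actual representation: the record, applyRefactoring and renameTop are hypothetical names, and a "program" is reduced to a list of top-level names.

```haskell
-- A refactoring pairs a side condition with a transformation.
data Refactoring p = Refactoring
  { condition :: p -> Bool   -- must hold for the change to be safe
  , transform :: p -> p      -- the structural change itself
  }

-- Apply the transformation only when the condition holds.
applyRefactoring :: Refactoring p -> p -> Maybe p
applyRefactoring r p
  | condition r p = Just (transform r p)
  | otherwise     = Nothing

-- Toy instance: rename a top-level name, refusing if the new name
-- is already taken or the old name is not defined.
renameTop :: String -> String -> Refactoring [String]
renameTop old new = Refactoring
  { condition = \names -> old `elem` names && new `notElem` names
  , transform = map (\n -> if n == old then new else n)
  }
```

For example, `applyRefactoring (renameTop "f" "g") ["f", "h"]` succeeds, while the same call on `["f", "g"]` is rejected by the condition.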
Conditions: renaming f to g • “No change to the binding structure” • No two definitions of g at the same level. • No capture of g. • No capture by g.
Capture of renamed identifier
Before:
    h x = … h … f … g …
      where
        g y = …

    f x = …
After:
    h x = … h … g … g …
      where
        g y = …

    g x = …
Capture by renamed identifier
Before:
    h x = … h … f … g …
      where
        f y = … f … g …

    g x = …
After:
    h x = … h … g … g …
      where
        g y = … g … g …

    g x = …
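Both kinds of capture can be detected mechanically. The sketch below works on a toy lambda calculus rather than Haskell 98, so the type Expr and the functions freeVars, capturedUnder and renameBinder are assumptions made for illustration; HaRe's own checks operate on the full Haskell syntax tree.

```haskell
import Data.List (union)

data Expr = Var String | App Expr Expr | Lam String Expr
  deriving (Eq, Show)

freeVars :: Expr -> [String]
freeVars (Var v)   = [v]
freeVars (App f e) = freeVars f `union` freeVars e
freeVars (Lam x e) = filter (/= x) (freeVars e)

-- Does some free occurrence of old lie under a binder for new?
capturedUnder :: String -> String -> Expr -> Bool
capturedUnder old new = go
  where
    go (Var _)   = False
    go (App f e) = go f || go e
    go (Lam x e)
      | x == old  = False                    -- old is shadowed below here
      | x == new  = old `elem` freeVars e    -- renamed uses would be captured
      | otherwise = go e

-- Rename the binder of an outermost lambda, or refuse.
renameBinder :: String -> Expr -> Maybe Expr
renameBinder new (Lam old body)
  | new == old                 = Just (Lam old body)
  | new `elem` freeVars body   = Nothing     -- capture BY the renamed binder
  | capturedUnder old new body = Nothing     -- capture OF the renamed identifier
  | otherwise                  = Just (Lam new (subst body))
  where
    subst (Var v) | v == old  = Var new
                  | otherwise = Var v
    subst (App f e) = App (subst f) (subst e)
    subst (Lam x e) | x == old  = Lam x e    -- inner binder shadows old
                    | otherwise = Lam x (subst e)
renameBinder _ e = Just e                    -- nothing to rename
```

The two Nothing branches correspond to the two slides: an existing use of g that the renamed definition would now bind, and a use of the renamed identifier that an inner g would now bind.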
Refactoring by hand? • By hand = in a text editor • Tedious • Error-prone • Implementing the transformation … • … and the conditions. • Depends on compiler for type checking, … • … plus extensive testing.
Machine support invaluable • Reliable • Low cost of do / undo, even for large refactorings. • Increased effectiveness and creativity.
The refactorings in HaRe • Rename • Delete • Lift / Demote • Introduce definition • Remove definition • Unfold • Generalise • Add/remove parameters • Move def between modules • Delete/add to exports • Clean imports • Make imports explicit • data type to ADT • Short-cut, warm fusion • All module aware
HaRe design rationale • Integrate with existing development tools. • Work with the complete language: Haskell 98 • Preserve comments and the formatting style. • Reuse existing libraries and systems. • Extensibility and scriptability.
Information required • Lexical structure of programs, • abstract syntax, • binding structure, • type system and • module system.
The Implementation of HaRe • Information gathering • Pre-condition checking • Program transformation • Program rendering • [Diagram label: Strafunski]
Finding free variables ‘by hand’
    instance FreeVbls HsExp where
      freeVbls (HsVar v) = [v]
      freeVbls (HsApp f e)
        = freeVbls f ++ freeVbls e
      freeVbls (HsLambda ps e)
        = freeVbls e \\ concatMap paramNames ps
      freeVbls (HsCase exp cases)
        = freeVbls exp ++ concatMap freeVbls cases
      freeVbls (HsTuple _ es)
        = concatMap freeVbls es
      …
• Boilerplate code: 1000 noise : 100 significant.
Strafunski • Strafunski allows a user to write general (read generic), type-safe, tree-traversing programs, with ad hoc behaviour at particular points. • Traversal schemes: top-down / bottom-up, type-preserving / type-unifying, full / stop / one.
Strafunski in use • Traverse the tree accumulating free variables from components, except in the case of lambda abstraction, local scopes, … • Strafunski allows us to work within Haskell … • Other options? Generic Haskell, Template Haskell, AG, Scrap Your Boilerplate, …
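The idea of "generic traversal plus ad hoc cases" can be sketched without Strafunski. The example below swaps in `Data.Data` from GHC's base library (the Scrap Your Boilerplate style named above) and a toy Expr type; neither is what HaRe uses, and the names are assumptions for illustration. Only the variable and lambda cases are written by hand; every other constructor is handled by the generic fallback.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Data (Data, gmapQ)
import Data.Typeable (cast)
import Data.List (union)

-- Toy syntax tree; Tuple stands in for the many "boring" constructors.
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr
  | Tuple [Expr]
  deriving (Data)

-- Type-unifying traversal: collect free variables from any subterm.
freeVbls :: Data a => a -> [String]
freeVbls x = case cast x :: Maybe Expr of
  Just (Var v)   -> [v]                            -- ad hoc case: a use
  Just (Lam p e) -> filter (/= p) (freeVbls e)     -- ad hoc case: a binder
  _              -> foldr union [] (gmapQ freeVbls x)  -- generic: combine children
```

Adding a constructor to Expr needs no change to freeVbls unless it introduces a new scope, which is the saving the boilerplate ratio on the earlier slide points at.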
Rename an identifier
    rename :: (Term t) => PName -> HsName -> t -> Maybe t
    rename oldName newName = applyTP worker
      where
        worker = full_tdTP (idTP `adhocTP` idSite)

        idSite :: PName -> Maybe PName
        idSite v@(PN name orig)
          | v == oldName
          = return (PN newName orig)
        idSite pn = return pn
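For readers without Strafunski installed, the same shape (a full top-down, type-preserving traversal with one ad hoc case) can be approximated with `Data.Data` from base. This is a sketch under assumptions: Name, Expr, everywhereTD and mkT are illustrative stand-ins, and unlike PName the toy Name carries no origin information, so the rename below is purely textual and ignores scope.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
import Data.Data (Data, gmapT)
import Data.Typeable (Typeable, cast)
import Data.Maybe (fromMaybe)

newtype Name = Name String
  deriving (Eq, Show, Data)

data Expr = Var Name | App Expr Expr | Lam Name Expr
  deriving (Eq, Show, Data)

-- Full top-down traversal: transform a node, then its children.
everywhereTD :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhereTD f x = gmapT (everywhereTD f) (f x)

-- Lift a function on one type to a function on all types:
-- identity everywhere except where the types match.
mkT :: (Typeable a, Typeable b) => (b -> b) -> a -> a
mkT f = fromMaybe id (cast f)

-- Rename every occurrence of old to new (no scope analysis).
rename :: Data t => Name -> Name -> t -> t
rename old new = everywhereTD (mkT idSite)
  where
    idSite :: Name -> Name
    idSite n
      | n == old  = new
      | otherwise = n
```

The transformation itself is a few lines; what is missing here, and what the tool has to supply, is the check that the rename preserves the binding structure.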
The coding effort • Transformations: straightforward in Strafunski … • … the chore is implementing conditions that the transformation preserves meaning. • This is where much of our code lies.
Program rendering example
Original:
    -- This is an example
    module Main where

    sumSquares x y = sq pow x + sq pow y
             where pow = 2 :: Int

    sq :: Int->Int->Int
    sq pow x = x ^ pow

    main = sumSquares 10 20
Comment and layout lost:
    module Main where
    sumSquares x y
      = sq pow x + sq pow y where pow = 2 :: Int
    sq :: Int->Int->Int
    sq pow x = x ^ pow
    main = sumSquares 10 20
Comment and layout preserved after refactoring:
    -- This is an example
    module Main where

    sumSquares x y = sq x + sq y
             where sq :: Int->Int
                   sq x = x ^ pow
                   pow  = 2 :: Int

    main = sumSquares 10 20
Token stream and AST • White space + comments only in token stream. • Modification of the AST guides the modification of the token stream. • After a refactoring, the program source is recovered from the token stream not the AST. • Heuristics associate comments with program entities.
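Why the token stream matters can be seen in miniature. The sketch below is a toy, not HaRe's lexer: Token, tokenize, render and renameTok are hypothetical, only `--` line comments are recognised, and numerals are lumped in with identifiers. Because white space and comments are kept as tokens, rendering the untouched stream reproduces the source exactly, and a rename edits only the identifier tokens.

```haskell
import Data.Char (isAlphaNum)

-- Identifiers are kept apart from everything else
-- (white space, symbols, comments).
data Token = Ident String | Other String
  deriving (Eq, Show)

tokenize :: String -> [Token]
tokenize [] = []
tokenize ('-':'-':rest) =                 -- a line comment is one opaque token
  let (c, rest') = break (== '\n') rest
  in Other ("--" ++ c) : tokenize rest'
tokenize s@(c:_)
  | isIdChar c = let (w, rest) = span isIdChar s  in Ident w : tokenize rest
  | otherwise  = let (w, rest) = break isIdChar s in Other w : tokenize rest
  where
    isIdChar ch = isAlphaNum ch || ch == '_' || ch == '\''

-- Rendering is plain concatenation, so layout survives untouched.
render :: [Token] -> String
render = concatMap text
  where
    text (Ident s) = s
    text (Other s) = s

-- Rename at the token level; comments and spacing are not affected.
renameTok :: String -> String -> [Token] -> [Token]
renameTok old new = map step
  where
    step (Ident s) | s == old = Ident new
    step t = t
```

In the real tool the decision about which tokens to change comes from the AST, which knows about scope; the token stream only supplies the original layout.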
Work in progress • ‘Fold’ against definitions … find duplicate code. • All, some or one? Effect on the interface … • f x = … e … e … • Symbolic evaluation • Data refactorings • Interfaces … ‘bad smell’ detection.
API and DSL • [Layer diagram; labels: Combining forms, ???, Refactorings, Refactoring utilities, Library functions, Grammar as data, Strafunski, Haskell]
What have we learned? • Efficiency and robustness of libraries in question. • type checking large systems, • linking, • editor script languages (vim, emacs). • The cost of infrastructure in building practical tools. • Reflections on Haskell itself.