
Scrap your boilerplate: generic programming in Haskell



Presentation Transcript


  1. Scrap your boilerplate: generic programming in Haskell
     Ralf Lämmel, Vrije Universiteit
     Simon Peyton Jones, Microsoft Research

  2. The problem: boilerplate code
     [Figure: a company organisation tree with nodes Company, Dept “Research”, Dept “Production”, Dept “Devt”, Dept “Manuf”, Manager, Employee, and people “Bill” (£15k) and “Fred” (£10k).]
     Task: find all people in the tree and increase their salary by 10%.

  3. The problem: boilerplate code

       data Company  = C [Dept]
       data Dept     = D Name Manager [SubUnit]
       data SubUnit  = PU Employee | DU Dept
       data Employee = E Person Salary
       data Person   = P Name Address
       data Salary   = S Float
       type Manager  = Employee
       type Name     = String
       type Address  = String

       incSal :: Float -> Company -> Company

  4. The problem: boilerplate code

       incSal :: Float -> Company -> Company
       incSal k (C ds) = C (map (incD k) ds)

       incD :: Float -> Dept -> Dept
       incD k (D n m us) = D n (incE k m) (map (incU k) us)

       incU :: Float -> SubUnit -> SubUnit
       incU k (PU e) = PU (incE k e)
       incU k (DU d) = DU (incD k d)

       incE :: Float -> Employee -> Employee
       incE k (E p s) = E p (incS k s)

       incS :: Float -> Salary -> Salary
       incS k (S f) = S (k*f)

  5. Boilerplate is bad
     • Boilerplate is tedious to write
     • Boilerplate is fragile: it needs to be changed when the data type changes (“schema evolution”)
     • Boilerplate obscures the key bits of code

  6. Getting rid of boilerplate
     • Use an un-typed language, with a fixed collection of data types
     • Convert to a universal type and write (untyped) traversals over that
     • Use “reflection” to query types and traverse child nodes

  7. Getting rid of boilerplate
     • Generic (aka polytypic) programming: define a function by induction over the (structure of the) type of its argument
     • PhD required. Elegant only for “totally generic” functions (read, show, equality)

       generic inc<t> :: Float -> t -> t
       inc<1>   k Unit    = Unit
       inc<a+b> k (Inl x) = Inl (inc<a> k x)
       inc<a+b> k (Inr y) = Inr (inc<b> k y)
       inc<a*b> k (x, y)  = (inc<a> k x, inc<b> k y)

  8. Our solution: generic programming for the rest of us
     • Typed language
     • Works for arbitrary data types: parameterised, mutually recursive, nested...
     • No encoding to/from some other type
     • Very modest language support
     • Elegant application of Haskell's type classes

  9. Our solution

       incSal :: Float -> Company -> Company
       incSal k = everywhere (mkT (incS k))

       incS :: Float -> Salary -> Salary
       incS k (S f) = S (k*f)
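The two-line solution above can be tried end to end. The following is a minimal sketch that depends only on `base`: it defines `mkT` and `everywhere` locally, following the definitions given later in the slides, rather than importing them from a generics library. The `sample` value and its addresses are made up for illustration.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
module Main where

import Data.Data (Data, gmapT)
import Data.Typeable (Typeable, cast)

-- The company data types from the slides, with compiler-derived Data instances.
data Company  = C [Dept]                 deriving (Show, Eq, Data)
data Dept     = D Name Manager [SubUnit] deriving (Show, Eq, Data)
data SubUnit  = PU Employee | DU Dept    deriving (Show, Eq, Data)
data Employee = E Person Salary          deriving (Show, Eq, Data)
data Person   = P Name Address           deriving (Show, Eq, Data)
data Salary   = S Float                  deriving (Show, Eq, Data)
type Manager  = Employee
type Name     = String
type Address  = String

-- Type extension: behave like f where the types match, like id elsewhere.
mkT :: (Typeable a, Typeable b) => (a -> a) -> b -> b
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

-- Bottom-up traversal: transform the children first, then the node itself.
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)

incSal :: Float -> Company -> Company
incSal k = everywhere (mkT (incS k))

incS :: Float -> Salary -> Salary
incS k (S f) = S (k * f)

-- Hypothetical sample data, not taken from the slides.
sample :: Company
sample =
  C [ D "Research" (E (P "Fred" "Oxford") (S 10000))
        [PU (E (P "Bill" "London") (S 15000))] ]

main :: IO ()
main = print (incSal 1.1 sample)
```

Only `incS` mentions a constructor of the company types, so adding a field to `Dept` or `Person` would leave `incSal` unchanged.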

  10. Two ingredients

       incSal :: Float -> Company -> Company
       incSal k = everywhere (mkT (incS k))

       incS :: Float -> Salary -> Salary
       incS k (S f) = S (k*f)

     1. Build the function to apply to every node, from incS (this is mkT)
     2. Apply a function to every node in the tree (this is everywhere)

  11. Type classes

       member :: a -> [a] -> Bool
       member x [] = False
       member x (y:ys) | x==y      = True
                       | otherwise = member x ys

     No! member is not truly polymorphic: it does not work for any type a, only for those on which equality is defined.

  12. Type classes

       member :: Eq a => a -> [a] -> Bool
       member x [] = False
       member x (y:ys) | x==y      = True
                       | otherwise = member x ys

     The class constraint "Eq a" says that member only works on types that belong to class Eq.

  13. Type classes

       class Eq a where
         (==) :: a -> a -> Bool

       instance Eq Int where
         (==) i1 i2 = eqInt i1 i2

       instance (Eq a) => Eq [a] where
         (==) []     []     = True
         (==) (x:xs) (y:ys) = (x == y) && (xs == ys)
         (==) xs     ys     = False

       member :: Eq a => a -> [a] -> Bool
       member x [] = False
       member x (y:ys) | x==y      = True
                       | otherwise = member x ys

  14. Implementing type classes
     • A class is witnessed by a “dictionary” of methods
     • Instance declarations create dictionaries
     • Overloaded functions take extra dictionary parameter(s)

       data Eq a = MkEq (a->a->Bool)
       eq (MkEq e) = e

       dEqInt :: Eq Int
       dEqInt = MkEq eqInt

       dEqList :: Eq a -> Eq [a]
       dEqList (MkEq e) = MkEq el
         where el []     []     = True
               el (x:xs) (y:ys) = x `e` y && xs `el` ys
               el xs     ys     = False

       member :: Eq a -> a -> [a] -> Bool
       member d x [] = False
       member d x (y:ys) | eq d x y  = True
                         | otherwise = member d x ys
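The dictionary translation on this slide cannot be loaded as written, because `data Eq a` clashes with the Prelude's `Eq` class and `eqInt` is not in scope. Here is a runnable sketch under two assumptions of mine: the dictionary type is renamed to `EqDict`, and the built-in `(==)` on `Int` stands in for `eqInt`.

```haskell
module Main where

-- A dictionary for an equality class: a value holding the single method.
-- (EqDict is a made-up name, chosen to avoid clashing with the Prelude's Eq.)
newtype EqDict a = MkEq (a -> a -> Bool)

-- Method selection: project the equality function out of the dictionary.
eq :: EqDict a -> a -> a -> Bool
eq (MkEq e) = e

-- "instance Eq Int" becomes a dictionary value.
dEqInt :: EqDict Int
dEqInt = MkEq (==)   -- stands in for the primitive eqInt

-- "instance Eq a => Eq [a]" becomes a function from dictionaries to dictionaries.
dEqList :: EqDict a -> EqDict [a]
dEqList (MkEq e) = MkEq el
  where
    el []     []     = True
    el (x:xs) (y:ys) = x `e` y && xs `el` ys
    el _      _      = False

-- The overloaded function takes the dictionary as an explicit extra argument.
member :: EqDict a -> a -> [a] -> Bool
member _ _ [] = False
member d x (y:ys)
  | eq d x y  = True
  | otherwise = member d x ys

main :: IO ()
main = do
  print (member dEqInt 3 [1, 2, 3])
  print (member (dEqList dEqInt) [1, 2] [[1], [2, 1]])
```

The second call shows how a dictionary for `[Int]` is built on demand by applying `dEqList` to `dEqInt`.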

  15. Ingredient 1: type extension
     (mkT f) is a function that
     • behaves just like f on arguments whose type is compatible with f's,
     • behaves like the identity function on all other arguments
     So applying (mkT (incS k)) to all nodes in the tree will do what we want.

  16. Type safe cast

       cast :: (Typeable a, Typeable b) => a -> Maybe b

       ghci> (cast 'a') :: Maybe Char
       Just 'a'
       ghci> (cast 'a') :: Maybe Bool
       Nothing
       ghci> (cast True) :: Maybe Bool
       Just True

  17. Type extension

       mkT :: (Typeable a, Typeable b) => (a->a) -> (b->b)
       mkT f = case cast f of
                 Just g  -> g
                 Nothing -> id

       ghci> (mkT not) True
       False
       ghci> (mkT not) 'a'
       'a'
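The ghci sessions for `cast` and `mkT` can be reproduced with the `cast` function exported by `Data.Typeable` in `base`. The sketch below repeats the slide's definition of `mkT` so that it depends on nothing else.

```haskell
module Main where

import Data.Typeable (Typeable, cast)

-- Type extension, as defined on the slide: cast the function itself.
-- The cast succeeds exactly when (a -> a) and (b -> b) are the same type.
mkT :: (Typeable a, Typeable b) => (a -> a) -> b -> b
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

main :: IO ()
main = do
  print (cast 'a' :: Maybe Char)   -- same type: the cast succeeds
  print (cast 'a' :: Maybe Bool)   -- different type: the cast fails
  print (mkT not True)             -- types match, so not is applied
  print (mkT not 'a')              -- types differ, so the identity is used
```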

  18. Implementing cast

       data TypeRep                 -- an Int, perhaps
       instance Eq TypeRep
       mkRep :: String -> [TypeRep] -> TypeRep

       class Typeable a where
         typeOf :: a -> TypeRep     -- guaranteed not to evaluate its argument

       instance Typeable Int where
         typeOf i = mkRep "Int" []

  19. Implementing cast

       class Typeable a where
         typeOf :: a -> TypeRep

       instance (Typeable a, Typeable b) => Typeable (a,b) where
         typeOf p = mkRep "(,)" [ta,tb]
           where ta = typeOf (fst p)
                 tb = typeOf (snd p)

  20. Implementing cast

       cast :: (Typeable a, Typeable b) => a -> Maybe b
       cast x = r
         where
           r = if typeOf x == typeOf (get r)
                 then Just (unsafeCoerce x)
                 else Nothing

           get :: Maybe a -> a
           get x = undefined
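The definition of `cast` can be exercised against the `Typeable` class that ships with `base`. This is a sketch of the technique on the slide, not the library's actual implementation; the function is named `cast'` here (my choice) to avoid clashing with the library's `cast`.

```haskell
module Main where

import Data.Typeable (Typeable, typeOf)
import Unsafe.Coerce (unsafeCoerce)

-- Compare the run-time type representations of the argument and the
-- requested result; coerce only when they are equal.
cast' :: (Typeable a, Typeable b) => a -> Maybe b
cast' x = r
  where
    r = if typeOf x == typeOf (get r)
          then Just (unsafeCoerce x)
          else Nothing

    -- Used only to name the result type; typeOf never evaluates it.
    get :: Maybe c -> c
    get _ = undefined

main :: IO ()
main = do
  print (cast' 'a' :: Maybe Char)
  print (cast' 'a' :: Maybe Bool)
```

The use of `unsafeCoerce` is safe only because the preceding comparison has established that the two types are identical, which is why the soundness of `cast` rests on the `Typeable` instances being trustworthy.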

  21. Implementing cast
     • In GHC:
         • Typeable instances are generated automatically by the compiler for any data type
         • The definition of cast is in a library
       Then cast is sound
     • Bottom line: cast is best thought of as a language extension, but it is an easy one to implement. All the hard work is done by type classes

  22. Two ingredients

       incSal :: Float -> Company -> Company
       incSal k = everywhere (mkT (incS k))

       incS :: Float -> Salary -> Salary
       incS k (S f) = S (k*f)

     1. Build the function to apply to every node, from incS (this is mkT)
     2. Apply a function to every node in the tree (this is everywhere)

  23. Ingredient 2: traversal
     • Step 1: implement one-layer traversal
     • Step 2: extend one-layer traversal to recursive traversal of the entire tree

  24. One-layer traversal

       class Typeable a => Data a where
         gmapT :: (forall b. Data b => b -> b) -> a -> a

       instance Data Int where
         gmapT f x = x

       instance (Data a, Data b) => Data (a,b) where
         gmapT f (x,y) = (f x, f y)

     (gmapT f x) applies f to the IMMEDIATE CHILDREN of x

  25. One-layer traversal

       class Typeable a => Data a where
         gmapT :: (forall b. Data b => b -> b) -> a -> a

       instance (Data a) => Data [a] where
         gmapT f []     = []
         gmapT f (x:xs) = f x : f xs   -- !!! f xs, not gmapT f xs: the tail is an immediate child

     gmapT's argument is a polymorphic function, so gmapT has a rank-2 type
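The "immediate children only" behaviour is easy to observe with the `gmapT` method that `Data.Data` in `base` provides for pairs and lists. In this sketch `mkT` is defined locally as on the earlier slide, and `inc` is a made-up helper.

```haskell
module Main where

import Data.Data (gmapT)
import Data.Typeable (Typeable, cast)

-- Type extension, as on the earlier slide.
mkT :: (Typeable a, Typeable b) => (a -> a) -> b -> b
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

-- Hypothetical helper: increment an Int.
inc :: Int -> Int
inc = (+ 1)

main :: IO ()
main = do
  -- Both components of a pair are immediate children.
  print (gmapT (mkT inc) (1 :: Int, 2 :: Int))
  -- A non-empty list is a cons cell whose children are the head and the
  -- tail, so only the head is an Int child; the tail is a [Int] and is
  -- left alone by mkT inc.
  print (gmapT (mkT inc) [1, 2, 3 :: Int])
```

The list case is the reason a recursive combinator is needed on top of `gmapT` to reach every element.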

  26. Step 2: Now traversals are easy!

       everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
       everywhere f x = f (gmapT (everywhere f) x)

  27. Many different traversals!

       everywhere, everywhere' :: Data a => (forall b. Data b => b -> b) -> a -> a

       everywhere  f x = f (gmapT (everywhere f) x)      -- Bottom up
       everywhere' f x = gmapT (everywhere' f) (f x)     -- Top down
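The bottom-up and top-down traversals can give different results for the same rewrite. The sketch below uses a made-up `Expr` type and a made-up `step` rule that cancels a double negation; everything else follows the slide definitions and depends only on `base`.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
module Main where

import Data.Data (Data, gmapT)
import Data.Typeable (Typeable, cast)

-- Hypothetical expression type, used only for this illustration.
data Expr = Lit Int | Neg Expr
  deriving (Show, Eq, Data)

mkT :: (Typeable a, Typeable b) => (a -> a) -> b -> b
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

-- Bottom up: rewrite the children, then the node.
everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere f x = f (gmapT (everywhere f) x)

-- Top down: rewrite the node, then the children of the result.
everywhere' :: Data a => (forall b. Data b => b -> b) -> a -> a
everywhere' f x = gmapT (everywhere' f) (f x)

-- One rewrite step: cancel a double negation at the root.
step :: Expr -> Expr
step (Neg (Neg e)) = e
step e             = e

fourNegs :: Expr
fourNegs = Neg (Neg (Neg (Neg (Lit 1))))

main :: IO ()
main = do
  print (everywhere  (mkT step) fourNegs)
  print (everywhere' (mkT step) fourNegs)
```

Bottom up, each node is rewritten after its subterm has already been simplified, so all four negations cancel. Top down, the root loses two negations and the traversal then descends into the result without revisiting its root, so a `Neg (Neg ...)` pair is left behind.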

  28. More perspicuous types

       everywhere :: Data a => (forall b. Data b => b -> b) -> a -> a

     is the same as

       everywhere :: (forall b. Data b => b -> b) -> (forall a. Data a => a -> a)

     and, with a type synonym,

       type GenericT = forall a. Data a => a -> a
       everywhere :: GenericT -> GenericT

     Aha!

  29. What is "really going on"?

       inc :: Data t => Float -> t -> t

     • The magic of type classes passes an extra argument to inc that contains:
         • The function gmapT
         • The function typeOf
     • A call of (mkT incS), done at every node in the tree, entails a comparison of the TypeRep returned by the passed-in typeOf with a fixed TypeRep for Salary; this is precisely a dynamic type check

  30. Summary so far
     • Solution consists of:
         • A little user-written code
         • Mechanically generated instances for Typeable and Data for each data type
         • A library of combinators (cast, mkT, everywhere, etc)
     • Language support:
         • cast
         • rank-2 types
     • Efficiency is so-so (factor of 2-3 with no effort)

  31. Summary so far
     • Robust to data type evolution
     • Works easily for weird data types

       data Rose a = MkR a [Rose a]

       instance (Data a) => Data (Rose a) where
         gmapT f (MkR x rs) = MkR (f x) (f rs)

       data Flip a b = Nil | Cons a (Flip b a)
       -- Etc...

  32. Generalisations
     • With this same language support, we can do much more:
         • generic queries
         • generic monadic operations
         • generic folds
         • generic zips (e.g. equality)

  33. Generic queries
     • Add up the salaries of all the employees in the tree

       salaryBill :: Company -> Float
       salaryBill = everything (+) (0 `mkQ` billS)

       billS :: Salary -> Float
       billS (S f) = f

     1. Build the function to apply to every node, from billS (this is mkQ)
     2. Apply the function to every node in the tree, and combine the results with (+) (this is everything)

  34. Type extension again

       mkQ :: (Typeable a, Typeable b) => d -> (b->d) -> a -> d
       (d `mkQ` q) a = case cast a of
                         Just b  -> q b
                         Nothing -> d

     Apply 'q' if its type fits, otherwise return 'd'. Given ord :: Char -> Int:

       ghci> (22 `mkQ` ord) 'a'
       97
       ghci> (22 `mkQ` ord) True
       22

  35. Traversal again

       class Typeable a => Data a where
         gmapT :: (forall b. Data b => b -> b) -> a -> a
         gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]

     gmapQ applies a function to all children of this node, and collects the results in a list

  36. Traversal again

       class Typeable a => Data a where
         gmapT :: (forall b. Data b => b -> b) -> a -> a
         gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]

       instance Data Int where
         gmapQ f x = []

       instance (Data a, Data b) => Data (a,b) where
         gmapQ f (x,y) = [f x, f y]

  37. The query traversal

       everything :: Data a => (r->r->r) -> (forall b. Data b => b -> r) -> a -> r
       everything k f x = foldl k (f x) (gmapQ (everything k f) x)

     Note that the choice of foldr vs foldl is made in the traversal, not in gmapQ
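The query machinery (`mkQ`, `gmapQ`, `everything`) can be put together into a runnable salary bill. This is a sketch using only `base`; the data types are trimmed-down versions of the company types, and the sample department is made up.

```haskell
{-# LANGUAGE DeriveDataTypeable, RankNTypes #-}
module Main where

import Data.Data (Data, gmapQ)
import Data.Typeable (Typeable, cast)

-- Trimmed-down versions of the company types (an assumption made to keep
-- the example short).
data Dept     = D String [Employee] deriving (Show, Data)
data Employee = E String Salary     deriving (Show, Data)
data Salary   = S Float             deriving (Show, Data)

-- Query extension: apply q if the argument has q's domain type,
-- otherwise return the default d.
mkQ :: (Typeable a, Typeable b) => d -> (b -> d) -> a -> d
mkQ d q a = case cast a of
  Just b  -> q b
  Nothing -> d

-- Query every node and combine the results with k.
everything :: Data a => (r -> r -> r) -> (forall b. Data b => b -> r) -> a -> r
everything k f x = foldl k (f x) (gmapQ (everything k f) x)

billS :: Salary -> Float
billS (S f) = f

salaryBill :: Dept -> Float
salaryBill = everything (+) (0 `mkQ` billS)

-- Hypothetical sample data.
research :: Dept
research = D "Research" [E "Fred" (S 10000), E "Bill" (S 15000)]

main :: IO ()
main = print (salaryBill research)
```

Every node that is not a `Salary`, including each character of the name strings, contributes the default `0` to the sum.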

  38. Looking for one result
     • By making the result type be (Maybe r), we can find the first (or last) satisfying value [laziness]

       findDept :: String -> Company -> Maybe Dept
       findDept s = everything orElse (Nothing `mkQ` findD s)

       findD :: String -> Dept -> Maybe Dept
       findD s d@(D s' _ _) = if s==s' then Just d else Nothing

  39. Monadic transforms

       class Typeable a => Data a where
         gmapT :: (forall b. Data b => b -> b) -> a -> a
         gmapQ :: forall r. (forall b. Data b => b -> r) -> a -> [r]
         gmapM :: Monad m => (forall b. Data b => b -> m b) -> a -> m a

     • Uh oh! Where do we stop?

  40. Where do we stop?
     • Happily, we can generalise all three gmaps into one

       data Employee = E Person Salary

       instance Data Employee where
         gfoldl k z (E p s) = (z E `k` p) `k` s

     • We can define gmapT, gmapQ, gmapM in terms of (suitably parameterised) gfoldl
     • The type of gfoldl hurts the brain (but the definitions are all easy)

  41. Where do we stop?

       class Typeable a => Data a where
         gfoldl :: (forall a b. Data a => c (a -> b) -> a -> c b)
                -> (forall g. g -> c g)
                -> a -> c a
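To make the claim that the gmaps are instances of `gfoldl` concrete, here is a sketch that recovers `gmapT` and `gmapQ` from the `gfoldl` method in `Data.Data`. The names `gmapT'`, `gmapQ'`, and the `Q` wrapper are mine; the library defines its own versions differently in detail.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.Data (Data, gfoldl, showConstr, toConstr)
import Data.Functor.Identity (Identity (..))
import Data.Typeable (Typeable, cast)

-- gmapT: instantiate c to Identity, so gfoldl rebuilds the constructor
-- application with f applied to each child.
gmapT' :: Data a => (forall b. Data b => b -> b) -> a -> a
gmapT' f x =
  runIdentity (gfoldl (\(Identity c) y -> Identity (c (f y))) Identity x)

-- A constant functor that ignores the value being built and accumulates
-- query results instead.
newtype Q r a = Q [r]

-- gmapQ: instantiate c to Q r; results are accumulated in reverse order.
gmapQ' :: Data a => (forall b. Data b => b -> r) -> a -> [r]
gmapQ' f x =
  case gfoldl (\(Q rs) y -> Q (f y : rs)) (\_ -> Q []) x of
    Q rs -> reverse rs

-- Type extension, as on the earlier slide.
mkT :: (Typeable a, Typeable b) => (a -> a) -> b -> b
mkT f = case cast f of
  Just g  -> g
  Nothing -> id

main :: IO ()
main = do
  print (gmapT' (mkT (succ :: Int -> Int)) (1 :: Int, 2 :: Int))
  print (gmapQ' (showConstr . toConstr) (1 :: Int, True))
```

In both cases the second argument of `gfoldl` handles the constructor itself and the first argument handles each child in turn, which is the shape of `(z E `k` p) `k` s` on the previous slide.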

  42. But we still can't do show!
     • Want show :: Data a => a -> String

       show :: Data a => a -> String
       show t = ??? ++ concat (gmapQ show t)   -- show the children and concatenate the results

     But how to show the constructor?

  43. Add more to class Data

       class Data a where
         toConstr :: a -> Constr

       data Constr   -- abstract
       conString :: Constr -> String
       conFixity :: Constr -> Fixity

     • Very like typeOf :: Typeable a => a -> TypeRep, except only for data types, not functions

  44. So here is show

       show :: Data a => a -> String
       show t = conString (toConstr t) ++ concat (gmapQ show t)

     • Simple refinements to deal with parentheses, infix constructors etc
     • toConstr on a primitive type (like Int) yields a Constr whose conString displays the value
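A runnable variant of this generic show can be written against `Data.Data` in `base`, where the function the slides call `conString` is exported as `showConstr`. The sketch below adds parentheses and spaces as one of the "simple refinements"; the name `gshow` and the `Shape` type are made up for the example.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Data (Data, gmapQ, showConstr, toConstr)

-- Generic show: print the constructor name, then each child in turn.
-- Every subterm is parenthesised, which is crude but unambiguous.
gshow :: Data a => a -> String
gshow t =
  "(" ++ showConstr (toConstr t)
      ++ concatMap (' ' :) (gmapQ gshow t)
      ++ ")"

-- Hypothetical data type for the demonstration.
data Shape = Circle Int | Rect Int Int
  deriving Data

main :: IO ()
main = do
  putStrLn (gshow (Rect 2 3))
  putStrLn (gshow [Circle 1])
```

Lists are shown through their cons and nil constructors; treating strings and lists specially would be a further refinement.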

  45. Further generic functions
     • read :: Data a => String -> a
     • toBin :: Data a => a -> [Bit]
       fromBin :: Data a => [Bit] -> a
     • testGen :: Data a => RandomGen -> a

       class Data a where
         toConstr   :: a -> Constr
         fromConstr :: Constr -> a
         dataTypeOf :: a -> DataType

       data DataType   -- Abstract
       stringCon    :: DataType -> String -> Maybe Constr
       indexCon     :: DataType -> Int -> Constr
       dataTypeCons :: DataType -> [Constr]

  46. Conclusions
     • “Simple”, elegant
     • Modest language extensions:
         • Rank-2 types
         • Auto-generation of Typeable, Data instances
       Fully implemented in GHC
     • Shortcomings:
         • Stop conditions
         • Types are a bit uninformative
     Paper: http://research.microsoft.com/~simonpj
