Introduction to Compilation of Functional Languages. Wanhe Zhang, Computing and Software Department, McMaster University, 16 March 2004. Functional Programs. Based on the idea that a program is a function with one input parameter, its input, and one result, its output.
Introduction to Compilation of Functional Languages • Wanhe Zhang • Computing and Software Department, McMaster University • 16 March 2004
Functional Programs • Based on the idea that a program is a function with one input parameter, its input, and one result, its output. • Functional and imperative languages differ chiefly in efficiency considerations and readability. The factorial function is a standard illustration of the difference.
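The factorial figure from the original slide is not preserved. As a minimal sketch of the contrast it presumably drew, the first definition below is the declarative form, and the second imitates an imperative loop by threading an explicit accumulator. The names facLoop and go are illustrative and do not come from the slides; both definitions assume a non-negative argument.

```haskell
-- Declarative definition: reads like the mathematical specification.
-- Assumes n >= 0.
fac :: Integer -> Integer
fac 0 = 1
fac n = n * fac (n - 1)

-- Loop-style definition (hypothetical name). It mimics the imperative
-- "result = 1; while (i > 0) { result *= i; i--; }" by passing the
-- running product along as an accumulator.
facLoop :: Integer -> Integer
facLoop n = go n 1
  where
    go 0 acc = acc
    go i acc = go (i - 1) (acc * i)
```

Both functions compute the same values; the first is easier to read against the definition of factorial, while the second makes the evaluation order explicit, which is closer to what an imperative programmer would write.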
Compilation of Functional Languages • A short tour of Haskell • Compiling functional languages • Polymorphic type checking • Desugaring
Tour of Haskell • Function application syntax: f 11 13 1) No parentheses around the arguments, which allows currying to be expressed naturally. 2) Function application binds more tightly than any operator: g n+1 is parsed as (g n) + 1 rather than g (n+1).
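A small sketch of both points. The bodies chosen for f and g are assumptions made only so that the example can be evaluated; the slides do not define them.

```haskell
-- Hypothetical bodies for the slide's f and g.
f :: Int -> Int -> Int
f x y = x + y

g :: Int -> Int
g n = 2 * n

-- Application binds tighter than (+): this is (g 3) + 1.
withoutParens :: Int
withoutParens = g 3 + 1      -- 7

-- Parentheses are needed to pass 3 + 1 as the argument.
withParens :: Int
withParens = g (3 + 1)       -- 8

-- f 11 13 is really (f 11) 13, so f 11 on its own is a function.
addEleven :: Int -> Int
addEleven = f 11
```

Because no parentheses or commas surround the arguments, supplying only the first argument (f 11) is written exactly like the start of a full call.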
Tour of Haskell • Offside rule • Lists • List comprehension • Pattern matching • Polymorphic typing • Referential transparency • Higher-order functions • Lazy evaluation
Offside rule • divide x 0 = inf divide x y = x/y • An equation consists of a left-hand side, followed by the = token, followed by the right-hand side. There is no explicit token to mark the end of an equation, and treating every line break as a terminator would be inconvenient. • The offside rule controls the bounding box of an expression.
Offside rule • Everything below and to the right of the = token is defined to be part of the expression making up the right-hand side. • The right-hand side terminates before the first token that is 'offside', that is, to the left of the = position.
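A minimal sketch of the rule in action, using the divide equations from the previous slide. The definition of inf as 1 / 0 is an assumption (the slide leaves it undefined). Haskell's actual layout rule is stated relative to the indentation of the enclosing block rather than the = token; the continuation line here is kept to the right of = so that it satisfies both formulations.

```haskell
-- Assumed definition of the slide's inf: IEEE positive infinity.
inf :: Double
inf = 1 / 0

divide :: Double -> Double -> Double
divide x 0 = inf
divide x y = x
             / y     -- right of '=', so still part of the same right-hand side
-- The next token at the left margin is offside, so the equation above ends there.
```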
Lists • Haskell's polymorphic typing does not allow a list to contain elements of different types. • Examples: [], [1,2,3,4] = (1:(2:(3:(4:[])))), ["red", "yellow"], [1..10]
List Comprehension • Syntax that closely matches set notation. • s = [n^2 | n <- [1..100], odd n] • A list comprehension generates a list rather than a set: ordering matters and elements may occur multiple times. • It is convenient for generating new lists from old ones.
List Comprehension • qsort [] = [] qsort (x:xs) = qsort [y | y <- xs, y<x] ++ [x] ++ qsort [y | y <- xs, y >= x]
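The two comprehension examples can be laid out as a loadable module. The type signatures and the name oddSquares are additions for illustration; the equations themselves are those on the slides.

```haskell
-- Quicksort from the slide: two comprehensions partition the tail
-- around the pivot x.
qsort :: Ord a => [a] -> [a]
qsort []     = []
qsort (x:xs) = qsort [y | y <- xs, y < x]
               ++ [x]
               ++ qsort [y | y <- xs, y >= x]

-- Squares of the odd numbers from 1 to 100 (hypothetical name).
oddSquares :: [Integer]
oddSquares = [n ^ 2 | n <- [1 .. 100], odd n]
```

Duplicates in the input are preserved, which shows that the comprehensions produce lists and not sets: qsort [3,1,2,3] keeps both occurrences of 3.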
Pattern Matching • fac 0 = 1 fac n = n * fac (n-1) • Function equations are matched from top to bottom; the patterns in them are matched from left to right. • Pattern matching can be translated easily into equivalent definitions based on if-then-else constructs. • fac n = if (n == 0) then 1 else n * fac (n-1)
Polymorphic Typing • An expression is said to be polymorphic if it 'has many types'. • The empty list [] has many types: list of characters, list of numbers, and infinitely many others. • The main advantage of polymorphic typing is that functions and data structures can be reused for any desired type instance. • Type checking will be discussed later.
Referential Transparency • A fixed relation between inputs and output: f arg will produce the same output no matter what the overall state of the computation is. • In imperative languages, assignments to global variables and through pointers may cause two calls f arg to yield different results. • The advantage is that it simplifies program analysis and transformation. The drawback is that it prevents the programmer from writing space-efficient programs that use in-place updates. • add_one [] = [] add_one (x:xs) = x+1 : add_one xs • In an imperative language, we could update the input list in place.
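A short sketch of what the add_one example implies: the function builds a new list and leaves its argument untouched. The names original and incremented, and the choice of Int elements, are illustrative.

```haskell
add_one :: [Int] -> [Int]
add_one []     = []
add_one (x:xs) = x + 1 : add_one xs

original :: [Int]
original = [1, 2, 3]

-- A fresh list is constructed; 'original' still denotes [1, 2, 3].
incremented :: [Int]
incremented = add_one original
```

Since original cannot change, add_one original means the same thing wherever it appears, at the price of allocating a second list where an imperative program could have overwritten the first.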
Higher-order Functions • A higher-order function is a function that takes a function as an argument, or delivers one as a result. • Imperative languages barely support higher-order functions: functions may perhaps be passed as parameters, but new functions cannot be created. • Two ways to create new functions: 1) diff f = f_ where f_ x = (f (x + h) - f x) / h; h = 0.0001. Here diff returns as its result a 'new' function that is composed out of already existing functions. 2) diff f x = (f (x + h) - f x) / h where h = 0.0001. Here a new function is obtained by applying an existing function to fewer arguments than its arity; this is called currying.
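Both variants of diff can be sketched side by side. They are renamed diff1 and diff2 here only so that they can live in one module, and h is moved to the top level for sharing; square and dSquare are hypothetical helpers.

```haskell
-- Step size for the finite-difference approximation (from the slide).
h :: Double
h = 0.0001

-- Variant 1: explicitly builds and returns a new function f_.
diff1 :: (Double -> Double) -> (Double -> Double)
diff1 f = f_
  where
    f_ x = (f (x + h) - f x) / h

-- Variant 2: a two-argument function; supplying only f yields
-- the derivative function by partial application (currying).
diff2 :: (Double -> Double) -> Double -> Double
diff2 f x = (f (x + h) - f x) / h

square :: Double -> Double
square x = x * x

-- diff2 has arity 2 but receives one argument: the result is a function.
dSquare :: Double -> Double
dSquare = diff2 square
```

The two types are identical, because -> associates to the right; the variants differ only in how the definition is written.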
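Lazy Evaluation • Imperative languages largely prescribe when expressions are evaluated; for example, arguments must be evaluated before the function is called. • Lazy evaluation relaxes these constraints by specifying that a subexpression is evaluated only when its value is needed for the progress of the computation.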
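Two small illustrations of evaluation on demand; all names below are illustrative and not taken from the slides.

```haskell
-- Returns its first argument; the second is never needed.
first :: a -> b -> a
first x _ = x

-- An infinite list: only the demanded prefix is ever constructed.
naturals :: [Integer]
naturals = [0 ..]

firstFive :: [Integer]
firstFive = take 5 naturals

-- The undefined argument is never evaluated, so no error is raised.
safe :: Int
safe = first 42 (undefined :: Int)
```

Under strict evaluation, both firstFive and safe would fail: the first by never finishing the construction of naturals, the second by evaluating undefined before the call.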
Compiling Functional Languages • Overview of which compiler phase handles which aspect of Haskell (figure not preserved).
The Functional Core • It must be high-level enough to serve as an easy target for the front-end that compiles the syntactic sugar away. • It must be low-level enough to allow concise descriptions of optimizations, which are often expressed as a case analysis of all core constructs.
Functional Core of Haskell • Basic data types, including int, char, and bool; • (User-defined) structured data types; • Typed non-nesting functions; • Local bindings as part of let-expressions; • Expressions consisting of identifiers, arithmetic operators, if-then-else compounds, and function applications; • Higher-order functions (cannot be mapped directly onto C); • Lazy evaluation semantics (cannot be mapped directly onto C).
Polymorphic Type Checking • We illustrate this with an example: map f [] = [] map f (x : xs) = f x : map f xs • First equation: map :: a -> [b] -> [c] • Second equation: map :: (b -> c) -> [b] -> [c] • In the second equation, x is an element of the argument list of type [b], and f x is an element of map's result list of type [c], so f must have type b -> c.
Polymorphic Function Application • map :: (a -> b) -> [a] -> [b] length :: [c] -> Int map length • The type checker must unify the type of length, which is [c] -> Int, with the type of map's first argument, a -> b. This gives a = [c] and b = Int. • For this application, map :: ([c] -> Int) -> [[c]] -> [Int], and so map length :: [[c]] -> [Int]
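The unification result can be checked by writing the inferred type as an explicit signature and letting a compiler confirm it. This sketch defines its own map' and len so that the types are exactly those on the slide; in current GHC, the Prelude's length is generalized beyond lists, which would obscure the example. The name lengths is hypothetical.

```haskell
-- map with the type derived on the slides.
map' :: (a -> b) -> [a] -> [b]
map' _ []       = []
map' f (x : xs) = f x : map' f xs

-- A list-only length with the slide's type [c] -> Int.
len :: [c] -> Int
len []       = 0
len (_ : xs) = 1 + len xs

-- Unifying a -> b with [c] -> Int gives a = [c] and b = Int,
-- so the application has type [[c]] -> [Int].
lengths :: [[c]] -> [Int]
lengths = map' len
```

If the signature of lengths is changed to something inconsistent with the unifier, for example [c] -> [Int], the type checker rejects the definition.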
Desugaring • Transform a Haskell program into its functional-core equivalent. • We will focus on translating lists, pattern matching, list comprehension, and nested functions to core constructs.
The Translation of Lists • Lists use three forms of syntactic sugar: the infix constructor :, the comma notation as in [1,2], and the range notation .. • The operator : constructs a node with three fields: a type tag Cons, an element, and a list. x : xs is transformed to (Cons x xs), and [1,2] is transformed to (Cons 1 (Cons 2 [])). • Ranges such as [1..] are usually translated to calls of library functions that express these lists in terms of : and [].
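A sketch of the desugared representation as an ordinary data type. The constructor name Nil for the empty list, and the helpers fromList, from, and takeL, are assumptions for illustration; the slides only name the Cons tag and do not say which library functions implement ranges.

```haskell
-- Core-level list: a tagged node with an element and a rest-of-list field.
data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

-- [1,2] after desugaring.
oneTwo :: List Int
oneTwo = Cons 1 (Cons 2 Nil)

-- Converts a sugared list; each (:) becomes Cons and [] becomes Nil.
fromList :: [a] -> List a
fromList = foldr Cons Nil

-- A library-style function standing in for the range [n..].
from :: Int -> List Int
from n = Cons n (from (n + 1))

-- Prefix of a core list, needed to inspect the infinite result of 'from'.
takeL :: Int -> List a -> List a
takeL n _ | n <= 0 = Nil
takeL _ Nil        = Nil
takeL n (Cons x xs) = Cons x (takeL (n - 1) xs)
```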
The Translation of Pattern Matching • A constant yields an equality test. • A variable imposes no constraint at all. • Constructor patterns require additional support to provide type information at run time: we must be able to verify that an argument matches the constructor specified in the pattern.
Constructor Patterns • The run-time system provides the _type_constr function, which returns the constructor tag of an arbitrary structured type element. • We must also be able to reference the fields of the constructor type. • The run-time system assists by providing the generic _type_field n function, which returns the nth field of any structured type. • The take function serves as an example:
Constructor Patterns • take 0 xs = [] take n [] = [] take n (x: xs) = x : take (n-1) xs
Optimization • Since the code has already been type-checked at compile time, any second argument in a call of take is guaranteed to be a list. The last equation therefore need not verify that the argument matches the constructor pattern, and the error guard can be omitted too.
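The translated form of take is not preserved in the slides. The following is a sketch of what the translation could look like, under these assumptions: lists are represented by a List type with tags Nil and Cons, and the untyped run-time functions _type_constr and _type_field n are modeled by the typed helpers typeConstr, field1, and field2. All of these names are hypothetical.

```haskell
data List a = Nil | Cons a (List a)
  deriving (Eq, Show)

data Tag = NilTag | ConsTag
  deriving Eq

-- Stand-in for _type_constr: returns the constructor tag.
typeConstr :: List a -> Tag
typeConstr Nil        = NilTag
typeConstr (Cons _ _) = ConsTag

-- Stand-ins for _type_field 1 and _type_field 2.
field1 :: List a -> a
field1 (Cons x _) = x
field1 Nil        = error "field1: no such field"

field2 :: List a -> List a
field2 (Cons _ xs) = xs
field2 Nil         = error "field2: no such field"

-- Direct translation: one test per pattern, plus an error guard.
takeCore :: Int -> List a -> List a
takeCore a b =
  if a == 0 then Nil
  else if typeConstr b == NilTag then Nil
  else if typeConstr b == ConsTag then
    let n  = a
        x  = field1 b
        xs = field2 b
    in  Cons x (takeCore (n - 1) xs)
  else error "fall through"

-- Optimized: after the Nil test, b can only be a Cons node, so the
-- last tag test and the error guard are dropped.
takeOpt :: Int -> List a -> List a
takeOpt a b =
  if a == 0 then Nil
  else if typeConstr b == NilTag then Nil
  else
    let n  = a
        x  = field1 b
        xs = field2 b
    in  Cons x (takeOpt (n - 1) xs)
```

The constant pattern 0 becomes an equality test, the variable patterns become plain let-bindings, and only the constructor patterns need the run-time tag.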
The Translation of List Comprehension • [expression | qualifier, ..., qualifier] • A qualifier is either a generator or a filter. • A generator is of the form var <- list expression; it introduces a variable iterating over a list. • A filter is a Boolean expression, which constrains the variables generated by earlier qualifiers.
The Translation of List Comprehension • The transformation works by processing the qualifiers from left to right, one at a time. This approach naturally leads to a recursive scheme with three transformation rules.
The Translation of List Comprehension • Transformation rule (1) covers the base case where no more qualifiers are present in the list comprehension. • The filter qualifier is handled in transformation rule (2), where F stands for the filter and Q stands for the remaining sequence of qualifiers. • The generator qualifier e <- L is covered in rule (3). The generator produces zero or more elements e drawn from a list L. We must generate code to iterate over all elements e, compute the remainder Q of the list comprehension for each value of e, and concatenate the (possibly empty) result lists into a single list. • The key idea for rule (3) is a nested function that takes an element e and produces the list of values that Q can assume for e. This function is then called for all the elements in L.
The Translation of List Comprehension • We need to call a function for every element in a list and concatenate the resulting lists into one. • The map function does not work: map f [] = [] map f (x : xs) = f x : map f xs. It simply collects the results of the function applications in a list, and would yield a list of lists in this case. • Modified map: mappend :: (a -> [b]) -> [a] -> [b] mappend f [] = [] mappend f (x:xs) = f x ++ mappend f xs
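A worked sketch of the three rules on a small comprehension. The slides' mappend is renamed mappendL here because current versions of the Prelude already export a different function called mappend; it behaves like the Prelude's concatMap restricted to lists. The names source, desugared, and nested are illustrative.

```haskell
-- The slides' "modified map": applies f to every element and
-- concatenates the resulting lists.
mappendL :: (a -> [b]) -> [a] -> [b]
mappendL _ []       = []
mappendL f (x : xs) = f x ++ mappendL f xs

-- The comprehension before translation.
source :: [Integer]
source = [n ^ 2 | n <- [1 .. 10], odd n]

-- Rule (3) turns the generator into a mappendL call over a nested
-- function f; inside f, rule (2) turns the filter into if-then-else,
-- and rule (1) turns the qualifier-free remainder into [n ^ 2].
desugared :: [Integer]
desugared = mappendL f [1 .. 10]
  where
    f n = if odd n then [n ^ 2] else []

-- With plain map, the per-element lists are not flattened.
nested :: [[Integer]]
nested = map (\n -> if odd n then [n ^ 2] else []) [1 .. 10]
```

Elements rejected by the filter contribute an empty list, which disappears on concatenation; this is why the generator rule needs concatenation and not just map.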
The Translation of List Comprehension • An example illustrates the scheme: • pyth n = [(a, b, c) | a <- [1 .. n], b <- [a .. n], c <- [b .. n], a^2 + b^2 == c^2] • Transformation of pyth, with one nested function per generator: • pyth n = mappend f_bc2 [1 .. n] where f_bc2 a = mappend f_c2 [a .. n] where f_c2 b = mappend f_2 [b .. n] where f_2 c = if (a^2 + b^2 == c^2) then [(a, b, c)] else []
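The pyth example, laid out so that the nesting of the three generator functions is visible and so that the desugared form can be compared against the original comprehension. As an assumption of this sketch, mappend is renamed mappendL to avoid the name clash with the Prelude, and the sugared version is kept under the hypothetical name pythSugar.

```haskell
mappendL :: (a -> [b]) -> [a] -> [b]
mappendL _ []       = []
mappendL f (x : xs) = f x ++ mappendL f xs

-- The original list comprehension.
pythSugar :: Int -> [(Int, Int, Int)]
pythSugar n =
  [ (a, b, c) | a <- [1 .. n], b <- [a .. n], c <- [b .. n]
              , a ^ 2 + b ^ 2 == c ^ 2 ]

-- The translation: each generator becomes a nested function that is
-- applied to every element of its list; the filter becomes the
-- innermost if-then-else.
pyth :: Int -> [(Int, Int, Int)]
pyth n = mappendL f_bc2 [1 .. n]
  where
    f_bc2 a = mappendL f_c2 [a .. n]
      where
        f_c2 b = mappendL f_2 [b .. n]
          where
            f_2 c = if a ^ 2 + b ^ 2 == c ^ 2 then [(a, b, c)] else []
```

Each nested function refers to variables bound by the functions enclosing it (f_2 uses a and b), which is exactly the situation the translation of nested functions has to deal with next.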
The Translation of Nested Functions • Since most target languages of functional compilers do not support nested routines, the functional core excludes nested functions. • Using lexical pointers to activation records in combination with higher-order functions and lazy evaluation causes dynamic scope violations, since a call to a nested function may escape its lexical scope at run time, rendering its lexical pointer invalid. • For example, a nested function can be returned as the result of a higher-order function; lazy evaluation can also delay the execution of a call to the nested function until after the caller has returned its value, which contains a reference to the suspended call.
The Translation of Nested Functions • Example: sv_mul defines the multiplication of a scalar and a vector.
Analysis of the Example • sv_mul calls map to apply the nested function s_mul to each element in the vector list. • At run time, the interpreted code for sv_mul returns a graph holding the unevaluated expression map s_mul vec. • If we return the routine value map s_mul vec, the activation record of sv_mul will be removed before the nested function s_mul is ever applied.
The Translation of Nested Functions • The functional core supports currying (partial parameterization). • Translating a nested routine f to a global routine is just a matter of extending it with additional parameters p1, p2, …, pa that capture the out-of-scope pointers; each use of the nested function f must be replaced with a curried call: f p1 … pa
The Translation of Nested Functions • Lift the nested s_mul into a global function sv_mul_dot_s_mul. • Extend the function heading with an additional parameter named scal, capturing the pointer to the scal parameter of the outer sv_mul function. • All calls of s_mul are replaced by the expression sv_mul_dot_s_mul scal.
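The example code itself is not preserved in the slides. The sketch below reconstructs it from the surrounding description (a nested s_mul that multiplies by scal and is mapped over vec) and then applies the lifting steps. The exact shape of the original definition, the Int element type, and the name sv_mul_lifted are assumptions.

```haskell
-- Before lifting: s_mul is nested and refers to scal from the
-- enclosing sv_mul (assumed reconstruction of the slide's example).
sv_mul :: Int -> [Int] -> [Int]
sv_mul scal vec = map s_mul vec
  where
    s_mul x = scal * x

-- After lifting: s_mul becomes a global function with scal as an
-- extra leading parameter.
sv_mul_dot_s_mul :: Int -> Int -> Int
sv_mul_dot_s_mul scal x = scal * x

-- Each use of s_mul is replaced by the curried call
-- sv_mul_dot_s_mul scal (hypothetical name for the rewritten sv_mul).
sv_mul_lifted :: Int -> [Int] -> [Int]
sv_mul_lifted scal vec = map (sv_mul_dot_s_mul scal) vec
```

The partial application sv_mul_dot_s_mul scal carries the value of scal with it, so it remains valid after sv_mul has returned and no pointer into sv_mul's activation record is needed.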
Conclusion • Short tour of Haskell • General concept of a compiler for functional programs • Type checking • Desugaring: the most important part • Questions? ☺