Languages of the future: Ωmega, the 701st programming language
Tim Sheard
Portland State University (formerly from OGI/OHSU)
What’s wrong with today’s languages?
• The semantic gap
  • What does the programmer know about the program? How is this expressed?
• The temporal gap
  • Systems are “configured” with new knowledge at many different times: compile-time, link-time, run-time. How is this expressed?
What will languages of the future be like?
• Support reasoning about a program from within the programming language.
• Within the reach of most programmers. No Ph.D. required.
• Support all of today’s capabilities but organize them in different ways.
• Separate powerful but risky features from the rest of the program, spell out the obligations needed to control the risk, and ensure that the obligations are met.
• Provide a flexible hierarchy of temporal stages. Track important attributes across stages.
How do we get there?
• In small steps, I’m afraid . . .
• Two small contributions
  • Putting the Curry-Howard isomorphism to work for regular programmers
  • Exploiting staged computation
Step 1 - Putting Curry-Howard to work
• Programming by manipulating proofs of important semantic properties
• What is a proof?
• How do we exploit proofs?
• Ωmega is a new point in the design space, somewhere between
  • a programming language
  • a logic
Isabelle, Coq, Elf, NuPRL, Alfa
We need something in between the two extremes!
Haskell, Python, O’Caml, Pascal, Java, C++, C
Dimensions
Formal methods systems
• Have too few users. We can’t solve the world’s problems with a handful of users. And, for the most part, the users are “thinkers”, not “hackers”.
• Are used to reason about systems, but aren’t designed to really execute programs. For the most part, they don’t have rich libraries, I/O, etc.
• Have a steep learning curve. “It takes a Ph.D. to learn to effectively use these tools.”
Between the “concrete” and the “clouds”
• Users: train more users to use formal systems, or add formal features to lower-level languages so existing programmers can use formal methods.
• Systems: design practical extensions for formal systems and build robust compilers for them, or add formal extensions to practical languages.
Isabelle, Coq, Elf, NuPRL, Alfa
Haskell, Python, O’Caml, Pascal, Java, C++, C
Curry-Howard: What is a proof?
Am I odd or even? 3
    3 is odd, if
    2 is even, if
    1 is odd, if
    0 is even
• Requirements for a legal proof
  • Even is always stacked above odd
  • Odd is always stacked above even
  • The numeral decreases by one in each stack
  • Every stack ends with 0
    3 is odd     3 - 1 = 2
    2 is even    2 - 1 = 1
    1 is odd     1 - 1 = 0
    0 is even
Generalized Algebraic Datatypes
• Inductively formed structured data
• Generalizes enumerations & tagged variants
• Types are used to prevent the construction of ill-formed data and to encode constraints
• Pattern matching allows abstract, high-level (yet still efficient) access
• Support the kind of proof construction we’re after
Integer Indexed Type-Constructors
    Z           :: Even 0
    O Z         :: Odd 1
    E (O Z)     :: Even 2
    O (E (O Z)) :: Odd 3

    Z :: Even 0
    E :: Odd m -> Even (m+1)
    O :: Even m -> Odd (m+1)

    O (E (O Z)) :: Odd (1+1+1+0)
Note: Even and Odd are type constructors indexed by integers.
GADTs generalize this restriction
    data Tree a = Fork (Tree a) (Tree a) | Node a | Tip

    Fork :: Tree a -> Tree a -> Tree a
    Node :: a -> Tree a
    Tip  :: Tree a
Note: the “data” declaration introduces values of a new type.
Restriction: the range of every constructor matches exactly the type being defined.
GADT in Ωmega
Zero and Succ encode the natural numbers at the type level.
    kind Nat = Zero | Succ Nat

    data Even n = Z where n = Zero
                | ex m . E (Odd m) where n = Succ m

    data Odd n = ex m . O (Even m) where n = Succ m
Even and Odd are proofs!
    Z :: Even Zero
    E :: Odd m -> Even (Succ m)
    O :: Even m -> Odd (Succ m)
Note the different ranges in Z, E and O.
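For readers without an Ωmega installation, the same proof objects can be approximated in GHC Haskell. This is a minimal sketch, assuming the DataKinds extension as a stand-in for Ωmega's "kind" declaration; the helper names threeIsOdd, evenToInt and oddToInt are hypothetical and do not come from the slides.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

-- Type-level naturals. DataKinds promotes Zero and Succ to the type level,
-- which plays the role of Ωmega's "kind Nat" (an assumption of this sketch).
data Nat = Zero | Succ Nat

-- A value of type Even n (Odd n) is a proof that n is even (odd).
data Even (n :: Nat) where
  Z :: Even 'Zero
  E :: Odd m -> Even ('Succ m)

data Odd (n :: Nat) where
  O :: Even m -> Odd ('Succ m)

-- A proof that 3 is odd: odd 3, because even 2, because odd 1, because even 0.
threeIsOdd :: Odd ('Succ ('Succ ('Succ 'Zero)))
threeIsOdd = O (E (O Z))

-- Recover the ordinary integer that a proof talks about (hypothetical helpers).
evenToInt :: Even n -> Int
evenToInt Z     = 0
evenToInt (E o) = 1 + oddToInt o

oddToInt :: Odd n -> Int
oddToInt (O e) = 1 + evenToInt e
```

Changing the annotation on threeIsOdd to claim a different number makes the definition fail to type check, which is the sense in which the value is a proof.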
The “kind” declaration introduces new types
• Allow algebraic definitions to define new “kinds” as well as new “types”
• Zero and Succ are new types.
    kind Nat = Zero | Succ Nat
  • Zero :: Nat
  • Succ :: Nat ~> Nat
  • Succ Zero :: Nat
A hierarchy of values, types, kinds, sorts, …
    *2
    sorts    *1
    kinds    *      *        * ~> *    Nat     Nat ~> Nat
    types    Int    [Int]    []        Zero    Succ
    values   5      [5]
Why remove the restriction?
• The parameter of a type constructor (e.g. the “a” in “T a”) says something about the values with type “T a”
  • phantom types
  • indexed types
• Consider an expression language:
    data Exp = Eint Int
             | Ebool Bool
             | Eplus Exp Exp
             | Eless Exp Exp
             | Eif Exp Exp Exp
• What about terms like:
    (Eif (Eint 3) (Eint 0) (Eint 9))
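To make the problem with the unindexed Exp concrete, here is a rough Haskell sketch of what an evaluator is forced to do. The Value type and the evalExp function are assumptions introduced for illustration; they do not appear on the slides.

```haskell
data Exp
  = Eint Int
  | Ebool Bool
  | Eplus Exp Exp
  | Eless Exp Exp
  | Eif Exp Exp Exp

-- Hypothetical universe of run-time values, needed because Exp has no type index.
data Value = VInt Int | VBool Bool
  deriving (Eq, Show)

-- Evaluation is partial: every case must inspect tags at run time.
evalExp :: Exp -> Maybe Value
evalExp (Eint n)  = Just (VInt n)
evalExp (Ebool b) = Just (VBool b)
evalExp (Eplus x y) = case (evalExp x, evalExp y) of
  (Just (VInt a), Just (VInt b)) -> Just (VInt (a + b))
  _                              -> Nothing
evalExp (Eless x y) = case (evalExp x, evalExp y) of
  (Just (VInt a), Just (VInt b)) -> Just (VBool (a < b))
  _                              -> Nothing
evalExp (Eif c t e) = case evalExp c of
  Just (VBool True)  -> evalExp t
  Just (VBool False) -> evalExp e
  _                  -> Nothing

-- Accepted by the compiler even though the condition is not a Bool.
illFormed :: Exp
illFormed = Eif (Eint 3) (Eint 0) (Eint 9)
```

The ill-formed term is only detected when evalExp runs and returns Nothing; nothing in the type Exp rules it out.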
Imagine a type-indexed Term datatype
    Int  :: Int -> Term Int
    Bool :: Bool -> Term Bool
    Plus :: Term Int -> Term Int -> Term Int
    Less :: Term Int -> Term Int -> Term Bool
    If   :: Term Bool -> Term a -> Term a -> Term a
Note the different range types!
Type-indexed Data
• Benefits
  • The type system disallows ill-formed Terms like:
      (If (Int 3) (Int 0) (Int 9))
  • Documentation
  • With the right types, such objects act like proofs
Type-indexed Terms
    data Term a = Int Int where a = Int
                | Bool Bool where a = Bool
                | Plus (Term Int) (Term Int) where a = Int
                | Less (Term Int) (Term Int) where a = Bool
                | If (Term Bool) (Term a) (Term a)

    Int :: forall a . (a = Int) => Int -> Term a
We can specialize this kind of type to the ones we want:
    Int  :: Int -> Term Int
    Bool :: Bool -> Term Bool
    Plus :: Term Int -> Term Int -> Term Int
    Less :: Term Int -> Term Int -> Term Bool
    If   :: Term Bool -> Term a -> Term a -> Term a
Why is (Term a) like a proof?
• A value “x” of type “Term a” is like a judgment
    ⊢ x : a
• The type system ensures that only valid judgments can be constructed.
• Having a value of type “Term a” guarantees (i.e. is a proof) that the term is well typed.
Programming
    eval :: Term a -> a
    eval (Int n)    = n
    eval (Bool b)   = b
    eval (Plus x y) = eval x + eval y
    eval (Less x y) = eval x < eval y
    eval (If x y z) = if (eval x) then (eval y) else (eval z)
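The evaluator above carries over almost verbatim to GHC Haskell. The following is a sketch, assuming the GADTs extension in place of Ωmega's equality-constrained "where" clauses; the value named example is hypothetical.

```haskell
{-# LANGUAGE GADTs #-}

-- Each constructor states its own range type, as on the Term slides.
data Term a where
  Int  :: Int  -> Term Int
  Bool :: Bool -> Term Bool
  Plus :: Term Int  -> Term Int -> Term Int
  Less :: Term Int  -> Term Int -> Term Bool
  If   :: Term Bool -> Term a   -> Term a -> Term a

-- No run-time tags and no failure case: each match refines "a".
eval :: Term a -> a
eval (Int n)    = n
eval (Bool b)   = b
eval (Plus x y) = eval x + eval y
eval (Less x y) = eval x < eval y
eval (If x y z) = if eval x then eval y else eval z

-- if 3 < 5 then 1 + 2 else 9
example :: Term Int
example = If (Less (Int 3) (Int 5)) (Plus (Int 1) (Int 2)) (Int 9)
```

A term such as If (Int 3) (Int 0) (Int 9) is rejected at compile time, because Int 3 has type Term Int where Term Bool is required.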
Problem: Type Checking
How do we type pattern matching?
    case x of
      (Int n)  -> . . .
      (Bool b) -> . . .
What type is x?
Type Checking
    eval :: Term a -> a
    eval (Less x y) = eval x < eval y

    Less :: (a = Bool) => Term Int -> Term Int -> Term a
    x :: Term Int
    y :: Term Int
    (eval x) :: Int
    (eval y) :: Int
    (eval x < eval y) :: Bool
Assume a = Bool in this context.
Basic approach
• Data is a parameterized generalized-algebraic datatype
• It is indexed by some semantic property
• New kinds introduce new types that are used as indexes
• Programs use types to maintain semantic properties
• We construct values that are proofs of these properties
• The equality-constrained types make it possible
Constructing proofs
• Suppose we want to read a string from the user, and interpret that string as an expression.
• What if the user types in an expression of the wrong type?
• Build a proof that the term is well typed for the context in which we use it
    data Exp = Eint Int
             | Ebool Bool
             | Eplus Exp Exp
             | Eless Exp Exp
             | Eif Exp Exp Exp

    test :: IO ()
    test = do { text <- readln
              ; exp::Exp <- parse text
              ; case typCheck exp of
                  Pair Rint x  -> print (show (eval x + 2))
                  Pair Rbool y -> if (eval y) then print "True" else print "False"
                  Fail         -> error "Ill typed term"
              }
Representation Types
    data Rep t = Rint where t = Int
               | Rbool where t = Bool
• “Rep” is a representation type. It is a normal first-class value (at run-time) that represents a static (compile-time) type.
• There is a 1-1 correspondence between Rint and Int, and Rbool and Bool
• If x :: Rep t then knowing the shape of x determines its type, and knowing its type determines its shape.
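The "shape determines type" property can be sketched in GHC Haskell as follows. This assumes the GADTs extension and the standard Data.Type.Equality module; the functions sameRep and showAs are hypothetical names chosen for this illustration.

```haskell
{-# LANGUAGE GADTs, TypeOperators #-}
import Data.Type.Equality ((:~:)(Refl))

-- Run-time values that stand for compile-time types.
data Rep t where
  Rint  :: Rep Int
  Rbool :: Rep Bool

-- Compare two representations; on success, return a proof that the
-- represented types are equal.
sameRep :: Rep a -> Rep b -> Maybe (a :~: b)
sameRep Rint  Rint  = Just Refl
sameRep Rbool Rbool = Just Refl
sameRep _     _     = Nothing

-- Shape determines type: matching on the Rep tells the checker what t is,
-- so the right Show instance can be used in each branch.
showAs :: Rep t -> t -> String
showAs Rint  n = "Int " ++ show n
showAs Rbool b = "Bool " ++ show b
```

Matching on Just Refl brings the equality a ~ b into scope, which is what lets a checker unify the types of two separately checked subterms.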
Untyped Terms and Judgments
    data Exp = Eint Int
             | Ebool Bool
             | Eplus Exp Exp
             | Eless Exp Exp
             | Eif Exp Exp Exp

    data Judgment = Fail
                  | exists t . Pair (Rep t) (Term t)
Constructing a Proof
    typCheck :: Exp -> Judgment
    typCheck (Eint n)  = Pair Rint (Int n)
    typCheck (Ebool b) = Pair Rbool (Bool b)
    typCheck (Eplus x y) =
      case (typCheck x, typCheck y) of
        (Pair Rint a, Pair Rint b) -> Pair Rint (Plus a b)
        _ -> Fail
    typCheck (Eless x y) =
      case (typCheck x, typCheck y) of
        (Pair Rint a, Pair Rint b) -> Pair Rbool (Less a b)
        _ -> Fail
    typCheck (Eif x y z) =
      case (typCheck x, typCheck y, typCheck z) of
        (Pair Rbool a, Pair Rint b, Pair Rint c)   -> Pair Rint (If a b c)
        (Pair Rbool a, Pair Rbool b, Pair Rbool c) -> Pair Rbool (If a b c)
        _ -> Fail
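The whole pipeline from untyped Exp to a checked, evaluated Term can be approximated in GHC Haskell. This is a sketch under the assumption that GADT syntax with an existentially quantified constructor plays the role of Ωmega's "exists t"; the function checkAndRun is a hypothetical driver that replaces the I/O code of the test slide.

```haskell
{-# LANGUAGE GADTs #-}

data Exp = Eint Int | Ebool Bool | Eplus Exp Exp | Eless Exp Exp | Eif Exp Exp Exp

data Term a where
  Int  :: Int  -> Term Int
  Bool :: Bool -> Term Bool
  Plus :: Term Int  -> Term Int -> Term Int
  Less :: Term Int  -> Term Int -> Term Bool
  If   :: Term Bool -> Term a   -> Term a -> Term a

data Rep t where
  Rint  :: Rep Int
  Rbool :: Rep Bool

-- Pair hides t; the Rep lets us recover it by pattern matching.
data Judgment where
  Fail :: Judgment
  Pair :: Rep t -> Term t -> Judgment

typCheck :: Exp -> Judgment
typCheck (Eint n)  = Pair Rint (Int n)
typCheck (Ebool b) = Pair Rbool (Bool b)
typCheck (Eplus x y) = case (typCheck x, typCheck y) of
  (Pair Rint a, Pair Rint b) -> Pair Rint (Plus a b)
  _                          -> Fail
typCheck (Eless x y) = case (typCheck x, typCheck y) of
  (Pair Rint a, Pair Rint b) -> Pair Rbool (Less a b)
  _                          -> Fail
typCheck (Eif x y z) = case (typCheck x, typCheck y, typCheck z) of
  (Pair Rbool a, Pair Rint b,  Pair Rint c)  -> Pair Rint (If a b c)
  (Pair Rbool a, Pair Rbool b, Pair Rbool c) -> Pair Rbool (If a b c)
  _                                          -> Fail

eval :: Term a -> a
eval (Int n)    = n
eval (Bool b)   = b
eval (Plus x y) = eval x + eval y
eval (Less x y) = eval x < eval y
eval (If x y z) = if eval x then eval y else eval z

-- Hypothetical driver: check first, then evaluate only well-typed terms.
checkAndRun :: Exp -> Maybe String
checkAndRun e = case typCheck e of
  Pair Rint t  -> Just (show (eval t))
  Pair Rbool t -> Just (show (eval t))
  Fail         -> Nothing
```

Once typCheck has returned a Pair, eval needs no further checks: the Term inside is well typed by construction.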
Step 2 - Using Staging
• Suppose you are writing a document retrieval system.
• The user types in a query, and you want to retrieve all documents that meet the query.
• The query contains information not known until run-time, but which is constant across all accesses in the document base.
• E.g.
    Width - Indent < Depth && Keyword == "Naval"
    Width - Indent < Depth && Keyword == "Naval"
• If Width and Indent are constant across all queries, but Depth and Keyword are fields of each document:
• How can we efficiently build an execution engine that translates the user’s query (typed as a String) into executable code?
Code in Omega
    prompt> [| 5 + 5 |]
    [| 5 + 5 |] : Code Int
    prompt> run [| 5 + 5 |]
    10 : Int
    prompt> let x = [| 23 |]
    x
    prompt> let y = [| 56 - $x |]
    y
    prompt> y
    [| 56 - 23 |] : Code Int
Dynamic values
    data Dyn x = Dint Int where x = Int
               | Dbool Bool where x = Bool
               | Dyn (Code x)

    dynamize :: Dyn a -> Code a
    dynamize (Dint n)  = lift n
    dynamize (Dbool b) = lift b
    dynamize (Dyn x)   = x
Translation
    trans :: Term a -> (Dyn Int, Dyn Int) -> Dyn a
    trans (Int n)  (x,y) = Dint n
    trans (Bool b) (x,y) = Dbool b
    trans X (x,y) = x
    trans Y (x,y) = y
    trans (Plus a b) xy =
      case (trans a xy, trans b xy) of
        (Dint m, Dint n) -> Dint (m+n)
        (m, n) -> Dyn [| $(dynamize m) + $(dynamize n) |]
    trans (If a b c) xy =
      case trans a xy of
        (Dbool test) -> if test then trans b xy else trans c xy
        (Dyn test)   -> Dyn [| if $test
                                 then $(dynamize (trans b xy))
                                 else $(dynamize (trans c xy)) |]
Applying the translation
    -- if 3 < 5 then (x + (5 + 2)) else y
    x1 = If (Less (Int 3) (Int 5))
            (Plus X (Plus (Int 5) (Int 2)))
            Y

    w term = [| \ x y -> $(dynamize (trans term (Dyn [| x |], Dyn [| y |]))) |]

    -- w x1
    -- [| \ x y -> x + 7 |] : Code (Int -> Int -> Int)
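Plain Haskell has no direct equivalent of Ωmega's code brackets, so the sketch below models Code as a small residual-program syntax tree with a pretty printer. That model, the constructor names LitI, LitB, Var, Add, Lt and Cond, the helper residual, and the Less case of trans are all assumptions made for this illustration; only the overall shape of Dyn, dynamize and trans follows the slides.

```haskell
{-# LANGUAGE GADTs #-}

-- Stand-in for Ωmega's Code type: a tiny typed AST of residual programs.
data Code a where
  LitI :: Int    -> Code Int
  LitB :: Bool   -> Code Bool
  Var  :: String -> Code Int
  Add  :: Code Int  -> Code Int -> Code Int
  Lt   :: Code Int  -> Code Int -> Code Bool
  Cond :: Code Bool -> Code a   -> Code a -> Code a

pretty :: Code a -> String
pretty (LitI n)     = show n
pretty (LitB b)     = show b
pretty (Var v)      = v
pretty (Add a b)    = "(" ++ pretty a ++ " + " ++ pretty b ++ ")"
pretty (Lt a b)     = "(" ++ pretty a ++ " < " ++ pretty b ++ ")"
pretty (Cond c t e) = "if " ++ pretty c ++ " then " ++ pretty t ++ " else " ++ pretty e

-- Object terms, extended with the two query variables X and Y.
data Term a where
  Int  :: Int  -> Term Int
  Bool :: Bool -> Term Bool
  X    :: Term Int
  Y    :: Term Int
  Plus :: Term Int  -> Term Int -> Term Int
  Less :: Term Int  -> Term Int -> Term Bool
  If   :: Term Bool -> Term a   -> Term a -> Term a

-- A value is either known now (Dint, Dbool) or only as residual code (Dyn).
data Dyn a where
  Dint  :: Int    -> Dyn Int
  Dbool :: Bool   -> Dyn Bool
  Dyn   :: Code a -> Dyn a

dynamize :: Dyn a -> Code a
dynamize (Dint n)  = LitI n
dynamize (Dbool b) = LitB b
dynamize (Dyn c)   = c

trans :: Term a -> (Dyn Int, Dyn Int) -> Dyn a
trans (Int n)  _      = Dint n
trans (Bool b) _      = Dbool b
trans X        (x, _) = x
trans Y        (_, y) = y
trans (Plus a b) xy = case (trans a xy, trans b xy) of
  (Dint m, Dint n) -> Dint (m + n)                       -- both known: compute now
  (m, n)           -> Dyn (Add (dynamize m) (dynamize n)) -- otherwise: emit code
trans (Less a b) xy = case (trans a xy, trans b xy) of   -- assumed case, mirrors Plus
  (Dint m, Dint n) -> Dbool (m < n)
  (m, n)           -> Dyn (Lt (dynamize m) (dynamize n))
trans (If c t e) xy = case trans c xy of
  Dbool test -> if test then trans t xy else trans e xy
  Dyn test   -> Dyn (Cond test (dynamize (trans t xy)) (dynamize (trans e xy)))

x1 :: Term Int
x1 = If (Less (Int 3) (Int 5)) (Plus X (Plus (Int 5) (Int 2))) Y

-- Render the residual function, with x and y left as dynamic variables.
residual :: Term a -> String
residual t = "\\ x y -> " ++ pretty (dynamize (trans t (Dyn (Var "x"), Dyn (Var "y"))))
```

Evaluating residual x1 yields the text \ x y -> (x + 7): the comparison 3 < 5 and the sum 5 + 2 are computed during translation, and only the addition involving x survives into the generated code.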
Our Original Goals
• Build heterogeneous meta-programming systems
  • Meta-language ≠ object-language
  • Type system of the meta-language guarantees semantic properties of object-language
• Experiment with Omega
  • Finding new uses for the power of the type system
  • Translating existing language-based ideas into Omega
    • staged interpreters
    • proof carrying code
    • language-based security
Serendipity
• Ωmega’s type system is good for statically guaranteeing all sorts of properties.
  • Lists with statically known length
  • Red-Black Trees
  • Binomial Heaps
  • Dynamic Typing
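As one illustration of the first item, a list with statically known length can be sketched in GHC Haskell. This assumes DataKinds in place of Ωmega's "kind Nat"; the names Vec, vhead, vzip and toList are hypothetical and not taken from the slides.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Zero | Succ Nat

-- The first index records the length of the list.
data Vec (n :: Nat) a where
  Nil  :: Vec 'Zero a
  Cons :: a -> Vec n a -> Vec ('Succ n) a

-- Total: the type rules out the empty list, so no Nil case is needed.
vhead :: Vec ('Succ n) a -> a
vhead (Cons x _) = x

-- Both arguments must have the same length, checked at compile time.
vzip :: Vec n a -> Vec n b -> Vec n (a, b)
vzip Nil         Nil         = Nil
vzip (Cons x xs) (Cons y ys) = Cons (x, y) (vzip xs ys)

-- Forget the length index.
toList :: Vec n a -> [a]
toList Nil         = []
toList (Cons x xs) = x : toList xs
```

Calling vhead on Nil, or vzip on vectors of different lengths, is a type error rather than a run-time failure.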
Conclusion
• Stating static properties is a good way to think about programming
• It may lead to more reliable programs
• The compiler should ensure that programs maintain the stated properties
• Generalizing algebraic datatypes makes it all possible
  • Ranges other than “T a”
  • “a” becomes an index describing a static property of x :: T a
  • New kinds let “a” have arbitrary structure
  • Computing over “a” is sometimes necessary
Related Work
• Inductive Families
  • In type theory -- Peter Dybjer
  • Epigram -- Zhaohui Luo, James McKinna, Paul Callaghan, and Conor McBride
• First-class phantom types -- Cheney and Hinze
• Guarded Recursive Data Types
  • Hongwei Xi and his students
    • Guarded Recursive Datatype Constructors
    • A Typeful Approach to Object-Oriented Programming with Multiple Inheritance
    • Meta-Programming through Typeful Code Representation
  • Constraint-based type inference for guarded algebraic data types -- Vincent Simonet and François Pottier
  • A Systematic Translation of Guarded Recursive Data Types to Existential Types -- Martin Sulzmann
  • Polymorphic typed defunctionalization -- Pottier and Gauthier
  • Towards efficient, typed LR parsers -- Pottier and Régis-Gianas
• First Class Type Equality
  • A Lightweight Implementation of Generics and Dynamics -- Hinze and Cheney
  • Typing Dynamic Typing -- Baars and Swierstra
  • Type-safe cast: Functional pearl -- Weirich
• Rogue-Sigma-Pi as a meta-language for LF -- Aaron Stump
• Wobbly types: type inference for generalised algebraic data types -- Peyton Jones, Washburn and Weirich
• Cayenne - A Language with Dependent Types -- Lennart Augustsson
Examples we have done
• Typed, staged interpreters
  • For languages with binding, with patterns, algebraic datatypes
• Type preserving transformations
  • Simplify :: Exp t -> Exp t
  • Cps :: Exp t -> Exp {trans t}
• Proof carrying code
• Data Structures
  • Red-Black trees, Binomial Heaps, Static length lists
• Languages with security properties
• Typed self-describing databases, where meta data in the database describes the database schema
• Programs that slip easily between dynamic and statically typed sections. Type-case is easy to encode with no additional mechanism
Some other examples
• Typed Lambda Calculus
• A Language with Security Domains
• A Language which enforces an interaction protocol
Typed lambda Calculus
Exp with type t in environment s
    data V s t = ex m . Z where s = (t,m)
               | ex m x . S (V m t) where s = (x,m)

    data Exp s t = IntC Int where t = Int
                 | BoolC Bool where t = Bool
                 | Plus (Exp s Int) (Exp s Int) where t = Int
                 | Lteq (Exp s Int) (Exp s Int) where t = Bool
                 | Var (V s t)
Example type:
    Plus :: forall s t . (t = Int) => Exp s Int -> Exp s Int -> Exp s t
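The environment-indexed terms can be given a tag-free evaluator. The following GHC Haskell sketch assumes environments are represented as nested pairs ending in (), matching the index s = (t,m) above; the functions lookupV and eval are hypothetical additions.

```haskell
{-# LANGUAGE GADTs #-}

-- A variable is a typed index into an environment shaped like nested pairs.
data V s t where
  Z :: V (t, m) t
  S :: V m t -> V (x, m) t

data Exp s t where
  IntC  :: Int  -> Exp s Int
  BoolC :: Bool -> Exp s Bool
  Plus  :: Exp s Int -> Exp s Int -> Exp s Int
  Lteq  :: Exp s Int -> Exp s Int -> Exp s Bool
  Var   :: V s t -> Exp s t

-- Matching on the variable refines s to a pair, so the lookup cannot fail.
lookupV :: V s t -> s -> t
lookupV Z     (v, _)    = v
lookupV (S n) (_, rest) = lookupV n rest

eval :: Exp s t -> s -> t
eval (IntC n)   _   = n
eval (BoolC b)  _   = b
eval (Plus a b) env = eval a env + eval b env
eval (Lteq a b) env = eval a env <= eval b env
eval (Var v)    env = lookupV v env
```

For example, eval (Plus (Var Z) (Var (S Z))) (3, (4, ())) adds the first two entries of the environment; an out-of-range variable would not type check against that environment.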
Language with Security Domains
Exp with type t in env s in domain d
    kind Domain = High | Low

    data D t = Lo where t = Low
             | Hi where t = High

    data Dless x y = LH where x = Low, y = High
                   | LL where x = Low, y = Low
                   | HH where x = High, y = High

    data Exp s d t = Int Int where t = Int
                   | Bool Bool where t = Bool
                   | Plus (Exp s d Int) (Exp s d Int) where t = Int
                   | Lteq (Exp s d Int) (Exp s d Int) where t = Bool
                   | forall d2 . Var (V s d2 t) (Dless d2 d)
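The Dless witnesses are ordinary values, so programs can compute with them. As a small sketch in GHC Haskell (assuming DataKinds for the Domain kind), the hypothetical functions dtrans and describe below show that the ordering is transitive and how a witness can be inspected; neither function appears on the slides.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Domain = High | Low

-- A value of type Dless x y is evidence that domain x is at or below domain y.
data Dless (x :: Domain) (y :: Domain) where
  LH :: Dless 'Low  'High
  LL :: Dless 'Low  'Low
  HH :: Dless 'High 'High

-- Transitivity: evidence for a <= b and b <= c gives evidence for a <= c.
-- The omitted combinations are ill typed because the middle index disagrees.
dtrans :: Dless a b -> Dless b c -> Dless a c
dtrans LH HH = LH
dtrans LL LH = LH
dtrans LL LL = LL
dtrans HH HH = HH

describe :: Dless x y -> String
describe LH = "Low <= High"
describe LL = "Low <= Low"
describe HH = "High <= High"
```

There is no constructor of type Dless 'High 'Low, so a Var that reads a High variable in a Low context cannot be built.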
Language with interaction protocol
Command with store st starting in state x, ending in state y
State transitions: Closed --open--> Open, Open --write--> Open, Open --close--> Closed
    kind State = Open | Closed

    data V s t = forall st . Z where s = (t,st)
               | forall st t1 . S (V st t) where s = (t1,st)

    data Com st x y = forall t . Set (V st t) (Exp st t) where x = y
                    | forall a . Seq (Com st x a) (Com st a y)
                    | If (Exp st Bool) (Com st x y) (Com st x y)
                    | While (Exp st Bool) (Com st x y) where x = y
                    | forall t . Declare (Exp st t) (Com (t,st) x y)
                    | Open where x = Closed, y = Open
                    | Close where x = Open, y = Closed
                    | Write (Exp st Int) where x = Open, y = Open
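A cut-down version of the protocol-indexed commands can be sketched in GHC Haskell. This sketch drops the store index and expressions entirely and keeps only the state tracking; the command constructors are renamed DoOpen and DoClose (an assumption, to avoid clashing with the promoted state names), and the function actions is hypothetical.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data State = Open | Closed

-- Com x y: a command that starts in state x and ends in state y.
data Com (x :: State) (y :: State) where
  DoOpen  :: Com 'Closed 'Open
  DoClose :: Com 'Open 'Closed
  Write   :: Int -> Com 'Open 'Open
  Seq     :: Com x a -> Com a y -> Com x y

-- Interpret a command as the list of actions it performs, in order.
actions :: Com x y -> [String]
actions DoOpen    = ["open"]
actions DoClose   = ["close"]
actions (Write n) = ["write " ++ show n]
actions (Seq c d) = actions c ++ actions d

-- A complete session: starts closed, ends closed.
session :: Com 'Closed 'Closed
session = DoOpen `Seq` (Write 1 `Seq` (Write 2 `Seq` DoClose))

-- Rejected by the type checker: writing before opening.
-- bad = Write 1 `Seq` DoOpen
```

Seq only composes commands whose end and start states agree, so every well-typed command follows the open, write, close protocol.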
Contributions
• Manipulating strongly-typed object languages in a semantics-preserving manner
• Implementation of Cheney and Hinze’s ideas in a functional programming language
• Demonstration
  • Show some practical techniques
  • Logical frameworks ideas translated into everyday programming idioms