Type Systems for Modularity Robert Harper Fall Semester, 2002
This Course • Graduate reading seminar. • Pre-requisites: PL core or permission. • Experience with SML or O’Caml. • Web page: http://www.cs.cmu.edu/~rwh/courses/module systems • Course format: • Students present papers. • I will go first.
This Course • Expectations: • The class works iff everyone contributes equally and actively. • It is imperative that you come to class prepared by having read the relevant papers for that day. • Offenders will fail the course. • If you don’t have time, don’t take the course.
Goals • Goal: develop the type theory of module systems. • Surprisingly intricate! • Confluence of many important issues. • Methodology: typed λ-calculus. • Declarative type system. • Scalable. • Supports type-based implementations.
Name Space Management • Divide program namespace into segments. • Avoid cluttering global name space. • Facilitate team development and code re-use. • Generalization of lexical scoping. • Name resolution relative to a module.
Namespace Management • Supported by most languages: • Packages in Java, Lisp. • Modules in Modula-2, -3. • Structures in ML.
Namespace Management
structure Url = struct
  val push = …
  val pull = …
end
structure Stack = struct
  val push = …
end
Namespace Management
structure S = struct
  type elt = …
  val enq = …
  val deq = …
  val null = …
end
Types and Values • Structures in ML contain type definitions and value definitions. • Functions and exceptions are particular kinds of values. • Datatypes are modules. • Constituents are accessed by paths. • S.x: the value component x of structure S. • S.t: the type component t of structure S. • Types now involve modules!
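A minimal sketch of how qualified paths keep same-named components apart. The structure names follow the slides; the type components and the bodies of the two `push` functions are invented here purely for illustration.

```sml
(* Two structures may both export a component called push without clashing. *)
structure Url =
struct
  type t = string
  (* hypothetical body: append a path segment to a URL *)
  fun push (seg : string, url : t) : t = url ^ "/" ^ seg
end

structure Stack =
struct
  type t = int list
  (* hypothetical body: push an element onto a list-backed stack *)
  fun push (x : int, s : t) : t = x :: s
end

(* Paths select a type or value component relative to a structure. *)
val u : Url.t = Url.push ("docs", "http://example.org")
val s : Stack.t = Stack.push (1, [])
```

Both the type annotations (`Url.t`, `Stack.t`) and the function calls go through paths, which is why types now mention modules.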
Hierarchical Structure • Divide large programs into relatively independent “chunks”. • Structure programs as trees or DAGs of components. • Isolate components within other components to enforce locality.
Hierarchical Structure
structure Thread = struct
  structure TQueue = struct
    val deq = …
  end
  val yield = …
  val spawn = …
end
Hierarchical Structure • Paths are extended to navigate through sub-structures: • S.T.U.x the value x in U of T of S • S.T.t the type t in T of S • Paths are also called long names.
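As a rough illustration of long names, the sketch below nests a queue structure inside a thread structure. The representation of threads as `unit -> unit` functions and of queues as lists is an assumption made only so the example compiles; the slides leave these bodies elided.

```sml
structure Thread =
struct
  (* assumed representation: a thread is a suspended computation *)
  type thread = unit -> unit

  structure TQueue =
  struct
    (* assumed representation: a queue is a list of threads *)
    type queue = thread list
    val empty : queue = []
    fun enq (t : thread, q : queue) : queue = q @ [t]
    fun deq ([] : queue) = NONE
      | deq (t :: q) = SOME (t, q)
  end
end

(* Long names navigate through the sub-structure. *)
val q : Thread.TQueue.queue =
  Thread.TQueue.enq (fn () => (), Thread.TQueue.empty)
val n = length q
```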
Interface Ascription • Ascription = associating an interface with a module. • Descriptive: characterize visible properties of a module. • Just a “sanity check”. • Prescriptive: characterize and limit visible properties of a module. • Imposes restrictive “view”.
Interface Matching • When does one interface match another? • When does I satisfy J’s requirements? • Structural: interfaces match based on strength of requirements. • To match means to fulfill requirements. • Flexible, supports re-use. • But can incur “unintended” matches.
Interface Matching • Nominal: interfaces match only if explicitly declared to do so. • Inflexible, requires revision. • But avoids accidental matches. • Structural seems more practical. • Cannot anticipate the future.
Java Interfaces • Java interfaces are nominal, descriptive. • Must explicitly declare the interfaces a class can have. • Interfaces describe behavior, not code. • Java interfaces are not types. • Really “fully abstract” classes with special hacks for “multiple inheritance”. • Fundamentally a kludge.
ML Signatures • ML signatures are structural, prescriptive. • Signatures require that a module have certain components with specified properties. • Types of values. • Type equality relationships. • Signature ascription limits the client’s view of a module to what is specified. • Transparent: propagate type definitions. • Opaque: no implicit propagation.
ML Signatures
signature SIG = sig
  type t
  val trans : t -> t
  val out : t -> int
  val inj : int -> t
end
structure S : SIG = struct
  type t = int*int
  fun trans(x,y) = (y+1,x)
  fun out(x,y) = y
  fun inj(x) = (x,0)
  fun f(x,y) = (x+1,y+1)
end
ML Signatures
signature SIG = sig
  type t
  val trans : t -> t
  val out : t -> int
  val inj : int -> t
end
structure S :> SIG = struct
  type t = int*int
  fun trans(x,y) = (y+1,x)
  fun out(x,y) = y
  fun inj(x) = (x,0)
  fun f(x,y) = (x+1,y+1)
end
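The difference between the two ascriptions can be seen by ascribing the same structure both ways. This is a sketch, not the course's own code: the names `Impl`, `T` and `O` are chosen here, and the injection is called `inj` because `in` is a reserved word in SML.

```sml
signature SIG =
sig
  type t
  val trans : t -> t
  val out : t -> int
  val inj : int -> t
end

structure Impl =
struct
  type t = int * int
  fun trans (x, y) = (y + 1, x)
  fun out (x, y) = y
  fun inj x = (x, 0)
  fun f (x, y) = (x + 1, y + 1)   (* not specified in SIG *)
end

structure T : SIG = Impl    (* transparent: T.t = int * int stays visible *)
structure O :> SIG = Impl   (* opaque: O.t is abstract *)

val a = T.out (3, 4)                (* accepted because T.t = int * int *)
val b = O.out (O.trans (O.inj 5))   (* values of O.t must be built with O.inj *)

(* val c = O.out (3, 4)   would be rejected: int * int is not O.t *)
(* val d = T.f (1, 2)     would be rejected: f is hidden by either ascription *)
```

Both ascriptions are prescriptive in that `f` disappears from the client's view; only the opaque one also hides the identity of `t`.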
Principal Signatures • The principal signature of a module completely characterizes the compile-time significance of that module. • Every other signature is a weakening of the principal signature. • Type checkers work by computing principal signatures. • It is never obvious whether a module system has principal signatures (often they do not).
Signature Matching • ML defines a sub-signature pre-ordering S ≤ S’ stating that S is stronger than S’. • At least as many components with the required types. • At least as much sharing. • Ascription: check that the principal signature matches the target signature. • Principal = least with respect to this ordering.
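A small sketch of the ordering in practice, using hypothetical signatures `STRONG` and `WEAK` and a hypothetical functor `NeedsWeak`. `STRONG` has an extra component and reveals more type sharing, so a structure that matches it can be used wherever `WEAK` is required.

```sml
signature WEAK =
sig
  type t
  val out : t -> int
end

signature STRONG =
sig
  type t = int            (* more sharing: t is known to be int *)
  val out : t -> int
  val inc : t -> t        (* more components *)
end

structure S : STRONG =
struct
  type t = int
  fun out x = x
  fun inc x = x + 1
end

(* Matching is structural: nothing declares that STRONG refines WEAK. *)
structure W : WEAK = S

functor NeedsWeak (X : WEAK) =
struct
  val show = Int.toString o X.out
end

structure R = NeedsWeak (S)
```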
Parameterization • Instantiate generic modules. • Abstract common patterns. • Support reusable libraries. • Separate compilation is a form of parameterization. • Client parameterized on provider.
Parameterization • Few languages, apart from ML, support parameterized modules. • Can be hacked around using class loaders, but not easily or as a linguistic mechanism. • Functors are “first-order” parameterized modules. • Take and yield structures. • Some kind of “function”, but what kind?
ML Functors
signature ORDSET = sig
  type t
  val leq : t * t -> bool
end
functor StringDict(structure Key : ORDSET) = struct
  type dict = … Key.t …
  fun insert (x, k, d) = …
end
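To make the functor concrete, here is one possible completion, offered only as a sketch. The sorted association-list representation, the polymorphic `'a dict`, the `empty` and `lookup` components, and the names `Dict` and `StringOrd` are all assumptions; the slide elides the body.

```sml
signature ORDSET =
sig
  type t
  val leq : t * t -> bool
end

(* Assumed body: an association list kept sorted by key. *)
functor Dict (structure Key : ORDSET) =
struct
  type 'a dict = (Key.t * 'a) list
  val empty : 'a dict = []

  fun insert (x, k, []) = [(k, x)]
    | insert (x, k, (k', x') :: d) =
        if Key.leq (k, k') then
          if Key.leq (k', k) then (k, x) :: d      (* equal keys: replace *)
          else (k, x) :: (k', x') :: d
        else (k', x') :: insert (x, k, d)

  fun lookup (k, []) = NONE
    | lookup (k, (k', x) :: d) =
        if Key.leq (k, k') andalso Key.leq (k', k) then SOME x
        else lookup (k, d)
end

structure StringOrd : ORDSET =
struct
  type t = string
  fun leq (a : string, b : string) = a <= b
end

(* Instantiate the generic module at string keys. *)
structure StringDict = Dict (structure Key = StringOrd)

val d = StringDict.insert (2, "b", StringDict.insert (1, "a", StringDict.empty))
val found = StringDict.lookup ("b", d)
```

The functor takes a structure and yields a structure, and the result's types mention `Key.t`, which is the dependency the following slides discuss.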
Dependency • Hierarchy and parameterization introduce dependent signatures. • Signature of a structure may refer to types in a substructure. • Result signature of a functor may refer to types in the argument. • Key: Signatures depend on structures.
Dependency
signature THREAD = sig
  type thread
  structure TQ : sig
    type queue
    val deq : queue -> thread
  end
  val block : thread * TQ.queue -> unit
end
Dependency • Result signature of a functor typically depends on the argument. • E.g., the key type of a dictionary is the carrier of the ordered set. • Need dependent functor signatures to be fully expressive. • Characteristic of ML modules.
Dependency
functor Dict(structure Key : ORDSET
             structure Elt : SET) :>
sig
  type dict
  val insert : Key.t * Elt.t * dict -> dict
  …
end
Dependency • How to define dependent signatures in isolation? • Contains a “free reference” to the type component of some structure. • Two main solutions: • Parameterized signatures [Haskell]. • Sharing specifications/type definitions [ML]. • Also transparent ascription.
Dependency
signature QUEUE = sig
  type elt
  type queue
  val insert : elt * queue -> unit
end
signature THREAD = sig
  type thread
  structure TQ : QUEUE where type elt = thread
  …
end
Dependency
signature DICT = sig
  type elt
  type key
  type dict
  val insert : key * elt * dict -> dict
  …
end
Dependency
functor Dict(structure Key : ORDSET
             structure Elt : SET) :>
  DICT where type elt = Elt.t
         and type key = Key.t =
struct … end
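The following sketch fills in the elided parts so that the dependent result signature can be exercised. The one-line `SET` signature, the list representation, the `empty` and `size` components, and the argument structures `IntOrd` and `StrElt` are assumptions added for illustration.

```sml
signature ORDSET = sig type t val leq : t * t -> bool end
signature SET = sig type t end      (* assumed minimal SET *)

signature DICT =
sig
  type elt
  type key
  type dict
  val empty : dict
  val insert : key * elt * dict -> dict
  val size : dict -> int
end

(* The result signature refers to types of the argument via where type. *)
functor Dict (structure Key : ORDSET
              structure Elt : SET)
  :> DICT where type elt = Elt.t and type key = Key.t =
struct
  type elt = Elt.t
  type key = Key.t
  type dict = (key * elt) list      (* assumed representation *)
  val empty = []
  fun insert (k, e, d) = (k, e) :: d
  val size = length
end

structure IntOrd = struct type t = int fun leq (a : int, b) = a <= b end
structure StrElt = struct type t = string end

structure D = Dict (structure Key = IntOrd structure Elt = StrElt)

(* Accepted: D.key = int and D.elt = string are revealed by where type. *)
val d = D.insert (1, "one", D.empty)
val n = D.size d

(* val bad = length d   would be rejected: D.dict remains abstract *)
```

The same `DICT` signature serves every instance; which types are pinned down is decided at the point of use rather than when the signature is written.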
Dependency • Using where type is similar to instantiation of a parameterized signature. • A parameterized signature would take types as arguments. • But we avoid having to make a decision in advance about which types are parameters and which are results! • Can change from one situation to the next. • Awkward to work with parameterization and instantiation.
Data Abstraction • Concrete = public representation with no restrictions on use. • Abstract = private representation usable only via specific public operations. • Want programmer control over whether a type is abstract or concrete. • All or none is too inflexible.
Data Abstraction • Abstract types are opaque. • No revelation of type identity. • Concrete types are transparent. • Type identity is revealed. • Translucent = opaque + transparent. • Under programmer control.
Data Abstraction • Type definitions in a signature reveal type identity. • Transparent ascription (:) augments target signature with definitions of all opaque types. • where type adds “new” definitions to an “old” signature. • A form of specification inheritance.
Data Abstraction
signature BIGNUM = sig
  type T
  val from_int : int -> T
  val * : T * T -> T
  …
end
structure BigNum :> BIGNUM = …
Data Abstraction
signature GROUP = sig
  type T
  val e : T
  val inv : T -> T
  val * : T * T -> T
end
structure Z :> GROUP = struct
  type T = int
  val e = 0
  val inv = ~
  val op * = op +
end
Data Abstraction
signature GROUP_Z = GROUP where type T = int
structure Z :> GROUP_Z = …
structure Z : GROUP = …
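A sketch of programmer-controlled abstraction using the group example. The structure names `IntAdd` and `A` are introduced here; `IntAdd` implements the group of integers under addition, and the same implementation is sealed once abstractly and once with its carrier revealed.

```sml
signature GROUP =
sig
  type T
  val e : T
  val inv : T -> T
  val * : T * T -> T
end

structure IntAdd =
struct
  type T = int
  val e = 0
  val inv = ~
  val op * = op +     (* the group operation is integer addition *)
end

(* Opaque: clients know nothing about A.T. *)
structure A :> GROUP = IntAdd

(* Translucent: the ascription is still opaque, but where type reveals T. *)
structure Z :> GROUP where type T = int = IntAdd

val three = Z.* (1, 2)      (* accepted because Z.T = int *)
val unit' = A.* (A.e, A.inv A.e)

(* val bad = A.* (1, 2)   would be rejected: int is not A.T *)
```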
Data Abstraction
signature SS_DICT = sig
  type elt = string
  type key = string
  type dict
  …
end
structure SS_Dict :> SS_DICT = …
structure SS_Dict : DICT = …
Two Forms of Sharing • Type sharing: equate abstract types without revealing their representation. • Two “views” of the same abstract type. • Symmetric specification of relationship. • Module sharing: equal modules have the same implementations. • Specify interpretation of a type without imposing abstraction.
Type Sharing
signature IPS = sig
  structure S : FIELD
  structure V : VECTOR
  sharing type S.t = V.scalar
  val dot : V.t * V.t -> S.t
end
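The role of the sharing constraint can be sketched with a functor whose body only type checks because the two abstract types are equated. The contents of `FIELD` and `VECTOR`, the functor name `Dot`, and the integer instances are assumptions; the slides do not define them.

```sml
(* Assumed minimal signatures. *)
signature FIELD =
sig
  type t
  val zero : t
  val add : t * t -> t
  val mul : t * t -> t
end

signature VECTOR =
sig
  type scalar
  type t
  val toList : t -> scalar list
end

functor Dot (structure S : FIELD
             structure V : VECTOR
             sharing type S.t = V.scalar) =
struct
  (* Without the sharing constraint, S.mul could not consume V's scalars. *)
  fun dot (u, v) =
    ListPair.foldl (fn (a, b, acc) => S.add (acc, S.mul (a, b)))
                   S.zero
                   (V.toList u, V.toList v)
end

structure IntField =
struct
  type t = int
  val zero = 0
  fun add (a, b) = a + b
  fun mul (a, b) = a * b
end

structure IntVec =
struct
  type scalar = int
  type t = int list
  fun toList v = v
end

structure D = Dot (structure S = IntField structure V = IntVec)
val r = D.dot ([1, 2, 3], [4, 5, 6])
```

The constraint is symmetric: it says the two types coincide without saying what either one is.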
Structure Sharing • Originally Standard ML supported structure sharing. • Each structure had a unique identity. • Could insist that two structures have the same identity. • But this was dropped, to simplify the language. • O’Caml never had it.
Separate Compilation • Implementation-on-interface dependency. • Client sees only the interface of the provider. • Recompile client only if interface changes. • Implementation-on-implementation dependency. • Client sees the code of the provider, so cannot be separately compiled. • Typified by inheritance: the fragile base-class (FBC) problem.
Separate Compilation • Desirable: arbitrary choice of separated or integrated compilation of modules. • “Cleave” the program at any module boundary. • Requirement: fully syntactic interfaces. • Must capture full static significance of a module in an interface. • The principal signature of a module must be syntactically definable in the language.
Incremental Recompilation • On-implementation dependencies and non-syntactic interfaces prohibit separate compilation. • Cannot cut dependency by an interface. • Alternative: incremental recompilation. • Believe the interface, if present. • Elaborate the code if no interface present. • Do the “least” amount of work.
Separate Compilation and Incremental Recompilation • SML/NJ CM supports IR. • CM files describe makeup of system. • Automatic dependency analysis. • Not really possible, so restrictions are made. • TILT and O’Caml support SC and IR. • Cannot always cleave program, but respects a given cleavage.
First- vs Second-Class Modules • First-class is not always better than second-class! • Must “assume the worst”, which reduces flexibility and expressiveness. • But first-class modules can be stored away, passed around at will. • No restriction on their role as values. • Can conditionally compute modules based on run-time values.
First- vs Second-Class Modules • Second-class is not always better than first class! • Cannot store modules or create them based on run-time conditions. • But unknown modules are relatively benign. • Stronger type sharing relationships.
The Phase Distinction • Two phases: • Compile time (static) • Run time (dynamic). • Static type checking = type checking without testing code equality. • Essential for decidability. • Otherwise the language “does not respect the phase distinction”.