A Type-Theoretic Interpretation of Standard ML Robert Harper and Christopher Stone Presented by Joe Vanderwaart Fall, 2002
One-Slide Summary • Reformulate The Definition of SML, replacing “semantic object” formalism with a typed internal language. • Get benefits of types for reasoning about programs, and for implementation (e.g., TIL). • Account for all the features of SML: • Sealing provides abstraction. • Dependent types, translucent signatures handle sharing. • Generativity -- well, wait and see. • (Other features less important for modules class.) • Elaboration allows all this with decidable well-formedness.
Re-defining Standard ML • The Definition defines the static semantics of SML using an elaboration relation. • A program is well-formed iff it elaborates to something, by definition. • Target of elaboration is untyped “semantic objects”, including generative stamps. • HS is an alternative to the Definition. • Instead of semantic objects, an internal language (IL) based on Harper-Mitchell and Harper-Lillibridge. • A program is well-formed iff it translates to something. (Thm.: That something will be well-typed.) • Definition and HS don’t agree on everything.
Why do this? I (Joe) can think of these reasons: • SML is big, IL is small. Many SML constructs (e.g. datatypes, exceptions) are really several ideas rolled together. In IL, concerns are separated; no duplicated effort, everything clear. • SML lacks principal signatures, so type-checking is dubious. The avoidance problem makes it hard to see how to type-check SML. HS claim that the IL has all the principal signatures it needs to do SML elaboration. (More on the avoidance problem later.)
The Internal Language Based on H-M, H-L formalisms.
• Modules can contain type, value and structure components:
  mod ::= … | [sbnds] | …
  sbnds ::= · | sbnds, sbnd
  sbnd ::= lab ▷ var = con | lab ▷ var = exp | lab ▷ var = mod
• Signatures can give definitions for constructors:
  sig ::= … | [sdecs] | …
  sdecs ::= · | sdecs, lab ▷ dec
  dec ::= var:con | var:sig | var:knd | var:knd=con | …
• Note label/variable distinction!
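The label/variable distinction can be made concrete with a toy data model. This is only a sketch: the class names `Sbnd` and `Struct`, the helper `project`, and the use of plain strings for constructors and expressions are all assumptions made for illustration, not the paper's definitions.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Sbnd:
    """One structure binding: lab ▷ var = body."""
    lab: str                      # external name, used when projecting from outside
    var: str                      # internal name, in scope for later bindings
    body: Union[str, "Struct"]    # toy con/exp are plain strings; modules are Structs


@dataclass(frozen=True)
class Struct:
    """A structure [sbnds]: an ordered sequence of bindings."""
    sbnds: Tuple[Sbnd, ...] = ()


def project(struct: Struct, lab: str):
    """mod.lab: select a component by its label, never by its variable."""
    for sbnd in struct.sbnds:
        if sbnd.lab == lab:
            return sbnd.body
    raise KeyError(lab)


# Hypothetical example: [S ▷ s_1 = [t ▷ t_1 = int, x ▷ x_1 = 3]]
inner = Struct((Sbnd("t", "t_1", "int"), Sbnd("x", "x_1", "3")))
outer = Struct((Sbnd("S", "s_1", inner),))
```

Here `project(project(outer, "S"), "t")` follows labels only; the variables `s_1` and `t_1` are invisible from outside, so they could be renamed without changing what clients of the structure see.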
The IL, Continued
• Use the dot notation to extract a member of a structure:
  con ::= … | mod_v.lab (why the restriction? -- later.)
  exp ::= … | mod.lab
  mod ::= … | mod.lab
• As in H-L, must selfify away dependencies before extraction:
The IL: Selfification Module values can be given precise signatures (mod_v.lab not necessarily a value?)
The IL: Functors
• Functors in the IL...
  mod ::= … | λvar:sig. mod
  ...have dependent function types, and can be either total or partial.
  sig ::= … | (var:sig) → sig | (var:sig) →tot sig
  (notation changed for TexPoint reasons...)
• Most functors are partial. Total ones are used for datatypes, not so relevant for SML modules. (?)
The IL: Functors (2) Typing rules for functors are familiar: Note: again like H-L, functors must not be dependent when applied.
The IL: Sealing Can seal a module with a signature. What signatures can mod have? • Can forget definitions. • Can forget definitions in substructures. • (But can’t forget components.)
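The three bullets above can be read as a small subsignature check. The sketch below assumes a toy encoding that is not in the paper: a signature is an ordered dict from label to `(kind, info)`, where an abstract type is `("type", None)`, a transparent type is `("type", "int")`, a value is `("val", its type)`, and a substructure is `("struct", nested signature)`. It ignores dependencies between components and real type equivalence.

```python
def can_seal(have, want):
    """True if a module with signature `have` may be sealed with `want`."""
    # Components cannot be forgotten: same labels, in the same order.
    if list(have) != list(want):
        return False
    for lab in have:
        kind_h, info_h = have[lab]
        kind_w, info_w = want[lab]
        if kind_h != kind_w:
            return False
        if kind_h == "type":
            # A definition may be forgotten (info_w is None) but not changed.
            if info_w is not None and info_w != info_h:
                return False
        elif kind_h == "struct":
            # Definitions in substructures may be forgotten too.
            if not can_seal(info_h, info_w):
                return False
        elif info_h != info_w:
            # Value specifications must agree exactly in this toy model.
            return False
    return True


# Hypothetical principal signature of some module.
have = {
    "t": ("type", "int"),
    "x": ("val", "t"),
    "Sub": ("struct", {"u": ("type", "bool")}),
}
```

Sealing `have` with a signature that makes `t` or `Sub.u` abstract is accepted, while a signature that omits `x` is rejected, matching "can't forget components".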
Summary of IL • Essentially, module language is a second-class version of H-L translucent sum formalism. • Familiar features: Definitions in signatures; selfification; subsumption; non-dependency restrictions; sealing
Elaboration • Defined as a bunch of judgment forms; variations on: Γ ⊢ EL-phrase ⇝ IL-phrase : IL-class meaning: EL-phrase elaborates to IL-phrase, which is described by IL-class. • Paper mentions 7 important things that go on during elaboration. • Of these, 4 are important to modules.
Key Aspects of Elaboration • Identifier Resolution Mapping EL identifiers to their IL equivalents. • Signature Matching Has both inclusive and coercive aspects • Sharing Constraints in EL add definitions to IL signatures • Generativity (and the A. P.) Named form like Leroy; renaming for avoidance.
Identifier Resolution Assumptions about variable translation: • Every ML identifier l has a corresponding IL label ⌈l⌉ (written as overbar in paper). • But there are infinitely many IL labels outside the range of ⌈·⌉. • Further, mapping handles “separate namespaces” (i.e., expression vars, type names, signature names, etc.).
Identifier Resolution (2) • In general, a sequence of labels in EL elaborates to a path in IL: Judgment: Γ ⊢ctx labs ⇝ path : class. • Basic rules (for values, types and modules) have the form: • Separate judgment (⊢sig) for lookup in a signature.
Identifier Resolution (3) • Context can mark a structure as “open”: • To do a long identifier, look in substructure:
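As a rough illustration of the lookup described on these slides, the sketch below resolves a label sequence to a path. The encoding is an assumption made for this example only: a context is a list of `(label, var, class, is_open)` entries with the oldest first, a structure's class is a nested dict from labels to classes, and `is_open` plays the role of the "open" marking. The function names are hypothetical.

```python
def lookup_sig(sig, labs, path):
    """Resolve `labs` inside a signature, extending `path` with each label."""
    lab, rest = labs[0], labs[1:]
    if lab not in sig:
        return None
    cls, here = sig[lab], path + "." + lab
    if not rest:
        return here, cls
    # A long identifier continues into the substructure's signature.
    return lookup_sig(cls, rest, here) if isinstance(cls, dict) else None


def lookup_ctx(ctx, labs):
    """Resolve `labs` in a context, most recent binding first."""
    for lab, var, cls, is_open in reversed(ctx):
        if lab == labs[0]:
            if len(labs) == 1:
                return var, cls
            return lookup_sig(cls, labs[1:], var) if isinstance(cls, dict) else None
        if is_open and isinstance(cls, dict):
            # An open structure exposes its components as if bound directly.
            found = lookup_sig(cls, labs, var)
            if found is not None:
                return found
    return None


# Hypothetical context: a structure List, then an opened anonymous structure.
ctx = [
    ("List", "v1", {"length": "val", "Sub": {"t": "type"}}, False),
    ("hidden1", "v2", {"x": "val"}, True),
]
```

With this context, `lookup_ctx(ctx, ["x"])` finds `x` through the opened structure and yields the path `v2.x`, while `lookup_ctx(ctx, ["List", "Sub", "t"])` walks through substructures to `v1.Sub.t`.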
Basic Structure Elaboration • For a named module, do context lookup: • For explicit module, elaborate declarations: (Rules for some forms of declarations are complicated.)
Easy Declarations • Structure Declaration (simplified): • Functor Declaration (simplified):
More Declarations • Type Definition (simplified -- monomorphic): (Note definition in sdec gives principal signature) • Open (simplified, note use of “star convention”):
Signature Matching • Check out the opaque signature ascription rule: • ⊢sub judgment is coercive signature matching. “The module mod is the one obtained by dropping and instantiating components from path (which has signature sig) until it matches sig’; and the principal signature of mod is sig’’.”
Coercive Matching • In the interest of time, won’t look at “coercion compilation” rules. • Suffice it to say that the coercion: • Throws away components not mentioned in supersignature. • Instantiates polymorphic components to match less polymorphic specifications. • Generates equality functions for transparent types to comply with eqtype specifications. • Propagates as many definitions as possible (useful for transparent ascription). • Signature subtyping in IL handles forgetting definitions.
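Only the first bullet (throwing away components) is easy to show in a few lines. The following is a minimal sketch under an assumed encoding, not the paper's coercion compilation: a module is a dict from labels to components, a target signature is a dict whose values are either a nested dict (substructure) or any other marker. Instantiation of polymorphism, equality functions, and definition propagation are all left out.

```python
def coerce(mod, sig):
    """Build a new module containing exactly the components `sig` mentions."""
    out = {}
    for lab, spec in sig.items():
        if lab not in mod:
            raise TypeError(f"missing component {lab}")
        if isinstance(spec, dict):
            if not isinstance(mod[lab], dict):
                raise TypeError(f"{lab} is not a structure")
            # Coerce substructures recursively.
            out[lab] = coerce(mod[lab], spec)
        else:
            out[lab] = mod[lab]
    return out


# Hypothetical module with more components than the target signature asks for.
mod = {"t": "int", "x": 3, "helper": 99, "Sub": {"u": "bool", "extra": 0}}
sig = {"t": "type", "x": "val", "Sub": {"u": "type"}}
```

The result of `coerce(mod, sig)` is a fresh module with `helper` and `Sub.extra` dropped, which is the "coercive" half of matching; forgetting definitions is then left to signature subtyping, as the last bullet says.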
Signature Elaboration • Basic rule (typo in paper?): (paper is ambiguous as to whether specs elaborate to sdecs or to sigs.) • Some easy specification rules:
Type Specifications Simplified rules for type specifications: (Actual rules allow polymorphism, multiple “simultaneous” definitions.)
Sharing • Here’s a (simplified) rule for “where type”: • ⊢wt judgment “patches in” the definition for labs’, provided that type component is abstract. • labs’ (def’n of labs) is non-deterministically chosen.
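The "patching" step can be sketched as a function on toy signatures. The encoding is assumed for illustration: a signature is a dict whose values are either a nested dict (substructure) or a `("type", definition)` pair with `None` meaning abstract. The name `patch` is hypothetical, and the non-deterministic choice of the definition is not modelled; the caller simply supplies it.

```python
def patch(sig, labs, defn):
    """Return a copy of `sig` with the abstract type at `labs` defined as `defn`."""
    lab, rest = labs[0], labs[1:]
    if lab not in sig:
        raise KeyError(lab)
    new = dict(sig)  # leave the original signature untouched
    if rest:
        if not isinstance(sig[lab], dict):
            raise ValueError(f"{lab} is not a substructure")
        new[lab] = patch(sig[lab], rest, defn)
    else:
        entry = sig[lab]
        # Only an abstract type component may receive a definition.
        if isinstance(entry, dict) or entry[0] != "type" or entry[1] is not None:
            raise ValueError(f"{lab} is not an abstract type")
        new[lab] = ("type", defn)
    return new


# Hypothetical signature: sig type t structure Sub : sig type u = bool end end
sig = {"t": ("type", None), "Sub": {"u": ("type", "bool")}}
```

Patching `t` succeeds and yields a signature where `t` is transparent, while patching `Sub.u` fails because that component already has a definition.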
Sharing (2) • “sharing type” is similar, but symmetrical:
Generativity • Opaque ascription modeled by sealing. • Types sealed inside a module can never be judged equal to anything but themselves. • Datatype generativity is captured by elaborating datatypes as abstract types, packaged inside modules and sealed. • But the real module system issue is...
Functor Generativity From the paper: Following Leroy we capture this behavior by imposing the requirement that module expressions be restricted to “named form”. This means that every non-trivial module expression must be bound to a module identifier before it can be used. This restriction is reflected in the grammar by, e.g., the requirement that functor arguments be structure identifiers, rather than arbitrary structure expressions. There is no loss of generality in assuming that programs are written in named form; we can make a prepass [that] introduces bindings for non-trivial module [expressions].
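One possible shape for such a prepass is sketched below. It is purely hypothetical: the quoted passage does not spell the prepass out, and the encoding (a module expression is either an identifier string or a tuple `("app", functor, argument)`), the fresh-name scheme `m_1, m_2, ...`, and the innermost-first binding order are all assumptions of this example.

```python
import itertools


def to_named_form(modexp, fresh):
    """Return (bindings, modexp') where every functor argument is an identifier.

    `bindings` is a list of (name, module expression) pairs to be declared,
    in order, before modexp' is used.
    """
    if isinstance(modexp, str):
        return [], modexp  # identifiers are already trivial
    tag, functor, arg = modexp
    bindings, arg2 = to_named_form(arg, fresh)
    if not isinstance(arg2, str):
        # Non-trivial argument: bind it to a fresh structure identifier.
        name = next(fresh)
        bindings.append((name, arg2))
        arg2 = name
    return bindings, (tag, functor, arg2)


def fresh_names():
    """Hypothetical supply of identifiers that cannot clash with user names."""
    return (f"m_{i}" for i in itertools.count(1))


# F(G(H(A))) in the toy encoding.
nested = ("app", "F", ("app", "G", ("app", "H", "A")))
```

Running `to_named_form(nested, fresh_names())` binds `H(A)` to `m_1` and `G(m_1)` to `m_2`, leaving `F(m_2)`. Because each application result gets its own identifier, types projected from two separate applications are projected from different names.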
What? • Restriction to named form is reflected in the grammar of the external language. • The “prepass” is part of SML’s semantics, so why isn’t its definition in the paper?! (What if there’s more than one way to do it?) • Since this is handled in the EL, the value restriction on constructor projection in the IL must not be “for generativity”. (I’m guessing it’s just because you don’t want your types to have store or I/O effects.)
Functor Application • Both matching and generativity are happening: • But generativity isn’t really apparent in the rule... It comes from syntax of projecting a type from a struct.
The Avoidance Problem • Not made a big deal of in the H-S paper. • Remember: • Avoidance problem means lack of principal signatures due to variables “leaving” scope • Need principal signatures for every known type-checking algorithm (?) • Want principal signatures for “cleaving” programs at arbitrary module boundaries. • Now, let’s see how to “solve” this “problem”.
Avoidance Problem • Look again at the “open” rule: • Notice: introduces a binding that was not there in the EL. • A similar trick “solves” the avoidance problem: keep bindings around so you don’t have to avoid names!
Avoidance Problem (2) • This trick is at work in the “local” rule: Never need to “avoid” anything, because the binding is really there. In IL terms, “local” does nothing.
Elaboration and Avoidance • Fact: The EL does not have principal signatures. • Fact: The IL does not have principal signatures. (Why not? Could it?) • Claim: All modules produced by the elaborator have principal signatures in the IL. • Terminology: SML has principal signatures, but not syntactic principal signatures. (I, Joe, don’t like this.) • Claim 3 in long paper: All IL judgments considered by the elaborator are decidable. • So: the avoidance problem can be worked around by doing elaboration. (But cleaving still a problem?)
Joe’s Moral: For fairness, be careful about saying a language “has” the avoidance “problem” if it is conceivable that elaboration might solve it.
Summary • HS give an alternative definition of SML by translating into a typed IL. • Gain benefits of types for implementation • Using IL reduces complexity of type theory • Elaboration solves the avoidance problem • Rules are hard to read, because SML is a big language. • Some aspects of theory still imperfect: • Treatment of generativity is ad hoc. • Can a version be done without the A.P.?