A Type Theory for Memory Allocation and Data Layout. Leaf Petersen, Robert Harper, Karl Crary and Frank Pfenning, Carnegie Mellon University
Views of data • High-level languages • Abstract view of data, characterized by operations • e.g. pairs: • Introduction: (e1,e2) : t1 × t2 • Elimination: fst e : t1, snd e : t2 • Low-level languages • Concrete view of data, characterized by layout in memory • e.g. C structs: • Contiguous layout • Memory size determined by type
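The contrast between the two views can be sketched in Python, with the standard `struct` module standing in for a C struct. The format string `"<ii"` (two little-endian 32-bit integers) and the name `PAIR_LAYOUT` are assumptions chosen for illustration; they do not come from the talk.

```python
import struct

# High-level view: a pair is characterized by its operations.
pair = (3, 4)        # introduction: (e1, e2)
fst, snd = pair      # elimination: fst e, snd e

# Low-level view: a C-like struct is characterized by its layout.
# "<ii" means two little-endian 32-bit ints stored contiguously (assumed format).
PAIR_LAYOUT = struct.Struct("<ii")
raw = PAIR_LAYOUT.pack(fst, snd)

# The memory size is determined by the type alone, not by the stored values.
print(PAIR_LAYOUT.size)   # 8
print(raw.hex())          # 0300000004000000
```

The high-level pair says nothing about where its components live; the packed form fixes both their order and their size.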
Data layout • Usually programmers don’t care • But sometimes they have to • Marshalling, interaction with low-level devices, precise control of initialization, interoperability • Generally no type safety • Compilers have to care • Represent high-level data abstractions • Allocation and initialization code
(3,(4,5)) : int × (int × int) [Figure: three candidate memory layouts for the nested pair]
Type theory for data layout • Expose the fine structure • Expose memory layout in types • Implementation choices explicit • High-level object types defined in terms of low-level memory types • High-level operations on objects broken down into low-level operations on memory • What is the fine structure of memory?
Initialization • Data objects • Created by initializing raw memory. • Initialization changes types • e.g. from ns to int • Commonly dealt with via linearity • New memory is linear • No aliases • Linear type theory handles re-typing
Adjacency • Memory provides a primitive notion of adjacent items: e.g. 3 next to 4. • Large objects composed of adjacent smaller objects • Sub-objects referenced by offsets or interior pointers.
Associativity • Adjacency is associative: the same memory layout is described by: • (3 next to 4) next to 5 • 3 next to (4 next to 5) • But not commutative! • 3 next to 4 ≠ 4 next to 3
Indirection • Not all objects are adjacent • Memory supports a notion of indirection (pointers or labels). • Refer to non-adjacent data via indirection • 3 next to (pointer to (4 next to 5))
Ordered Type Theory • Linear type theory handles initialization • Doesn’t capture other memory properties • Ordered type theory • Variables used exactly once (linear) • Variables may not be permuted. • Adjacent variables remain adjacent • No weakening, contraction, or exchange. • Claim: Ordered constructs admit a natural interpretation as adjacency and indirection.
Variables and Resources • Typing judgments: • Ordering of x’s does not matter. • Unrestricted variables, bound to small objects • Ordering and usage of a’s does matter. • Bound to memory • Adjacent variables bound to adjacent memory
Ordered product • Ordered product (fuse): • Ordered products model adjacency
Adjacency • 3 next to 4 • 3 ∙ 4 : int ∙ int • 3 next to 4 next to 5 • 3 ∙ (4 ∙ 5) : int ∙ (int ∙ int) • (3 ∙ 4) ∙ 5 : (int ∙ int) ∙ int
Memory properties • Associativity: • (t1 ∙ t2) ∙ t3 and t1 ∙ (t2 ∙ t3) are isomorphic • Functions witness the isomorphism • Non-commutativity: • t1 ∙ t2 and t2 ∙ t1 are not isomorphic • No function mapping one to the other (in general)
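These two properties can be made concrete with a small Python model in which an ordered product is a tree and memory is the flattened sequence of its leaves. The names `Fuse`, `layout`, `assoc_right` and `assoc_left` are hypothetical, and the model checks layouts at run time rather than in a type system.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Fuse:
    """Ordered product: `left` is laid out immediately before `right`."""
    left: "Term"
    right: "Term"


Term = Union[int, Fuse]


def layout(term: Term) -> list:
    """Flatten a term into the sequence of words it occupies in memory."""
    if isinstance(term, Fuse):
        return layout(term.left) + layout(term.right)
    return [term]


def assoc_right(term: Fuse) -> Fuse:
    """Re-associate (a fuse b) fuse c into a fuse (b fuse c)."""
    if not isinstance(term.left, Fuse):
        raise TypeError("expected a left-nested ordered product")
    return Fuse(term.left.left, Fuse(term.left.right, term.right))


def assoc_left(term: Fuse) -> Fuse:
    """Re-associate a fuse (b fuse c) into (a fuse b) fuse c."""
    if not isinstance(term.right, Fuse):
        raise TypeError("expected a right-nested ordered product")
    return Fuse(Fuse(term.left, term.right.left), term.right.right)


left_nested = Fuse(Fuse(3, 4), 5)
right_nested = Fuse(3, Fuse(4, 5))

print(layout(left_nested) == layout(right_nested))    # True
print(assoc_right(left_nested) == right_nested)       # True
print(layout(Fuse(3, 4)) == layout(Fuse(4, 3)))       # False
```

The two re-association functions leave the memory image untouched, which is why they can witness an isomorphism. A swap function would have to move data, so it is not available in general.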
Indirection • Ordered modality models indirection • !M : !t corresponds to a pointer to M • Non-linear, un-ordered term
(3,(4,5)) : int × (int × int) [Figure: three candidate memory layouts for the nested pair]
(3,(4,5)) : int × (int × int) • int × (int × int) ⇝ !(int ∙ !(int ∙ int)) • (3,(4,5)) ⇝ !(3 ∙ !(4 ∙ 5))
(3,(4,5)) : int × (int × int) • int × (int × int) ⇝ !(!int ∙ !(!int ∙ !int)) • (3,(4,5)) ⇝ !(!3 ∙ !(!4 ∙ !5))
(3,(4,5)) : int × (int × int) • int × (int × int) ⇝ !(int ∙ (int ∙ int)) • (3,(4,5)) ⇝ !(3 ∙ (4 ∙ 5))
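The three translations of (3,(4,5)) differ only in where indirections are placed. A rough way to compare them is a toy word-addressed heap in Python, where `!` becomes a call to `alloc` and fuse becomes passing several words to the same call. `Heap`, `Ptr` and the three builder functions are hypothetical names introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ptr:
    """An indirection: the address (index) of an object in the toy heap."""
    addr: int


class Heap:
    def __init__(self):
        self.words = []

    def alloc(self, *words):
        """Store the given words adjacently and return a pointer to the first."""
        addr = len(self.words)
        self.words.extend(words)
        return Ptr(addr)


def nested_boxed(heap):
    """!(3 fuse !(4 fuse 5)): the inner pair sits behind a pointer."""
    inner = heap.alloc(4, 5)
    return heap.alloc(3, inner)


def fully_boxed(heap):
    """!(!3 fuse !(!4 fuse !5)): every component sits behind a pointer."""
    p4, p5 = heap.alloc(4), heap.alloc(5)
    inner = heap.alloc(p4, p5)
    p3 = heap.alloc(3)
    return heap.alloc(p3, inner)


def flat(heap):
    """!(3 fuse (4 fuse 5)): one object of three adjacent words."""
    return heap.alloc(3, 4, 5)


for build in (nested_boxed, fully_boxed, flat):
    heap = Heap()
    root = build(heap)
    print(build.__name__, "uses", len(heap.words), "words; root at", root.addr)
```

In this model the flat layout needs three words, the nested layout four, and the fully boxed layout seven, which is the kind of representation choice the types make explicit.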
Explicit Allocation • Ordered type theory • Fine structure of data layout • But not allocation • For example: !(x ∙ x) • Each time x is instantiated, new object • Initialized atomically • Make allocation explicit • Remove !M from syntax • Add allocation primitives to introduce !t
Memory Allocation • A well-known allocation protocol for copying garbage collectors: • Reserve: obtain raw, un-initialized space. • Initialize: assign values to individual locations. • Allocate: baptize some or all as valid objects.
Example: Memory Allocation • x = (0,(1,2)) [Figure: heap diagram of the Reserve, Initialize, and Allocate steps, with pointers labeled AP and LP]
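The reserve, initialize, allocate protocol can be simulated with a bump allocator in Python. This is only a sketch: `ToyHeap` is a hypothetical class, reading AP and LP as an allocation pointer and a limit pointer is an assumption, and the nested layout chosen for x = (0,(1,2)) is one possibility rather than the layout drawn on the slide.

```python
class ToyHeap:
    """A toy model of the reserve / initialize / allocate protocol."""

    UNINIT = object()   # marks a raw, un-initialized word

    def __init__(self):
        self.words = []
        self.ap = 0     # allocation pointer: start of reserved space (assumed meaning)
        self.lp = 0     # limit pointer: end of reserved space (assumed meaning)

    def reserve(self, n):
        """Obtain n raw words, dropping any earlier unused reservation."""
        self.words[self.ap:] = [self.UNINIT] * n
        self.lp = self.ap + n
        return self.ap

    def initialize(self, addr, value):
        """Assign a value to one reserved location."""
        if not (self.ap <= addr < self.lp):
            raise ValueError("address is outside the reserved region")
        self.words[addr] = value

    def allocate(self, n):
        """Turn the next n reserved words into a valid object."""
        end = self.ap + n
        if end > self.lp:
            raise ValueError("not enough reserved space")
        if any(w is self.UNINIT for w in self.words[self.ap:end]):
            raise ValueError("object is not fully initialized")
        addr, self.ap = self.ap, end
        return addr


heap = ToyHeap()
base = heap.reserve(4)                       # Reserve four raw words

heap.initialize(base, 1)                     # Initialize the inner pair
heap.initialize(base + 1, 2)
inner = heap.allocate(2)                     # Allocate only part of the space

heap.initialize(base + 2, 0)                 # Initialize the outer pair
heap.initialize(base + 3, ("ptr", inner))
x = heap.allocate(2)                         # Allocate the rest

print(heap.words)   # [1, 2, 0, ('ptr', 0)]
```

Here misuse of the protocol, such as allocating before initializing, is caught by a run-time check. The point of the talk is that a type system can rule it out statically.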
Memory Allocation • Type system separates terms and expressions • Terms M: no effects • Expressions E: have effects • Allocation is an effect • Allocation primitives are expressions
Allocating a Pair • Allocate (1,2): • Reserve space at a. • Create names for parts. Resource a is used up! • Initialize a1, using it up. • Re-introduce b1:int • Fuse parts and allocate.
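The code for this slide was not recovered, but the annotated steps can be mimicked in Python with resources that may be used exactly once. Everything here is hypothetical: `Resource`, `UsedUp`, `reserve`, `split`, `assign`, `fuse` and `alloc` are illustrative names, `NS` is an assumed marker for an un-initialized word, and linearity is enforced by a run-time check instead of by typing.

```python
class UsedUp(Exception):
    """Raised when a linear resource is used a second time."""


class Resource:
    """A run of adjacent words that may be used exactly once."""

    def __init__(self, words):
        self._words = list(words)
        self._live = True

    def take(self):
        """Consume the resource and return its words."""
        if not self._live:
            raise UsedUp("resource was already used")
        self._live = False
        return self._words


NS = None   # assumed marker for an un-initialized word


def reserve(n):
    """Reserve n raw words."""
    return Resource([NS] * n)


def split(a):
    """Create names for the two words of `a`; `a` itself is used up."""
    w1, w2 = a.take()
    return Resource([w1]), Resource([w2])


def assign(a, value):
    """Initialize a one-word resource, using it up and re-introducing it."""
    a.take()
    return Resource([value])


def fuse(b1, b2):
    """Put two resources back side by side, using both up."""
    return Resource(b1.take() + b2.take())


def alloc(b):
    """Turn a fully initialized resource into a finished object."""
    words = b.take()
    if any(w is NS for w in words):
        raise ValueError("object is not fully initialized")
    return tuple(words)   # stands in for a pointer to the object


a = reserve(2)             # Reserve space at a
a1, a2 = split(a)          # Create names for parts; a is used up
b1 = assign(a1, 1)         # Initialize a1, using it up; re-introduce b1
b2 = assign(a2, 2)
x = alloc(fuse(b1, b2))    # Fuse parts and allocate
print(x)                   # (1, 2)
```

Calling `split(a)` a second time raises `UsedUp`, which is the run-time analogue of the type error an ordered, linear type system would report.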
Coalescing Reservation Allocate two pairs: (1,2) and (3,4)
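The code on these slides was not recovered. One plausible reading of coalescing is that two separate reservations are merged into a single larger one, as in the Python sketch below. `CountingHeap` and both functions are hypothetical, and the allocate step is omitted to keep the comparison short.

```python
class CountingHeap:
    """A toy heap that counts how often space is reserved."""

    def __init__(self):
        self.words = []
        self.reserve_calls = 0

    def reserve(self, n):
        """Obtain n raw words in one step; returns the address of the first."""
        self.reserve_calls += 1
        base = len(self.words)
        self.words.extend([None] * n)
        return base

    def store(self, addr, value):
        """Initialize one reserved word."""
        self.words[addr] = value


def two_reservations(heap):
    """Reserve and fill each pair separately."""
    p = heap.reserve(2)
    heap.store(p, 1)
    heap.store(p + 1, 2)
    q = heap.reserve(2)
    heap.store(q, 3)
    heap.store(q + 1, 4)
    return p, q


def coalesced(heap):
    """Reserve space for both pairs at once, then fill it."""
    base = heap.reserve(4)
    for offset, value in enumerate((1, 2, 3, 4)):
        heap.store(base + offset, value)
    return base, base + 2


separate, merged = CountingHeap(), CountingHeap()
print(two_reservations(separate), separate.words, separate.reserve_calls)
print(coalesced(merged), merged.words, merged.reserve_calls)
```

Both strategies leave the same words in the heap and return the same two addresses; the coalesced version simply reserves once instead of twice.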
Summary • Type theory for describing data layout • Adjacency requirements. • Precise control over representations. • Type system for allocation: • Allocate raw memory. • Initialize, destructively changing types. • Ensures correct use of allocation protocol. • Permits code motion optimizations.
What I’m not telling you • It’s more subtle than it seems. • Plain ordered λ-calculus doesn’t work. • Need notion of size preserving terms, other refinements. • For details see the paper • Technical presentation and examples. • Interpretation of a λ-calculus with pairs.
Current and Future Work • POPL paper • Only finite products • Technical Report: • Sums, recursive types, ordered functions. • Extended coercion language. • Ongoing • Dynamic extent (arrays) • Other allocation models
Conclusion • Ordered type theory is a natural framework for modeling data layout. • Low-level issues are dealt with realistically in a λ-calculus setting. • Correctness of allocation and initialization protocols can be captured in the type system.