Empirical Study of Object-layout Strategies and Optimization Techniques. M.Sc. seminar (in the proceedings of ECOOP’2000). Natalie Eckel Supervisor: Dr. Joseph (Yossi) Gil Computer Science Department Technion - The Israel Institute of Technology. Outline.
Empirical Study of Object-layout Strategies and Optimization Techniques. M.Sc. seminar (in the proceedings of ECOOP’2000). Natalie Eckel. Supervisor: Dr. Joseph (Yossi) Gil. Computer Science Department, Technion - The Israel Institute of Technology
Outline
• Overhead incurred due to multiple inheritance:
  • VPTRs and VBPTRs
  • The separate compilation dilemma
• Hierarchies used in our experiments
• Distribution of object size
• Optimization techniques:
  • Elimination of transitive virtual inheritance
  • Inlining virtual bases
  • Bidirectional layout
  • Hermaphrodite bidirectional layout
  • Packing VBPTRs
The Subobject Rule
• Basic rule of OO: if class B inherits from class A, then every object of B must have inside it a subobject of A.
• Example (B. Meyer): if a SoftwareEngineer is an Engineer, then there is a part in every software engineer which is an engineer.
• Rationale: procedures and methods expecting objects of A should also be able to operate on an object of type B.
[Figure: an Engineer subobject inside a Software Engineer object]
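The subobject rule can be illustrated with a minimal C++ sketch. The class names follow the slide's example; the members `license` and `language` and the helper functions are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical classes named after the slide's example.
struct Engineer {
    std::string license = "PE";
    std::string describe() const { return "engineer:" + license; }
};

struct SoftwareEngineer : Engineer {
    std::string language = "C++";
};

// Written against Engineer only; it knows nothing about SoftwareEngineer.
std::string inspect(const Engineer& e) { return e.describe(); }

// True if the Engineer subobject lies within the bytes of the full object.
bool subobject_is_inside(const SoftwareEngineer& se) {
    const char* outer = reinterpret_cast<const char*>(&se);
    const char* inner =
        reinterpret_cast<const char*>(static_cast<const Engineer*>(&se));
    return inner >= outer &&
           inner + sizeof(Engineer) <= outer + sizeof(SoftwareEngineer);
}

// Usage: both checks hold for any SoftwareEngineer object.
void demo_subobject_rule() {
    SoftwareEngineer se;
    assert(inspect(se) == "engineer:PE");
    assert(subobject_is_inside(se));
}
```

Calling `demo_subobject_rule()` from `main` runs the two checks: code written for `Engineer` accepts a `SoftwareEngineer`, and the `Engineer` part is physically contained in the larger object.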
The VPTR
• VPTR: virtual table pointer.
  • A pointer leading from every object and every subobject to a table of virtual functions (and other RTTI).
• Single inheritance: the VPTR can be shared between an object, its subobject, its sub-subobject, etc.
  • The VPTR is laid out at offset 0.
• Multiple inheritance: the VPTR can be shared with only one subobject.
[Figure: a Software Engineer object and its Engineer subobject sharing one VPTR that leads to the virtual functions table (VTBL)]
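A small C++ sketch of the two cases, under the assumption that the compiler implements virtual dispatch with a VPTR (the C++ standard does not mandate this, although mainstream compilers do). All class and function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical classes; only the shapes of the hierarchies matter.
struct Engineer {
    virtual ~Engineer() = default;
    virtual int grade() const { return 1; }
};
struct SoftwareEngineer : Engineer {
    int grade() const override { return 2; }
};

struct Teacher { virtual ~Teacher() = default; int courses = 0; };
struct Student { virtual ~Student() = default; int credits = 0; };
struct TA : Teacher, Student {};   // two roots, non-virtual inheritance

// Calls through the base interface; the call is dispatched dynamically.
int grade_via_base(const Engineer& e) { return e.grade(); }

// True if the two base subobjects of a TA start at different addresses,
// so at most one of them can share its VPTR slot with the TA object itself.
bool bases_at_distinct_addresses(const TA& ta) {
    const void* t = static_cast<const Teacher*>(&ta);
    const void* s = static_cast<const Student*>(&ta);
    return t != s;
}

void demo_vptr() {
    SoftwareEngineer se;
    assert(grade_via_base(se) == 2);          // single inheritance
    TA ta;
    assert(bases_at_distinct_addresses(ta));  // multiple inheritance
}
```

On typical implementations `sizeof(TA)` is at least two pointers larger than the data members alone, reflecting one VPTR per root, but that figure is implementation-dependent and is not asserted here.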
The VBPTR
• VBPTR: virtual base pointer
  • Answers the question: where is the subobject?
  • Occurs only in the multiple-inheritance case.
• Rationale: the diamond problem
  • It is impossible for class Person to have a fixed offset with respect to both Teacher and Student.
• Solution: VBPTRs that lead from each subobject to the shared Person subobject.
[Figure: layout of a Teaching Assistant (TA) object with Teacher, Student, and Person subobjects, showing its VPTRs and VBPTRs]
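The diamond can be sketched in C++ as follows. The class names follow the slide; the data members and helper functions are hypothetical. How the compiler locates the shared subobject (a VBPTR in the object, or an offset stored in the virtual table) is an implementation detail, so the sketch only checks what the language guarantees.

```cpp
#include <cassert>
#include <cstddef>

struct Person { virtual ~Person() = default; int id = 7; };
struct Teacher : virtual Person { int courses = 0; };
struct Student : virtual Person { int credits = 0; };
struct TA : Teacher, Student {};

// Byte distance from a Teacher subobject to the Person subobject it shares.
// The up-cast crosses a virtual edge, so the location is looked up at run time.
std::ptrdiff_t person_offset(const Teacher& t) {
    const Person* p = &t;
    return reinterpret_cast<const char*>(p) -
           reinterpret_cast<const char*>(&t);
}

// True if both inheritance paths of a TA lead to one and the same Person.
bool person_is_shared(const TA& ta) {
    const Person* via_teacher = static_cast<const Teacher*>(&ta);
    const Person* via_student = static_cast<const Student*>(&ta);
    return via_teacher == via_student;
}

void demo_vbptr() {
    TA ta;
    assert(person_is_shared(ta));
    assert(ta.id == 7);   // unambiguous: there is only one Person subobject
}
```

On common implementations `person_offset` returns different values for a standalone `Teacher` and for the `Teacher` subobject of a `TA`, which is exactly why a fixed offset cannot be used; the concrete numbers depend on the compiler and are not asserted.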
No Dynamic Measurements
• Objective: estimate the saving for all possible object sizes
• The chicken and egg problem: people may not use MI because of the current overhead.
• Dynamic measurement adds other factors:
  • Selection of inputs
  • How to deal with libraries?
  • Correlated instantiations
  • Cache
  • …
Overheads of Multiple Inheritance
• Space overhead:
  • VPTR: if a class X inherits from n “roots”, then its objects will have at least n VPTRs in their layout.
  • VBPTR: a pointer to every “shared” base, usually more than one.
• Time overhead:
  • VPTR: add/subtract an offset, i.e., “this adjustment”, in down- and up-casts (not dealt with here).
  • VBPTR: follow pointers in up-casts.
• Inessential VBPTRs (used by some compilers): add a transitive edge to shortcut every chain of VBPTRs.
  • Minimizes time overhead.
  • Induces space overhead.
Compilation Models
• Given an inheritance link (a,b), is it
  • Simple inheritance (no diamonds)?
  • Virtual inheritance (a diamond might show up later)?
• Whole program analysis
  • The whole picture is available for compilation
  • The compiler assigns virtual inheritance to solve diamond problems
• Separate compilation
  • The compiler must make the decision without seeing the whole picture
  • Solution: all inheritance links are treated as virtual
• C++ compilation model
  • The user takes responsibility for assigning virtual inheritance
  • We consider C++ compilers with whole program information
Distribution of Object Size
Definition: object size is the total number of compiler-generated fields in the layout of objects of a certain class.
Cost of Using Separate Compilation Over C++ Compilation Model
Elimination of Transitive Virtual Inheritance
• A preliminary step to more sophisticated techniques
• Can be done in any compilation model
[Figure: hierarchy of V, B, and A before and after the elimination; the virtual edge marked "this edge is transitive!" is removed]
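The elimination step can be sketched as a graph pass. This is a minimal sketch, not the authors' implementation: the `Edge` and `Hierarchy` types and both function names are hypothetical, and the sketch assumes that a virtual edge is removable only when the same base is still reached as a virtual base through some longer path.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One inheritance link: `derived` inherits from `base`.
struct Edge {
    std::string derived;
    std::string base;
    bool is_virtual;
};
using Hierarchy = std::vector<Edge>;

// True if `target` is reachable from `from` as a virtual base by a path
// that does not use edge number `skip`.
bool reachable_as_virtual(const Hierarchy& h, std::size_t skip,
                          const std::string& from, const std::string& target) {
    std::set<std::string> seen{from};
    std::vector<std::string> stack{from};
    while (!stack.empty()) {
        const std::string cur = stack.back();
        stack.pop_back();
        for (std::size_t i = 0; i < h.size(); ++i) {
            if (i == skip || h[i].derived != cur) continue;
            if (h[i].base == target && h[i].is_virtual) return true;
            if (seen.insert(h[i].base).second) stack.push_back(h[i].base);
        }
    }
    return false;
}

// Drops every virtual edge that is implied by a longer path.
Hierarchy eliminate_transitive_virtual(const Hierarchy& h) {
    Hierarchy kept;
    for (std::size_t i = 0; i < h.size(); ++i) {
        const bool transitive =
            h[i].is_virtual &&
            reachable_as_virtual(h, i, h[i].derived, h[i].base);
        if (!transitive) kept.push_back(h[i]);
    }
    return kept;
}

void demo_elimination() {
    // B inherits virtually from V; A inherits from B and virtually from V.
    Hierarchy h = {{"B", "V", true}, {"A", "B", false}, {"A", "V", true}};
    assert(eliminate_transitive_virtual(h).size() == 2);
}
```

In `demo_elimination` the edge from A to V is dropped because V is already a virtual base of A through B. A plain diamond, in which no edge is implied by another path, is left unchanged.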
The Efficacy
Definition: the efficacy of an optimization technique for a certain class is the relative reduction in the object size of that class due to application of the technique.
Definition: accumulative efficacy = (x,y) means that x% of classes experience at least y% reduction in their object size.
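Both definitions translate directly into a short computation. The sketch below is illustrative only: the `ClassSize` type, the function names, and the sample sizes used in the demo are hypothetical and are not data from the study.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Object size of one class before and after applying a technique.
struct ClassSize {
    int before;
    int after;
};

// Relative reduction in object size, in percent.
double efficacy_percent(const ClassSize& c) {
    assert(c.after >= 0 && c.before >= c.after);
    if (c.before == 0) return 0.0;   // assumption: an empty layout saves nothing
    return 100.0 * (c.before - c.after) / c.before;
}

// The x of an accumulative efficacy (x, y): the percentage of classes
// whose object size shrinks by at least y percent.
double share_with_reduction_at_least(const std::vector<ClassSize>& classes,
                                     double y) {
    if (classes.empty()) return 0.0;
    std::size_t hits = 0;
    for (const ClassSize& c : classes)
        if (efficacy_percent(c) >= y) ++hits;
    return 100.0 * hits / classes.size();
}

void demo_efficacy() {
    // Made-up sizes with efficacies of 50%, 0%, 25%, and 50%.
    std::vector<ClassSize> classes = {{10, 5}, {4, 4}, {8, 6}, {2, 1}};
    assert(share_with_reduction_at_least(classes, 50.0) == 50.0);
    assert(share_with_reduction_at_least(classes, 25.0) == 75.0);
}
```

For the made-up sample, the pairs (50%, 50%) and (75%, 25%) are both valid accumulative efficacies, which shows that a single technique is described by a whole curve of (x, y) points rather than by one number.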
Efficacy of Elimination of Transitive Virtual Inheritance
• Eliminates 4.1% of inheritance links
• Reduces the fraction of virtual inheritance links from 35.2% to 28.6%
• Accumulative efficacy = (8%,8%)
Inlining of Virtual Bases
• Inlining: lay out a virtual base inside a child, thus eliminating at least one VBPTR.
  • Has the potential of saving a VPTR.
• A virtual base can be inlined into several children, as long as the shared inheritance semantics is preserved.
• Not without whole program analysis! Must examine descendants!
• Can we inline X into Y?
  • No! But we have to see Z to understand why:
  • Due to the repeated inheritance semantics of C++, class Z has two Y objects in it.
  • If Y had X inlined into it, there would be two copies of X in Z, which contradicts the C++ semantics.
[Figure: hierarchy of classes X, Y, W, and Z; X is a virtual base of Y]
Inlining Techniques
• Devirtualization of single virtual inheritance
  • V is inlined into E
• Simple inlining
  • Devirtualization + inline into one child
  • V is inlined into E and either A, B, C, or D
• Aggressive inlining
  • Find a maximally independent set of children to inline into
  • Classes are independent if they don’t share a descendant
  • V is inlined into E, either A or B, and either C or D
[Figure: hierarchy with virtual base V, classes A, B, C, D, and E, and classes F and G]
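The selection step of aggressive inlining can be sketched with a greedy pass. This is one simple way to obtain a maximal independent set and is not claimed to be the procedure used in the study. The `Descendants` type and function names are hypothetical, and the hierarchy in the demo is an assumption about the lost figure: A and B share descendant F, C and D share descendant G, and E shares none.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps each child of the virtual base to the set of its descendants.
using Descendants = std::map<std::string, std::set<std::string>>;

// Two children are dependent if some class descends from both of them.
bool share_descendant(const Descendants& d, const std::string& a,
                      const std::string& b) {
    auto ia = d.find(a);
    auto ib = d.find(b);
    if (ia == d.end() || ib == d.end()) return false;
    for (const std::string& x : ia->second)
        if (ib->second.count(x) != 0) return true;
    return false;
}

// Greedy choice: take each child that is independent of all chosen ones.
// The result is maximal (no child can be added), not necessarily maximum.
std::vector<std::string> pick_inline_targets(
        const std::vector<std::string>& children, const Descendants& d) {
    std::vector<std::string> chosen;
    for (const std::string& c : children) {
        bool independent = true;
        for (const std::string& p : chosen)
            if (share_descendant(d, c, p)) { independent = false; break; }
        if (independent) chosen.push_back(c);
    }
    return chosen;
}

void demo_aggressive_inlining() {
    // Assumed hierarchy: F descends from A and B, G descends from C and D.
    Descendants d = {{"A", {"F"}}, {"B", {"F"}}, {"C", {"G"}}, {"D", {"G"}}};
    std::vector<std::string> picked =
        pick_inline_targets({"A", "B", "C", "D", "E"}, d);
    assert((picked == std::vector<std::string>{"A", "C", "E"}));
}
```

With the assumed hierarchy the greedy pass picks A, C, and E, which matches the slide's "E, either A or B, and either C or D". A different visiting order would pick B or D instead.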
Efficacy of Inlining Techniques Simple Inlining vs. Aggressive Inlining
Bidirectional Object Layout
• Idea: use both ascending and descending memory addresses for object layout
• One VPTR can be saved in a marriage of a “positive” and a “negative” class
• C has mixed directionality
[Figure: the standard layout of C (A, B, C) next to its bidirectional layout (A-, B+, C)]
Bidirectional Layout of Virtual Functions Table
• The virtual function table must also have a directionality.
  • Positive classes: entries 0, 1, 2, …
  • Negative classes: entries -1, -2, …
[Figure: virtual table of C (bases A- and B+) with entries -3 to 4, showing A’s virtual table, B’s virtual table, and the functions introduced in C]
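A bidirectional table can be mimicked in C++ with a pointer into the middle of an array. This is a simulation, not compiler output: the names `slot`, `kTable`, `kVptr`, and `call_virtual` are hypothetical, and the split of entries (-3 to -1 for the negative class A, 0 to 2 for the positive class B, 3 and 4 for functions introduced in C) is an assumption about the lost figure.

```cpp
#include <cassert>

using Fn = int (*)();

// Stand-in for a virtual function; it returns the index of its own entry.
template <int Index>
int slot() { return Index; }

// One table serves both married bases. Assumed split of the entries:
// A- uses -3..-1, B+ uses 0..2, functions introduced in C use 3..4.
const Fn kTable[8] = {
    slot<-3>, slot<-2>, slot<-1>,   // negative direction
    slot<0>,  slot<1>,  slot<2>,    // positive direction
    slot<3>,  slot<4>,              // appended by the derived class
};

// The single shared VPTR points at entry 0, in the middle of the table.
const Fn* const kVptr = kTable + 3;

// Dispatch through the shared VPTR; negative indices are legal here
// because the pointer refers to an element inside the array.
int call_virtual(const Fn* vptr, int index) { return vptr[index](); }

void demo_bidirectional_table() {
    assert(call_virtual(kVptr, -1) == -1);   // a function of the negative class
    assert(call_virtual(kVptr, 0) == 0);     // a function of the positive class
    assert(call_virtual(kVptr, 4) == 4);     // a function introduced in C
}
```

Because the negative class only ever indexes downward and the positive class only upward, both can grow without colliding, which is what lets the married pair share one VPTR.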
The Theorem of Marriage
• The BIG question: how to assign directionality to classes to maximize savings?
  • Whole program analysis: various algorithms and heuristics are possible
  • Separate compilation: assign directionality at random! (actually, use a good hash function)
• The theorem of marriage: with random assignments, a class that has n roots will enjoy an expected saving of at least $\lfloor n/2 \rfloor / 2 \approx n/4$. In other words, about half of all root classes will eventually find a mate.
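The bound can be checked numerically for small n. The sketch assumes that each root is independently positive or negative with probability 1/2, and that the number of marriages (one saved VPTR each) equals the smaller of the positive and negative counts; both assumptions and the function names are illustrative and not taken from the slides.

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>

// Exact expected number of marriages among n roots, averaging
// min(#positive, #negative) over all 2^n equally likely assignments.
double expected_marriages(int n) {
    assert(n >= 0 && n <= 20);   // keeps the enumeration small
    const unsigned long assignments = 1ul << n;
    unsigned long total = 0;
    for (unsigned long mask = 0; mask < assignments; ++mask) {
        const int positive = static_cast<int>(std::bitset<32>(mask).count());
        total += static_cast<unsigned long>(std::min(positive, n - positive));
    }
    return static_cast<double>(total) / static_cast<double>(assignments);
}

// The lower bound of the theorem: floor(n/2) / 2.
double marriage_lower_bound(int n) { return (n / 2) / 2.0; }

void demo_marriage_bound() {
    assert(expected_marriages(2) == 0.5);    // +-, -+ marry; ++, -- do not
    assert(expected_marriages(4) == 1.25);
    for (int n = 1; n <= 12; ++n)
        assert(expected_marriages(n) >= marriage_lower_bound(n));
}
```

The bound itself follows from pairing the roots arbitrarily into $\lfloor n/2 \rfloor$ pairs: each pair has opposite directionality with probability 1/2, so the expected number of marriageable pairs is at least $\lfloor n/2 \rfloor / 2$.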
Marriages of Non-Virtual and Virtual Bases
• Once classes A and B are married in C, they remain married in all of C’s descendants
• However, a marriage of virtual bases cannot be permanent.
  • V1 and V2 are married in A
  • V2 and V3 are married in B
  • What happens in C?
• Each class marries its virtual bases independently of what its ancestors did
• Theorem: if there are n virtual base classes, then the number of marriages is $n/2 - O(\sqrt{n})$
  • That’s the expectation for the separate compilation model
[Figure: virtual bases V1+, V2-, and V3+ with classes A, B, and C]
Bidirectional Layout Efficacy
[Figure panels: "Separate compilation without inessential VBPTRs" and "C++ compilation model with inessential VBPTRs"]
• Applied after Aggressive Inlining
• Big objects have 20% of their size occupied by VPTRs
• 5% savings for big objects: a quarter of the VPTRs, as predicted
• (30%,30%)
• The number of VPTRs and VBPTRs is about the same
• 15-20% savings for big objects: almost a half of the VPTRs, as predicted
• (60%,18%)
Hermaphrodite Bidirectional Object Layout
• Bidirectional layout drawback: two base classes with the same directionality will never be married
• Hermaphroditing: a directed (hermaphrodite) class has two types of instances: “positive” and “negative”
• Two hermaphrodite classes can always be married
Efficacy of Hermaphrodite Bidirectional Layout
[Figure panels: "C++ compilation model with inessential VBPTRs" and "Separate compilation without inessential VBPTRs"]
• (33%,33%)
• Applied after Aggressive Inlining
• (50%,25%)
• Makes savings for all classes of size 2 and more!
Packing VBPTRs
• Observation: objects are laid out consecutively in memory
• Motivation: in large objects, VBPTRs occupy 80-90% of the object size
• Idea: instead of using full-blown pointers to virtual base sub-objects, use offsets
• Assumption: machine word = 4 bytes
  • Small objects (under size 1K): an offset to a sub-object can be stored in one byte = 4 offsets in a word
  • Larger objects (under size 0.25MB): an offset can be stored in 2 bytes = 2 offsets in a word
• A class can reuse empty “slots” in non-virtual bases
• Cannot reuse empty slots in virtual base sub-objects
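The size limits are consistent with offsets counted in 4-byte words: one byte distinguishes 256 words (1 KB) and two bytes distinguish 65536 words (0.25 MB). That reading is an assumption, as are the names `pack4` and `byte_offset`; the sketch below only illustrates packing four one-byte offsets into one 32-bit word.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kWordBytes = 4;   // machine word size assumed on the slide

// One-byte slots reach 256 words = 1 KB; two-byte slots reach 0.25 MB.
static_assert(256 * kWordBytes == 1024, "one-byte offsets cover 1 KB");
static_assert(65536 * kWordBytes == 256 * 1024, "two-byte offsets cover 0.25 MB");

// Packs four word-granular offsets into a single 32-bit word, slot 0 lowest.
std::uint32_t pack4(const std::array<std::uint8_t, 4>& word_offsets) {
    std::uint32_t packed = 0;
    for (std::size_t slot = 0; slot < 4; ++slot)
        packed |= static_cast<std::uint32_t>(word_offsets[slot]) << (8 * slot);
    return packed;
}

// Recovers the byte offset of the sub-object stored in the given slot.
std::size_t byte_offset(std::uint32_t packed, std::size_t slot) {
    assert(slot < 4);
    const std::uint32_t words = (packed >> (8 * slot)) & 0xFFu;
    return words * kWordBytes;
}

void demo_packing() {
    const std::array<std::uint8_t, 4> offsets = {1, 2, 3, 255};
    const std::uint32_t packed = pack4(offsets);
    assert(byte_offset(packed, 0) == 4);
    assert(byte_offset(packed, 3) == 1020);   // last word of a 1 KB object
}
```

Four VBPTRs that would occupy four words as full pointers fit into one word here, at the price of a shift, a mask, and an addition to the object's base address on every up-cast.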
Efficacy of Packing in C++ Compilation Model
[Figure panels: "2 slots in word" and "4 slots in word"]
• Expected savings:
  • 4 slots in word: saves 60-70% in object size
  • 2 slots in word: saves 40-45% in object size
Summary
• Evils of virtual inheritance and different compilation models.
• Distribution of object size votes against separate compilation.
• Optimization techniques:
  • Inlining (not so trivial).
  • Aggressive inlining.
  • Bidirectional layout.
  • Architectural support.
  • Hermaphroditing idea
    • Secures savings for all sizes of objects
    • Possible run-time costs for checking the instance directionality
  • Packing VBPTRs
• The bottom line: savings in the range of 40% can be achieved for all object sizes!
Future Research
• Dynamic measurements
• More optimization techniques
• Efficient implementation of Java interfaces