Specialization Tools and Techniques for Systematic Optimization of System Software


Presentation Transcript


  1. Specialization Tools and Techniques for Systematic Optimization of System Software Dylan McNamee, Jonathan Walpole, Calton Pu, Crispin Cowan, Charles Krasic, Ashvin Goel, Perry Wagle Presented by: Rizal Arryadi

  2. Background • Much complexity in OS code arises from the requirement to handle all possible system states. • There is a conflict between correctness across all applications and high performance for individual applications • The micro-kernel approach to OS design illustrates the conflict between performance, modularity, and portability

  3. Approaches • Write general-purpose code, but optimize it for a few anticipated common cases • Problem: “common” cases vary • Explicit customization: incorporate customizability into the system structure • SPIN, Exokernel, Synthesis, Mach, etc. • Problems: • Burden on system tuners • Limited access to global system state → optimization opportunities are reduced • Inferred customization (specialization): • Automatically derived optimizations • Create optimized code for common cases • Restricts code, rather than extending it

  4. Process Specialization • Also called partial evaluation • Consider a program P, taking 2 arguments S and D, producing a result R: run P(S,D) = R. A specialization of P wrt S is a program Ps such that: run Ps(D) = run P(S,D) = R • Creates optimized code for common cases • The burden on the system tuner is reduced • But requires more complex analysis of the system • Tedious, error-prone • The result is more complex, harder to debug & maintain
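As a concrete illustration (not from the paper), here is a minimal C sketch of partial evaluation: a generic power function P(S,D), where S is the exponent and D is the base, specialized with respect to S = 3 yields a residual function Ps(D) in which the loop has been evaluated away.

#include <stdio.h>

/* General-purpose code: P(S, D), where S = exponent and D = base. */
int power(int base, int exponent) {
    int result = 1;
    for (int i = 0; i < exponent; i++)
        result *= base;
    return result;
}

/* Residual code Ps(D): power() specialized wrt exponent == 3.
 * The loop and the static argument have been evaluated away. */
int power_3(int base) {
    return base * base * base;
}

int main(void) {
    /* run P(S,D) = run Ps(D) = R */
    printf("%d %d\n", power(5, 3), power_3(5));   /* prints: 125 125 */
    return 0;
}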

  5. Objective of this paper • Provides a toolkit that reduces the manual work involved in specialization • Evaluates the toolkit’s effectiveness on operating system components

  6. Fundamentals of Specialization • Specialization Predicates • States of the system known in advance • Partial Evaluation • Given specialization predicates, separate the static parts from the dynamic parts • Guards • Enable/disable specialized code when specialization predicate terms are modified

  7. Three Kinds of Specialization • Static specialization • Predicates are known at compile time • Partial evaluation can be applied before execution • Dynamic specialization • Defers specialization until runtime • The values of spec predicates are not established until some point during execution • Once established, they hold for the remainder of execution • Optimistic specialization • Spec predicates only hold for bounded time intervals (aka quasi-invariants)

  8. Steps to specialize a system • Identify specialization predicates • Use the developer’s knowledge • Locate code that can be optimized • Estimate the net performance improvement • Generate specialized code • Use partial evaluation to automate it • Check when specialization predicates hold • Needed for dynamic and optimistic specialization • Locate all places that can cause predicates to change, and “guard” them • Replace specialized code • Replugging: replacing one version of specialized code with another • Not surprisingly, this raises synchronization issues

  9. Tempo: specialized code generator • A partial evaluator for C programs • Challenges in binding-time analysis • Side effects in C • Pointers & aliases • Structures & arrays • Functions that modify global state • Ignored in conventional approaches but captured well by Tempo • Tempo features: • Use sensitivity: accurate treatment of “nonliftable values” • Flow sensitivity: a variable can be static at some program points and dynamic at others • Context sensitivity: assigns a specific binding-time description for each calling context • Return sensitivity: the return value of a side-effecting function can be static • Static specialization • Compile-time and run-time specialization

  10. Enabling/Disabling Specialized Code: TypeGuard • Place guards at the sites of modifications to specialization predicate terms • To be efficient, the predicate terms must be used more frequently than they are modified • Problems: • May include too many sites • Different variables with the same name (locally defined) • Aliases → pass by reference • Spec predicate terms are often not simple scalars, but fields of dynamically allocated structures

  11. Enabling/Disabling Specialized Code: TypeGuard • Type-based static analysis • Two-phase approach: • Phase 1 • Statically identify structure types whose fields are spec predicate terms • Extend them to include a spec predicate ID (SPID) • Identify statements that update a guarded field • Insert the guarding code • Phase 2 • Set the SPID dynamically when specialized code is enabled • Clear it when specialized code is disabled • Check it when a spec predicate term is modified

  12. Enabling/Disabling Specialized Code: TypeGuard • Guarding example: current->uid = bar would become: if (current->SPID != NULL) current->SPID->update_uid(bar); else current->uid = bar;
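To make the transformation above concrete, here is a hypothetical sketch of what the phase-1 rewrite could look like; the struct layout, the spid_t type, and the set_uid helper are illustrative assumptions, not code from the paper or from Linux.

/* Hypothetical sketch of the TypeGuard transformation; type and function
 * names other than uid, SPID, and update_uid are illustrative assumptions. */

typedef struct spid {
    void (*update_uid)(int new_uid);   /* guard action: performs the write and
                                          disables/respecializes the code */
} spid_t;

struct task_struct {
    int     uid;
    /* ... other fields ... */
    spid_t *SPID;   /* added in phase 1: non-NULL while specialized code that
                       depends on this structure's fields is enabled */
};

/* Each statically identified update of a guarded field is rewritten so the
 * write is intercepted whenever a specialization predicate depends on it. */
void set_uid(struct task_struct *current, int bar) {
    if (current->SPID != NULL)
        current->SPID->update_uid(bar);   /* guarded write: may trigger replugging */
    else
        current->uid = bar;               /* unguarded fast path */
}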

  13. Enabling/Disabling Specialized Code: MemGuard • TypeGuard issues warnings about alias-producing operations, which must be validated manually • MemGuard guarantees complete guard coverage • Uses memory-protection hardware to write-protect pages that contain spec predicate terms • The write-fault handler checks whether the address being written is a spec predicate term; if so, it performs the guarded write, which may trigger replugging • Uses HW memory protection → guaranteed to capture all writes to spec predicate terms • Drawbacks: • Coarse granularity • High overheads
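A minimal user-level sketch of the MemGuard idea, assuming POSIX mprotect and a SIGSEGV handler; the real mechanism lives in the kernel with its own write-fault handler, and the names here (memguard_protect, guarded_addr) are illustrative.

#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void     *guarded_page;    /* page containing spec predicate terms */
static uintptr_t guarded_addr;    /* address of one guarded term on that page */

/* Every write to the protected page traps here.  If the faulting address is a
 * spec predicate term, the guard action (e.g. disabling specialized code and
 * replugging) would run; either way the write is then allowed to proceed. */
static void write_fault(int sig, siginfo_t *info, void *ctx) {
    (void)sig; (void)ctx;
    if ((uintptr_t)info->si_addr == guarded_addr) {
        /* spec predicate term modified: invalidate/replug specialized code here */
    }
    /* Unprotect so the faulting write can retry; a real implementation would
     * re-protect afterwards, which is one source of MemGuard's high overhead. */
    mprotect(guarded_page, (size_t)sysconf(_SC_PAGESIZE), PROT_READ | PROT_WRITE);
}

/* Write-protect the whole page holding a spec predicate term: this is the
 * coarse-granularity drawback, since unrelated writes to the page also fault. */
void memguard_protect(void *page, void *term) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = write_fault;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    guarded_page = page;
    guarded_addr = (uintptr_t)term;
    mprotect(page, (size_t)sysconf(_SC_PAGESIZE), PROT_READ);
}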

  14. Replugger • Replaces the current code with code that is consistent with the new state of the specialization predicate • Problem: concurrent replugging and function invocation • Solution: synchronization using locks • Factors affecting the design: • Concurrent invocation of the same repluggable function • Can be avoided by associating functions with threads • Concurrency between replugging and invocation • With concurrent invocation (i.e. the counting replugger): • Use a counter to detect whether threads are executing the repluggable function • Use a stub function, holding_tank, to avoid invocation while the function is being replugged • Without concurrent invocation: • Use a boolean flag
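Below is a minimal sketch of a counting replugger using C11 atomics; the structure (an invocation counter, a holding_tank stub installed during replugging) follows the description above, but the specific names and the spin-based waiting are illustrative assumptions, not the paper's implementation.

#include <stdatomic.h>

typedef int (*filter_fn)(const char *pkt, int len);

static _Atomic(filter_fn) current_fn;       /* the repluggable function pointer */
static atomic_int         in_flight;        /* threads currently executing it */
static atomic_int         replug_pending;   /* set while a replug is in progress */

/* Installed in current_fn while replugging: invocations that arrive mid-replug
 * land here, give up their slot in the count so the replugger can finish,
 * wait, and then call through to the newly installed version. */
static int holding_tank(const char *pkt, int len) {
    atomic_fetch_sub(&in_flight, 1);
    while (atomic_load(&replug_pending))
        ;                                   /* spin; a real system would block */
    atomic_fetch_add(&in_flight, 1);        /* restore the caller's count */
    return atomic_load(&current_fn)(pkt, len);
}

/* Invocation path: count ourselves in, then call through the pointer. */
int invoke_filter(const char *pkt, int len) {
    atomic_fetch_add(&in_flight, 1);
    int r = atomic_load(&current_fn)(pkt, len);
    atomic_fetch_sub(&in_flight, 1);
    return r;
}

/* Replug path: divert new callers to the holding tank, wait for in-flight
 * invocations of the old version to drain, then install the new version
 * (this also serves to install the very first version). */
void replug(filter_fn new_fn) {
    atomic_store(&replug_pending, 1);
    atomic_store(&current_fn, holding_tank);
    while (atomic_load(&in_flight) > 0)
        ;                                   /* wait for executing threads */
    atomic_store(&current_fn, new_fn);
    atomic_store(&replug_pending, 0);
}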

  15. Experiments: Specializing RPC • Applying Tempo to Sun RPC’s marshaling process • Static specialization: the spec predicates are available when the stubs are generated • Specialization opportunities: • Some fields in the data structures used by the marshaling code have values known at stub-generation time • Some details: • Encoding/decoding dispatch elimination • Buffer-overflow-check elimination • Exit-status propagation • Marshaling loop unrolling
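The following toy example (not the actual Sun RPC/XDR code, and ignoring XDR's big-endian encoding) illustrates three of these opportunities at once: when the stub generator knows the message is exactly three 32-bit integers, the generic dispatch-and-loop marshaler reduces to straight-line residual code with the dispatch, the per-field overflow checks, and the loop evaluated away.

#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Generic marshaler: interprets a type descriptor at run time. */
enum xdr_type { XDR_INT32 /* , XDR_STRING, ... */ };

int marshal_generic(char *buf, size_t buflen, const enum xdr_type *desc,
                    int nfields, const int32_t *vals) {
    size_t off = 0;
    for (int i = 0; i < nfields; i++) {        /* dispatch + bounds check per field */
        if (desc[i] != XDR_INT32) return -1;   /* encoding dispatch */
        if (off + 4 > buflen) return -1;       /* buffer-overflow check */
        memcpy(buf + off, &vals[i], 4);
        off += 4;
    }
    return (int)off;
}

/* Residual code after specialization wrt desc = {XDR_INT32, XDR_INT32, XDR_INT32}:
 * the dispatch, per-field overflow checks, and the marshaling loop are gone. */
int marshal_3ints(char *buf, size_t buflen, const int32_t *vals) {
    if (buflen < 12) return -1;                /* single hoisted check */
    memcpy(buf + 0, &vals[0], 4);
    memcpy(buf + 4, &vals[1], 4);
    memcpy(buf + 8, &vals[2], 4);
    return 12;
}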

  16. Experiments: Specializing RPC (Cont.) • Performance

  17. Experiments: Specializing Packet Filters • BSD Packet Filter (BPF): • An interface for selecting packets from a network interface • Specialization opportunities: • BPF is executed many times • To be specialized: the packet interpreter • “Specializing an interpreter wrt a particular program effectively compiles that program” • Either static or dynamic specialization applies • Static: the packet filter program is available in advance • Dynamic: if the packet filter program is presented immediately before execution → specialization overheads are included in the run time

  18. Experiments: Specializing Packet Filters (cont.) • A packet is read and filtered by calling the BPF filter function • Parameters: the packet filter program, a packet, the length of the original packet, and the amount of data present • The BPF interpreter and its recursion (code shown on the original slide)
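As an illustration of what specializing the packet interpreter means, here is a toy sketch (the instruction set and names are invented and much simpler than real BPF): a generic interpreter with the same parameter shape as the BPF filter function, and the residual code obtained by specializing it with respect to a fixed three-instruction filter program.

enum op { LOAD_BYTE, JEQ_OR_REJECT, ACCEPT };
struct insn { enum op op; unsigned k; };

/* Generic interpreter: (program, packet, wire length, amount of data present). */
unsigned filter_interp(const struct insn *pc, const unsigned char *p,
                       unsigned wirelen, unsigned buflen) {
    unsigned acc = 0;
    for (;;) {
        switch (pc->op) {                 /* fetch/dispatch per instruction */
        case LOAD_BYTE:
            if (pc->k >= buflen) return 0;
            acc = p[pc->k];
            break;
        case JEQ_OR_REJECT:
            if (acc != pc->k) return 0;   /* reject the packet */
            break;
        case ACCEPT:
            return wirelen;               /* accept: return capture length */
        }
        pc++;
    }
}

/* Residual code for the fixed program
 *   { {LOAD_BYTE, 0}, {JEQ_OR_REJECT, 0x45}, {ACCEPT, 0} }:
 * the loop, the dispatch, and the program fetches disappear, which is the
 * sense in which specializing the interpreter compiles the filter program. */
unsigned filter_ipv4_hdr(const unsigned char *p, unsigned wirelen,
                         unsigned buflen) {
    if (buflen < 1) return 0;
    return (p[0] == 0x45) ? wirelen : 0;
}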

  19. Experiments: Specializing Packet Filters (Cont.) • Performance • Code size: • Unspecialized: 550 lines • Specialized (6-instruction filter): 366 lines • Specialized (10-instruction filter): 576 lines

  20. Experiments: Specializing Signals • Statement: ret = kill(pid, n) • Specialization opportunity: a process might repeatedly send the same signal to the same destination process → both task_structs won’t change

  21. Experiments: Specializing Signals (cont.) • Optimistic specialization: • If any predicate terms are modified between signals, the specialized code becomes invalid (e.g. the destination process exits) • TypeGuard is used to identify the locations that require guarding
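A hypothetical sketch of what the guarded fast path could look like; the names (sig_spec, send_sig_fast, deliver_to, kill_generic) and the cached fields are illustrative assumptions, not the paper's or Linux's actual code.

/* Quasi-invariants cached after the first kill(pid, n): guards inserted by
 * TypeGuard clear `valid` whenever one of the underlying terms changes
 * (e.g. the destination task_struct goes away when the process exits). */
struct sig_spec {
    int   valid;        /* cleared by guards when a quasi-invariant breaks */
    int   cached_pid;   /* spec predicate: same destination pid */
    int   cached_sig;   /* spec predicate: same signal number */
    void *dest_task;    /* destination task, resolved once */
};

/* Stand-ins for the real general-purpose and specialized delivery paths. */
static int kill_generic(int pid, int sig)  { (void)pid;  (void)sig; return 0; }
static int deliver_to(void *task, int sig) { (void)task; (void)sig; return 0; }

/* Fast path: while the quasi-invariants hold, skip pid lookup, permission
 * checks, and signal-number validation; otherwise fall back to the generic
 * path (which could also respecialize and refill the cache). */
int send_sig_fast(struct sig_spec *spec, int pid, int sig) {
    if (spec->valid && pid == spec->cached_pid && sig == spec->cached_sig)
        return deliver_to(spec->dest_task, sig);   /* specialized residual code */
    return kill_generic(pid, sig);                 /* guard tripped or first call */
}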

  22. Experiments: Specializing Signals (Cont.) • Performance • Application-level impact: 100,000 producer-consumer iterations, buffer size 4 • Unspecialized: 11.9 s • Specialized: 5.6 s • Code size: 4 functions into 1, 59 LOC into 18

  23. Related Work • Multistage programming • Programs that generate other programs • Tempo is: • Automatic • Heterogeneous • Both a static and a run-time generation system • A two- or three-stage system • Aspect-oriented programming: “Different aspects of a system’s behavior tend to have their own natural form, so while one abstraction framework might do a good job of capturing one aspect, it will do a less good job capturing others”
