Hardware Support for Efficient Transactional and Supervised Memory Systems Jayaram Bobba Dissertation Defense 1/14/2010 Overview: 1) Research Area 2) Challenges/Contributions 3) Big Picture Dept. of Computer Sciences University of Wisconsin–Madison
Research Area Device Scaling • Emergence of CMPs • Hard to Program Abundant Transistors Hardware Support to Improve Productivity Empty/full-bits Transactional Memory MemTracker Supervised Systems Deterministic Memory Wisconsin Multifacet Project
Challenges • Supervised Systems • Sequential-consistency only • Ad hoc hardware • Lack of formalism • Transactional Memory • “Most transactions are small” • Self-fulfilling • Limited applicability • Contribution 1: Supervised Memory • TSOdata, Safe Supervision • Contribution 2: TokenTM • Contribution 3: StealthTest
Big Picture • Applications • Software Tools: StealthTest • Supervised Systems: TokenTM • Supervised Memory: TSOdata and Safe Supervision • Hardware
Outline • Motivation • Supervised Memory • TokenTM • StealthTest • Conclusion
On Software Productivity • More Productivity: Yannis’s “Law”: programmer productivity doubles every 6 years • More Performance: Moore’s Law • Moore’s Law will continue. But Yannis’s Law?
What has changed? • “A Fundamental Turn towards Concurrency in Software” [Herb Sutter, 2005] • Moore’s Law -> Better Computers • Sequential Computers (Past) • Memory wall, Power wall etc. • Attack of the killer CMPs* (Current) • How to program? Expose parallelism to software • Parallel programs hard to write * Adapted from “Attack of the killer micros” by Eugene Brooks
Who solves the productivity issue? • Why, of course, hardware architects! • Long live Moore’s Law • Spend some transistors on productivity issues • Architectural Support for Enhancing Productivity • for language features • for bug avoidance • for debugging • for performance feedback • and so on…
Seriously, Who should solve it? • HW Architects or SW Engineers? • ‘software crisis’ in the past too… • Why HW architects? • More bang for the buck (Economic) • Software/IT (1,152 billion) vs Hardware (138 billion) [Wen Mei Hwu, Micro-39 Keynote] • SW cannot do it alone (Technical) • Decades of automatic parallelization efforts • Virtual Memory, Tagged Memory for LISP-like languages • “We must now reconsider the balance of hardware and software and to provide more specialized function in hardware than we have previously, in order to drastically simplify the programming process” (Edward A. Feustel, IEEE TOC, July 1973, in support of Tagged Memory)
Outline • Motivation • Supervised Memory • Background/Motivation • Explore relaxed supervised systems • Define Supervised Memory • Propose formal models • TokenTM • StealthTest • Conclusion
Why Supervised Systems? • Synchronization • Hardware TM systems • Empty/Full-bits • [Berry et al 2006] Graph processing algorithms: 4-processor MTA > 64K BG/L • Controlled non-determinism • Deterministic/Interleaving Constrained Multiprocessing • Debugging • Log-based architectures • Safety • Heap checkers, Bounds checkers • Language Features • Hardware-assisted Garbage Collection
What are Supervised Systems? • out-of-band metadata per data block • monitor & control (supervise) memory accesses to data • execute handlers on specific metadata states • pure software possible, but inefficient • shadow memory, e.g., Valgrind: mean slowdown 22X [Nethercote et al., VEE 2007]
State-of-the-Art • Expect Sequentially-Consistent (SC) hardware • Most hardware is not • Ad hoc • Whither primitives? • Informal treatment of memory consistency • Ambiguous/Incorrect
Contributions • Expect Sequentially-Consistent (SC) hardware • Most hardware is not => Explore relaxed supervised systems • Ad hoc • Whither primitives? => Define Supervised Memory • Informal treatment of memory consistency • Ambiguous/Incorrect => Propose formal memory models
Explore relaxed supervised systems TSO-lite: A TSO-compliant system [Figure: a processor (PC, registers r1, r2, r3) with a store buffer in front of memory, running the program ST 1, [A]; LD [B], r1; ST 2, [C]; LD [C], r3]
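A minimal sketch of the store-buffer behavior that a TSO-compliant core such as TSO-lite exhibits. The class and function names (`TSOLiteCore`, `dekker_demo`) are hypothetical and chosen for illustration; this is a software toy, not the design from the dissertation. It shows the two standard TSO effects: a load bypasses older buffered stores to other addresses, and a load to a buffered address is forwarded the buffered value.

```python
from collections import deque


class TSOLiteCore:
    """Toy core: a FIFO store buffer sits between the core and shared memory."""

    def __init__(self, memory):
        self.memory = memory          # shared dict: address -> value
        self.store_buffer = deque()   # (address, value) pairs in program order

    def store(self, addr, value):
        # Stores retire into the buffer; memory is not updated yet.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # Forward from the youngest buffered store to the same address, if any.
        for buffered_addr, value in reversed(self.store_buffer):
            if buffered_addr == addr:
                return value
        # Otherwise read memory, bypassing older stores still in the buffer.
        return self.memory.get(addr, 0)

    def drain_all(self):
        # Buffered stores reach memory in FIFO order (store-store order kept).
        while self.store_buffer:
            addr, value = self.store_buffer.popleft()
            self.memory[addr] = value


def dekker_demo():
    """Each core stores to one flag, then loads the other core's flag."""
    memory = {}
    core0, core1 = TSOLiteCore(memory), TSOLiteCore(memory)
    core0.store("A", 1)
    core1.store("B", 1)
    r0 = core0.load("B")   # core1's store is still buffered
    r1 = core1.load("A")   # core0's store is still buffered
    core0.drain_all()
    core1.drain_all()
    return r0, r1, dict(memory)
```

Calling `dekker_demo()` yields both loads reading 0 even though both stores eventually reach memory, an outcome that sequential consistency forbids and TSO allows.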
Explore relaxed supervised systems Empty/Full-Bits on TSO-lite [Figure: the TSO-lite processor, store buffer, and memory running ST 1, [A]; LD [B], r1; ST 2, [C]; LD [C], r3, with memory words tagged Full or Empty and LD/ST raising exceptions on certain metadata states] • I1: No load bypass • I2: Late exceptions
Explore relaxed supervised systems Deterministic Shared Memory (DMP) [Devietti et al., ASPLOS 2009] “depending upon the consistency model of the underlying hardware, threads must perform a memory fence at the edge of a quantum” • Insert a fence after the last operation in the quantum • Insert a fence before the first shared operation in the quantum • I3: Reordered metabit-reads
Define Supervised Memory What is Supervised Memory? • Each memory location A has • data (A.d) • metadata (A.m) • New operations • Supervised Load (sLD A) • Supervised Store (sST A) • Jump on reading special metadata • Hardware exception (optionally)
Define Supervised Memory Supervised Operations
sLD A =>
Start: atomic {
    curm = Val[Rd A.m]                      // Read metadata
    nextm = NEXT(Load, curm)                // Check software-specified FSM
    if (nextm == EXCEPTION) then jump to Handler
    Rd A.d                                  // Read data
    if (nextm != curm) then Wr A.m, nextm   // Update metadata
}
Handler: …
Define Supervised Memory Using Supervised Memory • Software assigns semantics to metadata • Metastates stored as metadata • E.g., Initialized, Uninitialized • Metastate transition function (NEXT) • Use supervised operations to monitor/control data operations • E.g., catch read access to uninitialized data
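The supervised-load steps and the Initialized/Uninitialized example can be sketched in software as follows. This is an illustrative model only: the `SupervisedMemory` class, the `SupervisionFault` exception, and the contents of the `NEXT` table are assumptions, and the supervised store is written by analogy with the supervised load (read metadata, write data, update metadata). A lock stands in for the hardware `atomic { }` region.

```python
import threading

EXCEPTION = "EXCEPTION"


class SupervisionFault(Exception):
    """Stands in for the jump to the software handler."""


# Hypothetical software-specified FSM for a heap checker:
# (operation, current metastate) -> next metastate or EXCEPTION.
NEXT = {
    ("load", "Uninitialized"): EXCEPTION,
    ("load", "Initialized"): "Initialized",
    ("store", "Uninitialized"): "Initialized",
    ("store", "Initialized"): "Initialized",
}


class SupervisedMemory:
    def __init__(self):
        self.data = {}                 # A.d for each location A
        self.meta = {}                 # A.m for each location A
        self._atomic = threading.Lock()

    def s_load(self, addr):
        with self._atomic:
            curm = self.meta.get(addr, "Uninitialized")   # read metadata
            nextm = NEXT[("load", curm)]                  # consult the FSM
            if nextm == EXCEPTION:
                raise SupervisionFault(f"load of {addr!r} in state {curm}")
            value = self.data.get(addr)                   # read data
            if nextm != curm:
                self.meta[addr] = nextm                   # update metadata
            return value

    def s_store(self, addr, value):
        with self._atomic:
            curm = self.meta.get(addr, "Uninitialized")   # read metadata
            nextm = NEXT[("store", curm)]
            if nextm == EXCEPTION:
                raise SupervisionFault(f"store to {addr!r} in state {curm}")
            self.data[addr] = value                       # write data
            if nextm != curm:
                self.meta[addr] = nextm                   # update metadata
```

With this table, a supervised load of a location that was never stored to raises `SupervisionFault`, while a store followed by a load succeeds and leaves the location in the `Initialized` metastate.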
Propose formal models TSO Axioms [Hangal et al., ISCA 2004]
Propose formal models TSO Axioms [Hangal et al., ISCA 2004] Reordering Axioms • Rd A -> Rd B • Rd A -> Wr B • Wr A -> Wr B • Wr A -> Rd B (relaxed: allows store buffers)
Propose formal models TSOall: A Consistency Model for Supervised Memory • TSO axioms applied to all accesses, data and metadata • + (Simple) Like TSO • - (Slow) Prohibits optimizations • Thread: sST A -> [Rd A.m, Wr A.d, Wr A.m]; sLD B -> [Rd B.m, Rd B.d] => Store buffers ineffective • Tension: Ease of Reasoning vs Performance
Propose formal models Blast from the Past [Adve and Hill, ISCA 1990] • Ease of Reasoning (SC) vs Performance (RC) • Observation: • Simple programs rely only on certain SC orders • Ignore non-essential orders. Still appears as SC • Challenge: Simple? Non-essential orders? • Solution: Data-race-freedom • For data-race-free programs, RC = SC
Propose formal models Safe Supervision: Motivation • Ease of Reasoning (TSOall) vs Performance (?) • Observation: • Simple supervised programs rely only on certain TSOall orders • Ignore non-essential orders. Still appears as TSOall • Challenge: Simple? Non-essential orders? • Solution: Safe Supervision • For safely supervised programs, ? = TSOall
Safe Supervision • metadata accesses to location A not used to order operations to a different location B • Most uses of supervision are safely supervised. E.g., • Heap Checker: Initialized/Uninitialized values • Transactional Memory: Conflict Detection information • Initially, A.m = Empty, B.d = 0 • Thread 1: B.d = 1; A.m = Full • Thread 2: while (A.m == Empty); read B.d
Propose formal models TSOdata: Fast Yet Simple • Reordering Axioms • Thread: sST A -> [Rd A.m, Wr A.d, Wr A.m]; sLD B -> [Rd B.m, Rd B.d] • Store buffers can be used • For safely supervised programs, TSOdata = TSOall
TSOdata on OpenSPARC T2 • Goal: Explore low-level issues on a real design • Late Exceptions with deferred handlers • Dump store buffer entries on exception • Enhance store buffer to carry Virtual Address (VA) • ~200 cycles to read out 4 entries • Disable store buffer bypassing for supervised loads • Low space overhead for adding metabits (~4%)
Supervised Memory Summary • Expects Sequentially-Consistent (SC) hardware • Most hardware is not => Explore relaxed memory systems • Ad hoc • Whither primitives? => Define Supervised Memory • Informal treatment of memory consistency • Ambiguous/Incorrect => Propose formal memory models
Outline • Motivation • Supervised Memory • TokenTM [ISCA 2008] • StealthTest • Conclusion
TokenTM Summary • Current Hardware TMs • Most Transactions Small & Short Running • Penalize large/long transactions • Too restrictive for wide-spread TM use? • Hypothesis • Must Support Efficient Large/Long Transactions As Well • Is such an HTM even possible? • Yes! TokenTM • 1. LogTM’s Log to buffer unbounded values • 2. Transactional Tokens for unbounded conflict detection • Conflict state in memory metabits
Transactional Tokens • Challenge: How to efficiently track Read/Write sets? • Token Coherence [Martin03] • Read/Write sets for cache coherence • Solution: Transactional Tokens • T tokens per memory block • At least one token to read, all T tokens to write (token conflict detection) • Token Metadata <c0, c1, …, ci, …> where 0 ≤ ci ≤ T is the count of tokens held by the thread with TID i.
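The token rule (at least one token to read, all T tokens to write) can be illustrated with a small software model. The `TokenBlock` class and its method names are hypothetical, and the sketch ignores where the metadata lives and how conflicts are resolved; it only shows how the per-thread counts ci detect read/write conflicts on one memory block.

```python
class TokenBlock:
    """Token metadata for one memory block: per-thread counts c_i, sum <= T."""

    def __init__(self, total_tokens):
        self.total = total_tokens   # T tokens per memory block
        self.held = {}              # TID -> tokens held by that thread (c_i)

    def free_tokens(self):
        return self.total - sum(self.held.values())

    def acquire_read(self, tid):
        """A transactional read needs at least one token."""
        if self.held.get(tid, 0) >= 1:
            return True
        if self.free_tokens() >= 1:
            self.held[tid] = 1
            return True
        return False                # conflict: a writer holds all T tokens

    def acquire_write(self, tid):
        """A transactional write needs all T tokens."""
        mine = self.held.get(tid, 0)
        if self.free_tokens() + mine == self.total:
            self.held[tid] = self.total
            return True
        return False                # conflict: another thread holds a token

    def release(self, tid):
        """Return this thread's tokens at commit or abort."""
        self.held.pop(tid, None)
```

For example, with T = 4, two threads can both read a block, but neither can upgrade to a write until the other releases its token.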
Tokens and Supervised Memory • Challenge: Where to store unbounded, globally accessible Token Metadata? • Solution • Supervised Memory’s Metadata • Piggyback on existing Virtual Memory and Cache Coherence mechanisms
TokenTM: a Large-Transaction TM • New Conflict Detection Mechanism • Transactional Tokens in Supervised Memory • Token Coherence [Martin03] at different level • Version Management • Save old/new values for unbounded Write set • LogTM [Moore06] undo log
Outline • Motivation • Supervised Memory • TokenTM • StealthTest [PACT 2009] • Conclusion
StealthTest Summary (1/2) The Problem: fork Overhead • Software testing hard • Multithreading makes harder • Online software testing can help • Run tests on deployed software, e.g., Delta Execution for patch testing [Tucek et al., ASPLOS 2009] • Non-intrusive mechanisms • fork (existing) • [Table: fork evaluated against Low Overhead, Functionally Hidden, Good Scaling]
StealthTest Summary (2/2) Solution: TM for testing • Leverage Transactional Memory for online testing • Non-Intrusive? • transaction { test(); abort } • Fast TM mechanisms • Demonstrate two uses • Delta Execution • In vivo Testing • [Table: StealthTest evaluated against Low Overhead, Functionally Hidden, Good Scaling]
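The `transaction { test(); abort }` idiom can be imitated in plain software to show the intended effect: the test observes live state, its writes never become visible, and only its result escapes. The sketch below is an assumption-laden stand-in, not StealthTest itself. The function name `run_stealth_test` is hypothetical, and the deep copy plays the role of the transaction's isolated version; a hardware TM would avoid that copying cost, which is the point of the proposal.

```python
import copy


def run_stealth_test(state, test):
    """Run test(state) in isolation and discard every write it makes.

    The private copy models the transaction's speculative version. Dropping
    it at the end models the abort. Only the return value (or the exception
    raised by the test) is communicated back to the caller.
    """
    shadow = copy.deepcopy(state)   # begin: isolate the test's writes
    try:
        result = test(shadow)       # test body runs against the private copy
    except Exception as exc:        # a failing test must not disturb the app
        result = exc
    return result                   # abort: shadow is dropped, state untouched
```

A usage example: `run_stealth_test({"hits": [1, 2]}, lambda s: s["hits"].append(3) or len(s["hits"]))` returns the length seen inside the test while the caller's dictionary keeps its original two entries.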
Outline • Motivation • Supervised Memory • TokenTM • StealthTest • Online Software Testing • E.g., Patch Validation • StealthTest: TM for online testing • Delta Execution using StealthTest • In vivo Testing using StealthTest (Optionally) • Conclusion
Online Patch Validation • Bug fixes can introduce more bugs • Patches must be validated • Online Validation [Nagaraja et al., OSDI 2004] • Increased resource usage • Lockstep execution • [Figure labels: Input, Production, Testing, Output, Diff]
Delta Execution [Tucek et al., ASPLOS 2009] • Online Patch Validation • Most patches are small • Patched and un-patched executions similar • Delta Execution • Run together except when they differ
Delta Execution using fork [Timeline figure. Labels: Merged execution, Compute Δ data, Isolate Δ data, fork, Patched execution, Unpatched execution, Testing, Production, Install Δ data, Time]
Multi-threading and fork • ‘Park’ all other threads • Stop all threads to get a consistent memory snapshot • [Timeline figure. Labels: Merged execution, Compute Δ data, Isolate Δ data, fork, Patched execution, Unpatched execution, Testing, Production, Install Δ data, Time]
fork • Poor Performance: ~9.8ms for split, ~106ms for merge [Tucek et al., ASPLOS 2009] • Poor Scalability: Web-server response rate reduced by 43% • Want an alternate mechanism
Delta Execution using StealthTest • Delta Execution needs: isolate patched execution, introspect patched execution, monitor delta data access • With fork: execute on child process, page diffing, mprotect • With StealthTest on Transactional Memory: transaction{…}, Version Management (tracks new/old values), Conflict Detection (monitors accesses)
StealthTest Interface • Delta Execution needs: isolate patched execution, introspect patched execution, monitor delta data access • StealthTest calls: ST_begin_transaction, ST_abort_transaction, ST_get_old, ST_get_new, ST_protect_set, ST_protect_clear • Built on Transactional Memory: transaction{…}, Version Management (tracks new/old values), Conflict Detection (monitors accesses)
Requirements from TM • Strong Atomicity [Martin et al., CAL 2006]: transactions isolated from non-transactions => Test transactions isolated from application code • Flexible Conflict Resolution: can abort transactions if necessary => Abort tests if they block application • Communication from within transactions => Expose result of a test