
Scalably Verifiable Dynamic Power Management
Opeoluwa (Luwa) Matthews, Meng Zhang, and Daniel J. Sorin, Duke University


Presentation Transcript


  1. Scalably Verifiable Dynamic Power Management
  Opeoluwa (Luwa) Matthews, Meng Zhang, and Daniel J. Sorin
  Duke University
  HPCA-20, Orlando, FL, February 19, 2014

  2. Executive Summary
  • Dynamic Power Management (DPM) is used to improve power efficiency at several levels of the computing stack
    - within a multicore chip, across servers in a datacenter, etc.
  • Deploying a DPM scheme is risky if it is not fully verified
    - it is difficult to verify a scheme for large-scale systems
  • Our contribution: Fractal DPM
    - a framework for designing scalably verifiable DPM
    - we implement Fractal DPM on a 2-chip (16-core) system
    - experimental evaluation on a real system

  3. Dynamic Power Management
  • DPM aims to:
    - dynamically allocate power to computing resources (e.g., cores, chips, servers)
    - attain the best performance at a given power budget
    - achieve the lowest power consumption for a desired level of performance
  [diagram: n cores in a CMP send "Request Power" messages to a DPM controller, which grants or denies each request]

  4. Dynamic Power Management
  • DPM aims to:
    - dynamically allocate power to computing resources (e.g., cores, chips, servers)
    - attain the best performance at a given power budget
    - achieve the lowest power consumption for a desired level of performance
  [diagram: n machines in a datacenter send "Request Power" messages to a DPM controller, which grants or denies each request]

  5. Case for Dynamic Power Management
  • Chips have hit the power-density ceiling [Hennessy and Patterson, Computer Architecture]

  6. Case for Dynamic Power Management
  • Datacenters consume increasing amounts of power
  • Reducing cloud electricity consumption by half would save as much electricity as the UK consumes [hp.com]
  [figure: map comparing cloud electricity consumption to that of the UK]

  7. Case for Verifiable DPM
  • DPM can greatly improve energy efficiency
  • Unverified DPM could:
    - overshoot the power budget → system damage
    - underutilize resources
    - deadlock
  • We want formal verification:
    - prove correctness for all possible DPM allocations
    - guarantee the safety of the DPM scheme

  8. Why Scalably Verifiable DPM is Hard
  • CMPs and datacenters have many computing resources
  • n computing resources (CRs) × S power states per CR → S^n possible DPM states
  • Checking S^n states is intractable for typical values of S and n
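  As an illustrative calculation (the value of n is chosen here only for illustration; S = 5 matches the five power states introduced on slide 18):

  ```latex
  S = 5,\quad n = 64 \;\;\Rightarrow\;\; S^{n} = 5^{64} \approx 5.4 \times 10^{44} \text{ possible DPM states}
  ```

  and that is before counting the transient states introduced by in-flight protocol messages.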

  9. Hypothesis and Assumptions
  • Problem: verification of existing DPM protocols is unscalable
  • Hypothesis: we can design DPM such that it is scalably verifiable
    - key idea: design DPM to be amenable to inductive verification
    - change the architecture to match the verification methodology
  • Approach:
    - abstract away the details of computing resources
    - use abstract power states, e.g., "Medium" power
    - focus on the decision policy, not the mechanism (e.g., DVFS)

  10. Outline
  • Background and Motivation
  • Fractal DPM
  • Experimental Evaluation
  • Conclusions

  11. Our Inductive Approach
  • Induction is key to scalable verification → we can prove DPM correct for an arbitrary number of computing resources
  • Base case: a small-scale system with a few CRs is correct
    - small enough that it is easy to verify with existing tools
  • Inductive step: the system behaves the same at every scale → fractal behavior
  • Prove the base case + prove the inductive step → the DPM scheme is correct for any number of CRs
  • The approach is more general than DPM; it is borrowed from prior work on coherence protocols [Zhang 2010]
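  Spelled out, this is ordinary mathematical induction, with the twist that the inductive step is discharged once (via the observational-equivalence checks on the next slides) rather than once per system size. Writing P(n) for "the DPM protocol with n CRs satisfies its invariant":

  ```latex
  P(n_0) \;\wedge\; \big(\forall n \ge n_0 :\; P(n) \Rightarrow P(n+1)\big) \;\Longrightarrow\; \forall n \ge n_0 :\; P(n)
  ```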

  12. Attaining Scalable Verification: base case of induction
  • CRs request power from the DPM controller (DPM-C)
  • The DPM controller grants or denies each request
  • Few states → easy to verify that DPM is correct
  • Note: this base case is over-simplified for now
  [diagram: two CRs send "Request Power" to a DPM-C, which replies Grant/Deny]

  13. Attaining Scalable Verification: base case of induction
  • Refine our base case a little
  • The base case needs all types of structures: CR, DPM-C, and Root DPM-C
  [diagram: a small tree containing a Root DPM-C, a DPM-C, and CRs]

  14. Attaining Scalable Verification: inductive step
  • Behavior must be fractal
  [diagram: a DPM-C exchanging Request Power and Grant/Deny messages with two CRs]

  15. Attaining Scalable Verification: inductive step
  • We can scale the system by replacing a CR with a larger subsystem
  • {DPM-C + 2 CRs} "behaves just like" 1 CR → observational equivalence
  [diagram: a DPM-C whose child is itself a DPM-C with two CRs; both levels exchange Request Power and Grant/Deny messages]

  16. Attaining Scalable Verification: observational equivalence
  • Inductive step: two observational equivalences
  • 1) "Looking-down" equivalence check
  [diagram: component A in the small system and component A' in the large system; observed externally from P1, A and A' behave the same]

  17. Attaining Scalable Verification: observational equivalence
  • Inductive step: two observational equivalences
  • 2) "Looking-up" equivalence check
  [diagram: component B in the small system and component B' in the large system; observed externally from P2, B and B' behave the same]
  • By induction, the protocol is correct at all scales

  18. Fractal DPM Design
  • A CR can be in 1 of 5 power states: L(ow), LM, M(edium), MH, and H(igh)
  • A DPM controller's state is <Left Child State>:<Right Child State>
  • A parent DPM controller "sees" a child DPM controller in its averaged state, e.g., Avg(H:L) = M
  [diagram: a controller in state H:L, with children in H and L, is seen as M by its parent]

  19. Fractal DPM Design
  • A CR can be in 1 of 5 power states: L(ow), LM, M(edium), MH, and H(igh)
  • A DPM controller's state is <Left Child State>:<Right Child State>
  • A parent DPM controller "sees" a child DPM controller in its averaged state, e.g., Avg(MH:H) = H
  [diagram: a controller in state MH:H, with children in MH and H, is seen as H by its parent, which is therefore in state H:L]
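  A minimal sketch of the child-state averaging these two slides illustrate, assuming the states are indexed 0..4 and that averages round up; the rounding rule is not stated in the transcript, but rounding up reproduces both quoted examples.

  ```python
  # Sketch of how a parent DPM controller "sees" a child controller as one averaged
  # power state. Assumption: states are indexed L=0 .. H=4 and ties round up, which
  # matches both examples from the slides: Avg(H, L) = M and Avg(MH, H) = H.
  import math

  STATES = ["L", "LM", "M", "MH", "H"]           # the five abstract power states
  INDEX = {s: i for i, s in enumerate(STATES)}   # L=0 ... H=4

  def avg_state(left: str, right: str) -> str:
      """Averaged state a parent controller observes for a child controller."""
      return STATES[math.ceil((INDEX[left] + INDEX[right]) / 2)]

  assert avg_state("H", "L") == "M"              # slide 18
  assert avg_state("MH", "H") == "H"             # slide 19
  ```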

  20. Fractal DPM Design: fractal invariant
  • Fractal design + inductive proof → the invariant must also be fractal
    - the invariant must apply at every scale of the system
    - it is not OK to specify, e.g., "<75% of all CRs are in the H state"
  • Our fractal invariant: the children of a DPM controller are never both in H
  [diagram: H:L is legal; two ILLEGAL examples show a controller in state H:H, once with two CRs in H and once where one child is itself a controller in H:MH and is therefore seen as H]
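  The invariant itself is a purely local check on the two child states a controller observes; a minimal sketch, building on the state encoding above:

  ```python
  def satisfies_fractal_invariant(left_seen: str, right_seen: str) -> bool:
      """Children of a DPM controller must not both be (seen as) H."""
      return not (left_seen == "H" and right_seen == "H")

  assert satisfies_fractal_invariant("H", "L")          # H:L is legal
  assert not satisfies_fractal_invariant("H", "H")      # H:H is illegal
  # Note: a child controller in H:MH is *seen* as H (Avg(H, MH) = H), so pairing it
  # with another H child is also illegal, as the slide's second example shows.
  ```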

  21. Translating the Fractal Invariant to a System-Wide Cap
  • We need a fractal invariant for the fractal design
  • But most people are interested in system-wide invariants
  • We prove (proof not shown) that our fractal invariant implies a system-wide power cap
  • The maximum power for n CRs is (n-1)·MH + H
    - i.e., (n-1) CRs in state MH and one CR in state H
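  In symbols, with P_s denoting the power consumed in state s, the cap stated on this slide is:

  ```latex
  P_{\max}(n) \;=\; (n-1)\,P_{MH} \;+\; P_{H}
  ```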

  22. Fractal DPM Design: illustration
  • A CR requests MH
  [diagram: a CR sends "Req. MH" to its controller, which is in state H:L; the root sees that controller as M and is in state M:L]

  23. Fractal DPM Design: illustration
  • The CR requests MH
  • Granting the request does not change the controller's averaged state: Avg(H:L) = Avg(MH:L) = M
  • The request is granted; it does not violate the invariant
  • The controller blocks while waiting for the ack
  [diagram: the controller, now MH:L, sends "Grant MH" to the CR and blocks; the root remains M:L]

  24. Fractal DPM Design: illustration
  • The CR sets its new state
  • The CR sends an ack to the controller
  [diagram: the CR, now in MH, sends an ack to the blocked controller (state MH:L); the root remains M:L]

  25. Fractal DPM Design: illustration
  • The controller unblocks
  [diagram: the controller has unblocked; the root remains M:L]

  26. Fractal DPM Design: illustration
  • A computing resource requests H
  [diagram: a CR in L sends "Req. H" to its controller (state L:L); the root is also in state L:L]

  27. Fractal DPM Design: illustration
  • The CR requests H from its controller
  • The controller defers the request to its parent
    - the new request is for M (not H) because Avg(H:L) = M
  [diagram: the controller (L:L) forwards "Req. M" to the root (L:L) while the CR's "Req. H" is pending]

  28. Fractal DPM Design: illustration
  • The root grants the request to the controller and blocks
  [diagram: the root, now M:L, sends "Grant M" to the controller and blocks; the controller is still L:L]

  29. Fractal DPM Design: illustration
  • The controller grants the request to the CR and blocks
  [diagram: the controller, now H:L, sends "Grant H" to the CR and blocks; the root (M:L) remains blocked]

  30. Fractal DPM Design: illustration
  • Acks percolate up the tree from the CR
  [diagram: the CR, now in H, sends an ack to its controller (H:L); both controllers are still blocked]

  31. Fractal DPM Design: illustration
  • Acks percolate up the tree from the CR
  • Controllers unblock upon receiving an ack
  [diagram: the controller (H:L) unblocks and sends an ack to the root (M:L), which is still blocked]

  32. Fractal DPM Design: illustration
  • Acks percolate up the tree from the CR
  • Controllers unblock upon receiving an ack
  [diagram: final state with root M:L, controller H:L, CRs in H and L; nothing is blocked]
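  A simplified, single-process sketch of the request / grant / ack flow illustrated on slides 22-32, reusing avg_state() and satisfies_fractal_invariant() from the earlier sketches. This is a reconstruction for illustration only: the real design exchanges messages between separate controllers, can deny requests, and must handle concurrent requests.

  ```python
  class DPMController:
      def __init__(self, parent=None, side=None):
          self.parent, self.side = parent, side           # which child we are of our parent
          self.child_state = {"left": "L", "right": "L"}  # observed states of the two children
          self.blocked = False

      def seen(self) -> str:
          """The averaged state this controller's own parent observes (slide 18)."""
          return avg_state(self.child_state["left"], self.child_state["right"])

      def request(self, side: str, wanted: str) -> bool:
          """A child on `side` asks to move to `wanted`; return True if granted."""
          other = self.child_state["right" if side == "left" else "left"]
          if not satisfies_fractal_invariant(wanted, other):
              return False                                # deny: would break the fractal invariant
          new_seen = avg_state(wanted, other)
          if self.parent is not None and new_seen != self.seen():
              # Our averaged state would change, so defer upward first, re-expressed as a
              # request for the new averaged state (slide 27: "Req. H" becomes "Req. M").
              if not self.parent.request(self.side, new_seen):
                  return False
          self.blocked = True                             # grant, then block until the ack
          return True

      def ack(self, side: str, new_state: str):
          """The child reports it actually switched states; acks percolate up (slides 30-32)."""
          self.child_state[side] = new_state
          if self.parent is not None:
              self.parent.ack(self.side, self.seen())     # harmless no-op if nothing changed upstream
          self.blocked = False

  # Usage mirroring slides 26-32: root -> controller -> CR, where the CR asks for H.
  root = DPMController()
  ctrl = DPMController(parent=root, side="left")
  if ctrl.request("left", "H"):                           # deferred to the root as a request for M
      ctrl.ack("left", "H")                               # the CR switches; acks percolate up
  assert ctrl.seen() == "M" and root.child_state["left"] == "M" and not root.blocked
  ```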

  33. Verification Procedure
  • Use a model checker to verify the base case
    - we use the well-known, automated Murphi model checker
  • Use the same model checker to verify the observational equivalences
    - we use a prior aggregation method for the equivalence check (Park, TCAD 2000)
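  As a toy stand-in for what "few states, easy to verify" means here (the actual base case is checked with Murphi and also covers the protocol's transient message states), exhaustively enumerating the stable power-state configurations of a two-CR base case is trivial:

  ```python
  # Brute-force enumeration of all 5^2 = 25 stable configurations of one controller
  # with two CR children, checking the fractal invariant from the earlier sketch.
  from itertools import product

  illegal = [(l, r) for l, r in product(STATES, repeat=2)
             if not satisfies_fractal_invariant(l, r)]
  print(f"{len(illegal)} of {len(STATES)**2} configurations violate the invariant: {illegal}")
  # -> 1 of 25 configurations violate the invariant: [('H', 'H')]
  ```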

  34. Outline
  • Background and Motivation
  • Fractal DPM
  • Experimental Evaluation
  • Conclusions

  35. Experimental Evaluation: fractal inefficiency, the cost of fractal behavior
  • Our fractal invariant implies a system-wide cap greater than n·MH
  • Some allocations that stay under the system-wide power cap are nevertheless rejected because they violate the fractal invariant
  [diagram: two 4-CR allocations with the same total power (4·MH); one satisfies the fractal invariant, the other violates it]
  • Such situations are few and do not significantly degrade performance

  36. Experimental Evaluation: system model
  • Implemented Fractal DPM on a 16-core Linux system with 2 sockets
    - 2 cores act as one CR
    - controllers communicate through UDP across sockets
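  A minimal sketch of the kind of UDP request/response exchange this slide describes; the message format, host, and port are illustrative assumptions, not the paper's actual implementation.

  ```python
  # Hypothetical CR-side helper: send a power request to the DPM controller over UDP
  # and interpret the reply as a grant or a deny.
  import socket

  DPM_CONTROLLER = ("127.0.0.1", 9009)     # hypothetical controller address and port

  def request_power_state(wanted: str, timeout_s: float = 0.01) -> bool:
      """Send 'REQ <state>' to the DPM controller; return True on 'GRANT'."""
      with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
          sock.settimeout(timeout_s)
          sock.sendto(f"REQ {wanted}".encode(), DPM_CONTROLLER)
          try:
              reply, _ = sock.recvfrom(64)
          except socket.timeout:
              return False                 # treat a lost datagram as a deny
          return reply.decode().startswith("GRANT")
  ```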

  37. Experimental Evaluation: experimental setup
  [table: mappings from power modes to DVFS settings]
  • The entire system is plugged into a power meter (Watts up?)

  38. Experimental Evaluation: comparison schemes
  • Static scheme:
    - no DPM; set all CRs to the same power state (e.g., MH)
    - trivially correct, but poor energy efficiency
  • Oracle DPM:
    - allocates power for optimal energy efficiency (ED²) under the budget
    - the oracle does not scale and is unimplementable
  • Optimized Fractal DPM (OptFractal):
    - CRs re-request a lower power state when denied
    - no change to the Fractal DPM decision algorithm
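  A sketch of the OptFractal client-side behavior described above ("re-request a lower power state when denied"), reusing STATES, INDEX, and the request_power_state() helper from the earlier sketches; the exact step-down policy is an assumption.

  ```python
  from typing import Optional

  def opt_fractal_request(wanted: str) -> Optional[str]:
      """Return the highest power state at or below `wanted` that the controller grants."""
      for state in reversed(STATES[:INDEX[wanted] + 1]):   # try `wanted`, then each lower state
          if request_power_state(state):
              return state
      return None                                          # every request was denied
  ```

  Per slide 40, this re-requesting narrows the gap to Oracle DPM's ED² savings from about 8% to about 2%.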

  39. Experimental Evaluation
  • Benchmarks: details in the paper

  40. Results: compared to the static scheme
  • OptFractal DPM is within 2% of Oracle DPM's ED² savings
  • Fractal DPM is within 8% of Oracle DPM's ED² savings

  41. Results: response latency
  • Most power requests are serviced within 1 ms
    - a UDP packet round trip takes ~0.6 ms

  42. Conclusions
  • We show how a scalably verifiable DPM scheme can be built
  • Fractal behavior enables one-time verification that holds at all scales
  • The entire verification is fully automated in a model checker
  • Fractal DPM achieves energy efficiency close to that of an optimal allocator

  43. Scalably Verifiable Dynamic Power Management
  Opeoluwa (Luwa) Matthews, Meng Zhang, and Daniel J. Sorin
  Duke University
  HPCA-20, Orlando, FL, February 19, 2014

  44. Benchmarks
  • Important: experiments must stress all Fractal DPM power modes
  • Each CR repeatedly launches bodytrack (from the PARSEC benchmark suite) under a range of predetermined duty cycles
  • Under a given duty cycle, CRs request the power state that minimizes ED²
  • Why rely on duty cycles rather than just different benchmarks or phases?
    - stressing all Fractal DPM power modes means stressing all DVFS states
    - without varying the duty cycle, the optimal ED² was always at the highest frequency for all benchmarks we tried [Dhiman 2008]
  • We use a predetermined set of duty cycles for launching bodytrack that maps directly to the set of power modes (DVFS states)
  • An experiment consists of running a sequence of bodytrack jobs, randomly selecting duty cycles from the predetermined set
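  For reference, ED² here is the standard energy-delay-squared metric (lower is better); for a job with energy E, runtime D, and average power P:

  ```latex
  ED^2 \;=\; E \cdot D^2 \;=\; P \cdot D^3 \qquad (\text{since } E = P \cdot D)
  ```

  Minimizing ED² under each duty cycle is how a CR picks which power state to request.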

  45. Results
  • Millions of time steps simulated
  • For each time step, we compute the system performance and its loss relative to Oracle DPM: % system perf loss = (perf_Oracle − perf_FractalDPM) / perf_Oracle × 100%
  [plot: CDF of % system performance loss]

  46. Results
  • Millions of time steps simulated
  • On 72.6% of time steps, Fractal DPM performs identically to Oracle DPM
  [plot: CDF of % system performance loss relative to Oracle DPM]

  47. Results
  • Millions of time steps simulated
  • On 99.9% of time steps, Fractal DPM is less than 20% off from Oracle DPM
  [plot: CDF of % system performance loss relative to Oracle DPM]

  48. Results
  • Millions of time steps simulated
  • In the worst case, Fractal DPM is less than 36.4% off from Oracle DPM
  [plot: CDF of % system performance loss relative to Oracle DPM]
