Concurrent VLSI Architecture Group
Adaptive Backpressure: Efficient Buffer Management for On-Chip Networks
Daniel U. Becker, Nan Jiang, George Michelogiannakis, William J. Dally
Stanford University
ICCD 2012, 9/30/12–10/3/12, Montreal, Canada
Overview
• Input buffer sharing is attractive in NoCs
  • Improves area and power efficiency
  • But facilitates spread of congestion
• Adaptive Backpressure mitigates performance degradation by avoiding unproductive use of buffer space in the presence of congestion
• Avoid downsides of buffer sharing while maintaining benefits in benign case
Becker, Jiang, Michelogiannakis, Dally: Adaptive Backpressure
Dynamic Buffer Management
• Buffer space is an expensive resource in NoCs
  • 30-35% of network power (MIT RAW, UT TRIPS)
• Dynamic management increases utilization by sharing buffer space among multiple VCs
  • Optimize use of expensive buffer resources
  • Decrease incremental cost of VCs
• Improved area and power efficiency
  • 25% more throughput or 34% less power [Nicopoulos’06]
Buffer Monopolization
• Blocked flits from congested VC accumulate in buffer
• Effective buffer size reduced for other VCs
• Performance degradation (latency / throughput)
• Congestion spreads across VCs (flows / apps / VMs / …)
[Figure labels: VC 0, VC 1]
Adaptive Backpressure
Goal:
• Avoid unproductive use of buffer space
• But allow sharing when beneficial
Approach:
• Match arrival and departure rate for each VC by regulating credit availability (backpressure)
• Derive quota from credit round trip times
Quota Motivation (1)
[Figure: two timing diagrams between Router 0 and Router 1 over time; labels: T_crt,0, idle cycle, credit stall]
Without congestion, full throughput requires T_crt,0 credits
Insufficient credit supply causes idle cycle downstream
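The relationship between credit count and throughput on this slide can be checked with a small credit-loop model. This is only a sketch under simplifying assumptions: the sender always has a flit ready, sends at most one flit per cycle, and every consumed credit returns exactly `t_crt` cycles later (no congestion). The name `link_throughput` and its parameters are hypothetical and not taken from the talk.

```python
from collections import deque


def link_throughput(num_credits: int, t_crt: int, cycles: int = 10_000) -> float:
    """Simulate a credit-limited sender and return flits sent per cycle.

    Assumption: each consumed credit comes back exactly t_crt cycles later.
    """
    credits = num_credits
    returning = deque()  # cycles at which consumed credits become usable again
    sent = 0
    for now in range(cycles):
        # Collect every credit whose round trip has completed.
        while returning and returning[0] <= now:
            returning.popleft()
            credits += 1
        if credits > 0:
            credits -= 1
            sent += 1
            returning.append(now + t_crt)
        # Otherwise this cycle is a credit stall (idle cycle downstream).
    return sent / cycles


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, link_throughput(n, t_crt=8))
```

In this model, supplying at least `t_crt` credits sustains one flit per cycle, while fewer credits leave idle cycles in proportion to the shortfall.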
Quota Motivation (2)
[Figure: two timing diagrams between Router 0 and Router 1 over time; labels: T_crt,0 + T_stall, congestion stall, queuing stall, credit stall, excess flits, excess drained]
Congestion stall causes unproductive buffer occupancy
Matching stalls avoids unproductive buffer occupancy
Quota Heuristic
• Track credit RTT for each output VC
• RTT = RTTmin ⇒ set quota to RTTmin
  • No downstream congestion
  • Allow one flit in each cycle of RTT interval
• RTT > RTTmin ⇒ subtract difference from RTTmin
  • Each congestion and queuing stall adds to RTT
  • Allow one credit stall per downstream stall
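The heuristic above reduces to a single subtraction, sketched here in Python. The function name `quota_from_rtt` and the `floor` parameter are hypothetical. The slide does not say what happens when the excess RTT exceeds RTTmin, so the clamp to a minimum of one credit is an assumption made so that the VC can still make progress.

```python
def quota_from_rtt(rtt: int, rtt_min: int, floor: int = 1) -> int:
    """Per-VC quota: RTTmin minus the excess of the observed credit RTT."""
    if rtt < rtt_min:
        raise ValueError("observed RTT cannot be smaller than RTTmin")
    excess = rtt - rtt_min    # congestion and queuing stalls seen downstream
    quota = rtt_min - excess  # equals RTTmin when there is no congestion
    # ASSUMPTION: clamp to a floor; the slide does not specify a lower bound.
    return max(quota, floor)


if __name__ == "__main__":
    print(quota_from_rtt(8, 8))   # no congestion
    print(quota_from_rtt(11, 8))  # three downstream stall cycles
    print(quota_from_rtt(20, 8))  # heavy congestion, clamped by the assumed floor
```

With RTTmin = 8, an observed RTT of 11 gives a quota of 5, that is, one credit stall for each of the three downstream stall cycles.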
Implementation
• Network design determines RTTmin for each link
• Track RTT for single in-flight credit per VC
  • Update quota value upon return
• Switch allocator masks all VCs that exceed quota
• Simple extension to existing flow control logic
  • No additional signaling required
  • < 5% overhead for 16x64b buffer with 4 VCs
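A software sketch of the per-VC bookkeeping described on this slide follows. It is not the authors' hardware design: the class `OutputVcState`, the helper `request_mask`, and all field names are hypothetical. It assumes that credits for one VC return in the order their flits were sent, so the tracked credit can be identified by counting returns, and it assumes a quota floor of one, which the slides do not specify.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OutputVcState:
    rtt_min: int                         # fixed by the network design for this link
    quota: int = field(init=False)       # starts at RTTmin (no congestion observed)
    in_use: int = 0                      # downstream slots currently held by this VC
    probe_sent_at: Optional[int] = None  # send time of the single tracked flit
    credits_until_probe: int = 0         # returns left until the tracked credit

    def __post_init__(self) -> None:
        self.quota = self.rtt_min

    def may_send(self) -> bool:
        # The switch allocator masks the VC once it has used up its quota.
        return self.in_use < self.quota

    def on_flit_sent(self, now: int) -> None:
        self.in_use += 1
        if self.probe_sent_at is None:
            # Track only one in-flight credit per VC at a time.
            self.probe_sent_at = now
            # ASSUMPTION: FIFO credit return, so older flits' credits come first.
            self.credits_until_probe = self.in_use

    def on_credit_returned(self, now: int) -> None:
        self.in_use -= 1
        if self.probe_sent_at is None:
            return
        self.credits_until_probe -= 1
        if self.credits_until_probe == 0:
            excess = max(0, now - self.probe_sent_at - self.rtt_min)
            # ASSUMPTION: quota never drops below one credit.
            self.quota = max(self.rtt_min - excess, 1)
            self.probe_sent_at = None


def request_mask(vcs: List[OutputVcState]) -> List[bool]:
    """True for each VC that may still request the switch."""
    return [vc.may_send() for vc in vcs]
```

The only state added per VC in this sketch is a timestamp, a small counter, and the quota register, which is in the spirit of the slide's claim that the scheme is a simple extension of existing credit logic with no additional signaling.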
Evaluation Methodology
• BookSim 2.0
• 8x8 2D mesh, 64-bit channels, DOR
• 16-slot input buffers, 4 VCs
• Combined VC and switch allocation
• Synthetic traffic and application benchmarks
• Compare ABP to unrestricted sharing
Network Stability (1)
• For adversarial traffic, throughput in mesh is unstable at high load
  • Traffic merging causes starvation
  • Tree saturation causes widespread congestion
• ABP improves stability
  • Throttles sources that inject at very high rate
  • Efficient buffer use reduces tree saturation
  • Faster recovery from transient congestion
Network Stability (2)
[Plot: tornado traffic; annotation: 6.3x]
Network Stability (3)
[Plot: foreground traffic at 50% injection rate; annotations: 3.3x, -13% saturation rate]
Performance Isolation (1)
• Inject two classes of traffic into network
  • Shared buffer space, separate VCs
• Sharing causes interference between classes
• ABP reduces interference
  • Contains effects of congestion within a class
  • Better isolation between workloads, VMs, …
Performance Isolation (2)
[Plots: uniform random foreground traffic; panels: uniform random background traffic, hotspot background traffic; annotations: -33%, -38%]
Performance Isolation (3)
[Plot: 50% uniform random background traffic; annotations: -31%, w/o background]
Application Performance (1)
• 8 interleaved memory controllers
• Heterogeneous network nodes
• Array of stream processors
• Streaming data to memory
• Modeled as hotspot traffic
• In-order general purpose core
• Running at 4x network frequency
• Executing PARSEC benchmarks
• Modeled using Netrace [Hestness’11]
• Common network
• Disjoint VC ranges
• Shared buffer space
Application Performance (2)
[Plot: 12.5% injection rate for streaming traffic; annotations: -31%, w/o background]
Conclusions
• Sharing improves buffer utilization, but can lead to undesired interference effects
• Adaptive Backpressure regulates credit flow to avoid unproductive use of shared buffer space
• Mitigates performance degradation in presence of adversarial traffic
• But maintains key benefits of buffer sharing under benign conditions
Thank you for your attention!