
Prefetch-Aware Shared-Resource Management for Multi-Core Systems

This research presentation explores the impact of prefetching on shared-resource management techniques in multi-core systems, including fair cache management, fair memory controllers, on-chip interconnect management, and management of multiple shared resources. The goal is to devise general mechanisms for incorporating prefetch requests into these fairness techniques.


Presentation Transcript


  1. Prefetch-Aware Shared-Resource Management for Multi-Core Systems Eiman Ebrahimi* Chang Joo Lee*+ Onur Mutlu‡ Yale N. Patt* * HPS Research Group, The University of Texas at Austin ‡ Computer Architecture Laboratory, Carnegie Mellon University + Intel Corporation, Austin

  2. Background and Problem [System diagram: Cores 0 through N, each with its own prefetcher, share a cache and a memory controller on chip; beyond the chip boundary, off-chip DRAM Banks 0 through K.]

  3. Background and Problem • Understand the impact of prefetching on previously proposed shared resource management techniques

  4. Background and Problem • Understand the impact of prefetching on previously proposed shared resource management techniques • Fair cache management techniques • Fair memory controllers • Fair management of on-chip interconnect • Fair management of multiple shared resources

  5. Background and Problem • Understand the impact of prefetching on previously proposed shared resource management techniques • Fair cache management techniques • Fair memory controllers • Network Fair Queuing (Nesbit et al., MICRO’06) • Parallelism-Aware Batch Scheduling (Mutlu et al., ISCA’08) • Fair management of on-chip interconnect • Fair management of multiple shared resources • Fairness via Source Throttling (Ebrahimi et al., ASPLOS’10)

  6. Background and Problem • Fair memory scheduling technique: Network Fair Queuing (NFQ) • Improves fairness and performance with no prefetching • Significant degradation of performance and fairness in the presence of prefetching [Chart: NFQ results with no prefetching vs. with aggressive stream prefetching.]
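The virtual-finish-time idea behind NFQ-style fair memory scheduling can be sketched as follows. This is a minimal model, not the scheduler from the talk: the function name, the uniform service time, and the share values are illustrative assumptions.

```python
# Minimal sketch of Network-Fair-Queuing-style scheduling: each thread
# has a virtual clock, and a request's virtual finish time is the clock
# plus the service time scaled by the inverse of the thread's bandwidth
# share. Always servicing the earliest virtual finish time bounds how
# far any thread can fall behind its allocated share.

def nfq_schedule(queues, shares, service_time=1.0):
    """queues: {thread: [request, ...]} oldest first.
    shares: {thread: bandwidth share}. Returns the service order."""
    vclock = {t: 0.0 for t in queues}
    order = []
    while any(queues.values()):
        # Virtual finish time of each thread's oldest pending request.
        finish = {t: vclock[t] + service_time / shares[t]
                  for t, q in queues.items() if q}
        t = min(finish, key=finish.get)    # earliest virtual finish time
        order.append((t, queues[t].pop(0)))
        vclock[t] = finish[t]              # advance that thread's clock
    return order

# A thread with twice the bandwidth share is serviced twice as often.
order = nfq_schedule({0: ["a", "b", "c"], 1: ["x", "y", "z"]},
                     {0: 2.0, 1: 1.0})
assert order[:2] == [(0, "a"), (0, "b")]
```

The point of the sketch is only the prioritization rule; a real controller would also account for bank conflicts and row-buffer state.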

  7. Background and Problem • Understanding the impact of prefetching on previously proposed shared resource management techniques • Fair cache management techniques • Fair memory controllers • Fair management of on-chip interconnect • Fair management of multiple shared resources • Goal: Devise general mechanisms for taking into account prefetch requests in fairness techniques

  8. Background and Problem • Prior work addresses inter-application interference caused by prefetches • Hierarchical Prefetcher Aggressiveness Control (Ebrahimi et al., MICRO’09) • Dynamically detects interference caused by prefetches and throttles down overly aggressive prefetchers • Even with controlled prefetching, fairness techniques should be made prefetch-aware
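The throttling loop described above can be sketched roughly as follows. The aggressiveness levels, thresholds, and function name are illustrative assumptions, not values from HPAC; the only idea taken from the slide is that interference feedback throttles an overly aggressive prefetcher down.

```python
# Rough sketch of prefetcher aggressiveness control: global feedback
# (interference with other cores) overrides local feedback (this
# prefetcher's own accuracy) when setting the aggressiveness level.

AGGRESSIVENESS_LEVELS = [0, 1, 2, 4, 8, 16]   # prefetch degree per level

def adjust_prefetcher(level, accuracy, interference):
    """level: index into AGGRESSIVENESS_LEVELS.
    accuracy: fraction of this core's prefetches that were useful.
    interference: True if this core's prefetches delayed other cores."""
    if interference:                    # global feedback: throttle down
        return max(level - 1, 0)
    if accuracy > 0.75:                 # local feedback: prefetches useful
        return min(level + 1, len(AGGRESSIVENESS_LEVELS) - 1)
    if accuracy < 0.40:                 # mostly useless prefetches
        return max(level - 1, 0)
    return level

# Interference overrides even a highly accurate prefetcher.
assert adjust_prefetcher(3, accuracy=0.9, interference=True) == 2
assert adjust_prefetcher(3, accuracy=0.9, interference=False) == 4
```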

  9. Outline • Problem Statement • Motivation for Special Treatment of Prefetches • Prefetch-Aware Shared Resource Management • Evaluation • Conclusion

  10. Parallelism-Aware Batch Scheduling (PAR-BS) [Mutlu & Moscibroda ISCA’08] • Principle 1: Parallelism-awareness • Schedules requests from each thread to different banks back to back • Preserves each thread’s bank parallelism • Principle 2: Request Batching • Marks a fixed number of oldest requests from each thread to form a “batch” • Eliminates starvation & provides fairness [Diagram: per-bank request queues for threads T0–T3 across Bank 0 and Bank 1, with the oldest requests from each thread marked to form a batch.]
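The batching principle can be sketched in a few lines. The marking cap and queue layout here are my own simplifications; PAR-BS additionally ranks threads within a batch to preserve bank-level parallelism, which this sketch omits.

```python
# Simplified sketch of PAR-BS request batching: the oldest MARKING_CAP
# requests of each thread in each bank queue are "marked" to form a
# batch. Marked requests are serviced before unmarked ones, so no
# thread can be starved by a flood of younger requests from others.

MARKING_CAP = 2   # max marked requests per thread per bank (illustrative)

def form_batch(bank_queues):
    """bank_queues: {bank: [(thread, req_id), ...]} oldest first.
    Returns the set of (bank, thread, req_id) tuples in the batch."""
    marked = set()
    for bank, queue in bank_queues.items():
        per_thread = {}
        for thread, req in queue:
            if per_thread.get(thread, 0) < MARKING_CAP:
                marked.add((bank, thread, req))
                per_thread[thread] = per_thread.get(thread, 0) + 1
    return marked

# Thread 1's third request in Bank 0 misses the batch; thread 2's
# request is included even though it arrived last.
batch = form_batch({0: [(1, "r0"), (1, "r1"), (1, "r2"), (2, "r3")]})
assert (0, 1, "r2") not in batch
assert (0, 2, "r3") in batch
```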

  11. Impact of Prefetching on Parallelism-Aware Batch Scheduling • Policy (a): Include prefetches and demands alike when generating a batch • Policy (b): Prefetches are not included alongside demands when generating a batch

  12. Impact of Prefetching on Parallelism-Aware Batch Scheduling [Timeline diagrams contrasting the two policies across two DRAM banks: under Policy (a), mark prefetches in PAR-BS, accurate prefetches are serviced in time, turning later accesses into cache hits and saving stall cycles, while inaccurate prefetches enter the batch alongside demands; under Policy (b), don’t mark prefetches, accurate prefetches are serviced too late, the would-be hits become misses, and stall time grows.]

  13. Impact of Prefetching on Parallelism-Aware Batch Scheduling • Policy (a): Include prefetches and demands alike when generating a batch • Pros: Accurate prefetches will be more timely • Cons: Inaccurate prefetches from one thread can unfairly delay demands and accurate prefetches of others • Policy (b): Prefetches are not included alongside demands when generating a batch • Pros: Inaccurate prefetches cannot unfairly delay demands of other cores • Cons: Accurate prefetches will be less timely • Less performance benefit from prefetching

  14. Outline • Problem Statement • Motivation for Special Treatment of Prefetches • Prefetch-Aware Shared Resource Management • Evaluation • Conclusion

  15. Prefetch-Aware Shared Resource Management • Three key ideas: • Fair memory controllers: Extend underlying prioritization policies to distinguish between prefetches based on prefetch accuracy • Fairness via source-throttling technique: Coordinate core and prefetcher throttling decisions • Demand boosting for memory non-intensive applications

  16. Prefetch-Aware Shared Resource Management • Three key ideas: • Fair memory controllers: Extend underlying prioritization policies to distinguish between prefetches based on prefetch accuracy • Fairness via source-throttling technique: Coordinate core and prefetcher throttling decisions • Demand boosting for memory non-intensive applications

  17. Prefetch-aware PARBS (P-PARBS) [Timeline diagram, repeated from slide 12: Policy (a), mark prefetches in PAR-BS; accurate prefetches are serviced in time and hit, but inaccurate prefetches enter the batch alongside demands.]

  18. Prefetch-aware PARBS (P-PARBS) • Underlying prioritization policies need to distinguish between prefetches based on accuracy • Our policy: Mark accurate prefetches [Timeline diagrams: with Policy (b), don’t mark prefetches, accurate prefetches arrive too late and the cores stall on misses; marking only accurate prefetches services them in time, yielding hits and saved cycles.]
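The marking rule above can be expressed in a few lines. The accuracy threshold is an assumed value for illustration; the talk's point is only that demands are always batched while prefetches are batched conditionally on measured accuracy.

```python
# Sketch of accuracy-based marking: demands always join the batch, but
# a prefetch joins only if its core's prefetcher has been accurate, so
# inaccurate prefetches cannot delay other cores' demands while
# accurate prefetches keep their timeliness.

ACCURACY_THRESHOLD = 0.70   # illustrative, not from the talk

def should_mark(is_prefetch, prefetcher_accuracy):
    if not is_prefetch:                 # demands are always batched
        return True
    return prefetcher_accuracy >= ACCURACY_THRESHOLD

assert should_mark(False, 0.0)          # demand always marked
assert should_mark(True, 0.9)           # accurate prefetch joins the batch
assert not should_mark(True, 0.3)       # inaccurate prefetch waits
```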

  19. Prefetch-Aware Shared Resource Management • Three key ideas: • Fair memory controllers: Extend underlying prioritization policies to distinguish between prefetches based on prefetch accuracy • Fairness via source-throttling technique:Coordinate core and prefetcher throttling decisions • Demand boosting for memory non-intensive applications

  20. Demand Boosting • Core 1 is memory non-intensive; Core 2 is memory intensive • Without demand boosting, Core 1’s demand is serviced last, behind Core 2’s demands and prefetches; with demand boosting it is serviced first • Demand boosting eliminates starvation of memory non-intensive applications [Diagram: service order in Bank 1 and Bank 2 with and without demand boosting.]
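The boosting rule can be sketched as a priority function. The intensity metric (MPKI) and its threshold are assumptions for illustration, not parameters from the talk.

```python
# Minimal sketch of demand boosting: demands from a memory
# non-intensive core are prioritized above everything else, so a burst
# of an intensive core's demands and prefetches cannot starve it.

INTENSITY_THRESHOLD = 10   # misses per kilo-instruction (assumed)

def priority(req, mpki):
    """req: (core, kind), kind in {'demand', 'prefetch'}.
    mpki: {core: misses per kilo-instruction}. Lower sorts first."""
    core, kind = req
    boosted = kind == "demand" and mpki[core] < INTENSITY_THRESHOLD
    return (0 if boosted else 1, 0 if kind == "demand" else 1)

mpki = {1: 2, 2: 50}                    # core 1 is memory non-intensive
reqs = [(2, "prefetch"), (2, "demand"), (1, "demand")]
reqs.sort(key=lambda r: priority(r, mpki))
assert reqs[0] == (1, "demand")         # boosted demand serviced first
```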

  21. Prefetch-Aware Shared Resource Management • Three key ideas: • Fair memory controllers: Extend underlying prioritization policies to distinguish between prefetches based on prefetch accuracy • Fairness via source-throttling technique: Coordinate core and prefetcher throttling decisions • Demand boosting for memory non-intensive applications

  22. Outline • Problem Statement • Motivation for Special Treatment of Prefetches • Prefetch-Aware Shared Resource Management • Evaluation • Conclusion

  23. Evaluation Methodology • x86 cycle-accurate simulator • Baseline processor configuration • Per-core: 4-wide issue, out-of-order, 256-entry ROB • Shared (4-core system): 128 MSHRs; 2MB, 16-way L2 cache • Main memory: DDR3 1333 MHz; latency of 15ns per command (tRP, tRCD, CL); 8B-wide core-to-memory bus

  24. System Performance Results [Chart: system performance improvements of 11%, 10.9%, and 11.3%.]

  25. Max Slowdown Results [Chart: maximum slowdown results of 9.9%, 14.5%, and 18.4%.]

  26. Conclusion • State-of-the-art fair shared resource management techniques can be harmful in the presence of prefetching • Their underlying prioritization techniques need to be extended to differentiate prefetches based on accuracy • Core and prefetcher throttling should be coordinated with source-based resource management techniques • Demand boosting eliminates starvation of memory non-intensive applications • Our mechanisms improve both fair memory schedulers and source throttling in both system performance and fairness by >10%

  27. Prefetch-Aware Shared-Resource Management for Multi-Core Systems Eiman Ebrahimi* Chang Joo Lee*+ Onur Mutlu‡ Yale N. Patt* * HPS Research Group, The University of Texas at Austin ‡ Computer Architecture Laboratory, Carnegie Mellon University + Intel Corporation, Austin
