
Fetch-Directed Prefetching: A Study (CS 752 Project)

This study explores the impact of prefetching on fetch performance and memory latency in the context of execution engines and branch predictors. The importance of prefetch filters, buffers, and decoupled branch predictors is analyzed.


Presentation Transcript


  1. Fetch-Directed Prefetching: A Study (CS 752 Project)
  Gokul Nadathur, Nitin Bahadur, Sambavi Muthukrishnan

  2. Motivation
  • The execution engine is limited by fetch bandwidth:
    • effect of memory latency on fetch
    • correlation between i-cache stalls and the branch predictor
    • rate at which the branch predictor and BTB can be cycled
  • With increasing ILP, fetch performance must increase to match

  3. Fetch Directed Architecture
  [Block diagram: the branch predictor feeds the Fetch Target Buffer, which fills the Fetch Target Queue; FTQ entries pass through a prefetch filtration mechanism into the Prefetch Instruction Queue, which drives prefetches from the L2 cache into a prefetch buffer feeding instruction fetch.]

  4. Decoupled Branch Predictor
  • Has its own PC
  • Runs independently of the fetch pipeline stage
  • Makes a prediction each cycle
  • Unaffected by i-cache stalls
  • Problem: because it runs ahead, it may not have updated branch history
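The decoupling described above can be sketched in a few lines of Python: the predictor advances its own PC once per cycle and keeps filling the FTQ until the queue, not an i-cache stall, stops it. The function and parameter names here are illustrative, not taken from the project's simulator.

```python
def run_ahead(predict, start_pc, ftq, ftq_capacity=8, cycles=100):
    """Decoupled predictor: advances its own PC each cycle,
    independently of whether the fetch stage is stalled."""
    pc = start_pc
    for _ in range(cycles):
        if len(ftq) >= ftq_capacity:
            break  # only a full FTQ, not an i-cache stall, stops it
        nxt = predict(pc)            # predicted next fetch-block address
        ftq.append((pc, nxt))        # hand the fetch block to the FTQ
        pc = nxt
    return pc
```

With a trivial "always fall through" predictor (`lambda p: p + 16`), the predictor runs ahead until the FTQ fills, which is exactly why stale branch history becomes a concern.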

  5. Fetch Target Buffer and Fetch Target Queue
  Fetch Target Buffer
  • Stores the fall-through and target addresses for taken branches
  • Accessed with a prediction from the branch predictor each cycle
  • Fills single or multiple cache-line blocks into the FTQ
  Fetch Target Queue
  • Holds blocks of instruction addresses predicted to execute next
  • FTQ entries are dequeued by the fetch engine
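The producer/consumer relationship between the FTB and the fetch engine can be sketched as a small queue of (start, end) fetch blocks. This is a minimal illustration, assuming a fixed capacity; the class and field names are hypothetical.

```python
from collections import deque

class FetchTargetQueue:
    """Holds blocks of instruction addresses predicted to execute next."""
    def __init__(self, capacity=8):
        self.q = deque()
        self.capacity = capacity

    def enqueue(self, start_addr, end_addr):
        # The FTB fills in a fetch block (start to fall-through/target).
        if len(self.q) < self.capacity:
            self.q.append((start_addr, end_addr))
            return True
        return False  # FTQ full: the predictor stalls this cycle

    def dequeue(self):
        # The fetch engine consumes FTQ entries in order.
        return self.q.popleft() if self.q else None
```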

  6. Prefetch Filter and Prefetch Instruction Queue
  Prefetch Instruction Queue
  • A queue of cache blocks to be prefetched
  • The prefetch mechanism dequeues the PIQ and performs the prefetches
  Prefetch Filter
  • Takes entries from the FTQ, filters them, and inserts them into the PIQ
  • Enables intelligent prefetching
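One common filtering policy is a cache probe: drop FTQ blocks that already hit in the i-cache (or are already queued), so prefetch bandwidth is spent only on likely misses. A minimal sketch, with illustrative names and a cache modeled as a set of block tags:

```python
def filter_and_enqueue(ftq_blocks, icache_tags, piq, piq_capacity=4):
    """Cache-probe filtering: only enqueue blocks likely to miss."""
    for block in ftq_blocks:
        if block in icache_tags:
            continue          # already cached: prefetch would be wasted
        if block in piq:
            continue          # already queued for prefetch
        if len(piq) < piq_capacity:
            piq.append(block) # prefetch engine will dequeue this later
    return piq
```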

  7. Stream Buffers
  [Diagram: a FIFO stream buffer, with head and tail pointers, sitting between the L1 i-cache and the L2 i-cache.]
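The classic stream-buffer behavior in the diagram can be sketched as follows: on an L1 miss, the buffer is allocated with the next sequential blocks from L2, and only the FIFO head is probed on later accesses. The block size and depth here are assumptions for illustration.

```python
from collections import deque

BLOCK = 64  # assumed cache-block size in bytes

class StreamBuffer:
    """FIFO of sequential blocks prefetched from L2 behind an L1 miss."""
    def __init__(self, depth=4):
        self.fifo = deque(maxlen=depth)

    def allocate(self, miss_addr):
        # On an L1 miss, start prefetching successive blocks from L2.
        self.fifo.clear()
        for i in range(1, self.fifo.maxlen + 1):
            self.fifo.append(miss_addr + i * BLOCK)

    def probe(self, addr):
        # Only the head is checked; a hit drains the head into the L1.
        if self.fifo and self.fifo[0] == addr:
            self.fifo.popleft()
            return True
        return False
```

Because only the head is compared, a non-sequential access misses the buffer even if the block is further down the FIFO, which is the weakness the fetch-directed scheme addresses by prefetching predicted (not merely sequential) addresses.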

  8. Prefetching in the Fetch Directed Architecture
  • Operates like stream buffers
  • Prefetch addresses are supplied by the PIQ
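The difference from a stream buffer is just where the addresses come from: each cycle the prefetch engine dequeues a PIQ entry and fetches that block from L2 into the prefetch buffer. A minimal per-cycle sketch, with hypothetical names:

```python
def prefetch_cycle(piq, prefetch_buffer, l2_fetch):
    """One cycle of fetch-directed prefetch: dequeue a PIQ address
    and bring its block from L2 into the prefetch buffer."""
    if piq:
        addr = piq.pop(0)                    # next predicted-miss block
        prefetch_buffer[addr] = l2_fetch(addr)
```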

  9-14. Simulation Results
  [Slides 9 through 14 showed simulation result charts; the figures are not reproduced in this transcript.]

  15. Conclusions
  • Prefetching definitely helps fetch performance
  • The fetch-directed architecture aids prefetching
  • Optimal results require a sophisticated memory hierarchy
