This study explores the impact of prefetching on fetch performance and memory latency in the context of execution engines and branch predictors. The importance of prefetch filters, buffers, and decoupled branch predictors is analyzed.
Fetch Directed Prefetching - a Study
CS 752 Project
Gokul Nadathur, Nitin Bahadur, Sambavi Muthukrishnan
Gokul, Nitin, Sambavi
Motivation
• Execution engine is limited by fetch bandwidth
• effect of memory latency on fetch
• correlation between i-cache stalls and branch predictor
• rate at which the branch predictor and BTB can be cycled
• With increasing ILP, there is a need to increase fetch performance
Fetch Directed Architecture
[Figure labels: Branch Predictor, Fetch Target Buffer, Fetch Target Queue, Instruction Fetch, Prefetch Filtration Mechanism, Prefetch Instruction Queue, Prefetch, Prefetch Buffer, L2 Cache]
Decoupled Branch Predictor
• has its own PC
• runs independently of the fetch pipeline stage
• makes a prediction each cycle
• unaffected by i-cache stalls
• Problem: may not have updated branch history
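The decoupling described on this slide can be illustrated with a small cycle-level sketch. It is not the project's simulator: the function name `simulate`, the bounded queue capacity, and the use of a simple counter as the "predicted next block" are all assumptions made for illustration. It only shows that a predictor with its own PC keeps producing entries while fetch is stalled on the i-cache.

```python
from collections import deque

def simulate(stall_cycles, total_cycles, ftq_capacity=8):
    """Return the queue occupancy after each cycle.

    stall_cycles: set of cycle numbers in which an i-cache stall blocks fetch.
    ftq_capacity: assumed bound on the queue between predictor and fetch.
    """
    ftq = deque()
    predictor_pc = 0   # the predictor's own PC, separate from the fetch PC
    occupancy = []
    for cycle in range(total_cycles):
        # Predictor side: one prediction per cycle, unless the queue is full.
        if len(ftq) < ftq_capacity:
            ftq.append(predictor_pc)
            predictor_pc += 1   # stand-in for "next predicted fetch block"
        # Fetch side: consumes one entry per cycle unless the i-cache stalls.
        if cycle not in stall_cycles and ftq:
            ftq.popleft()
        occupancy.append(len(ftq))
    return occupancy
```

With stalls in cycles 2 to 4, the queue grows during the stall instead of the predictor idling, which is what gives a prefetcher addresses to work ahead on.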
Fetch Target Buffer and Fetch Target Queue
Fetch Target Buffer
• Stores the fall-through and target addresses for taken branches
• Accessed with a prediction from the branch predictor each cycle
• Fills single or multiple cache-line blocks into the FTQ
Fetch Target Queue
• Contains blocks of instruction addresses to be executed next
• FTQ entries are dequeued by the fetch engine
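A minimal sketch of how an FTB lookup could feed the FTQ, under stated assumptions. The names `FTBEntry`, `next_fetch_block`, and `fill_ftq` are hypothetical, as are the dictionary-based FTB, the 16-byte sequential block used on an FTB miss, and the FTQ capacity; the slides do not specify any of these.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class FTBEntry:
    fall_through: int   # address following the block-ending branch
    target: int         # address the branch jumps to when taken

def next_fetch_block(ftb, pc, predict_taken, default_block_bytes=16):
    """Return ((start, end), next_pc) for the fetch block starting at pc."""
    entry = ftb.get(pc)
    if entry is None:
        # FTB miss: assume a fixed-size sequential block (illustrative choice).
        end = pc + default_block_bytes
        return (pc, end), end
    # FTB hit: the block runs up to the fall-through address; the
    # prediction selects where the next block starts.
    next_pc = entry.target if predict_taken else entry.fall_through
    return (pc, entry.fall_through), next_pc

def fill_ftq(ftb, start_pc, predictions, ftq_capacity=4):
    """Consume one prediction per cycle and enqueue the resulting blocks."""
    ftq = deque()
    pc = start_pc
    for taken in predictions:
        if len(ftq) >= ftq_capacity:
            break
        block, pc = next_fetch_block(ftb, pc, taken)
        ftq.append(block)
    return ftq
```

For example, with an FTB holding entries at 0x100 and 0x200, `fill_ftq(ftb, 0x100, [True, False, True])` follows the taken branch to 0x200, falls through from there, and then enqueues a default sequential block on the FTB miss.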
Prefetch Filter and Prefetch Instruction Queue
Prefetch Instruction Queue
• Contains a queue of cache blocks to be prefetched
• The prefetch mechanism dequeues PIQ entries and performs the prefetching
Prefetch Filter
• Takes entries from the FTQ, filters them, and inserts them into the PIQ
• Enables intelligent prefetching!
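The slide does not say which filtering rule is used, so the sketch below assumes one plausible policy: drop cache lines that are already in the i-cache or already queued. The function names, the 32-byte line size, and the PIQ capacity are likewise assumptions for illustration only.

```python
from collections import deque

LINE_BYTES = 32  # assumed cache-line size

def lines_in_block(start, end, line_bytes=LINE_BYTES):
    """Cache-line addresses touched by the byte range [start, end)."""
    first = start - start % line_bytes
    return list(range(first, end, line_bytes))

def filter_into_piq(ftq, icache_lines, piq, piq_capacity=8):
    """Move useful prefetch candidates from FTQ blocks into the PIQ.

    ftq: iterable of (start, end) fetch blocks.
    icache_lines: set of line addresses assumed to be cached already.
    """
    for start, end in ftq:
        for line in lines_in_block(start, end):
            if line in icache_lines:   # already cached: prefetch is useless
                continue
            if line in piq:            # already queued: avoid duplicates
                continue
            if len(piq) >= piq_capacity:
                return piq             # PIQ full: stop enqueuing
            piq.append(line)
    return piq
```

With line 0x100 already cached, the blocks `[(0x100, 0x110), (0x118, 0x148), (0x100, 0x104)]` yield only lines 0x120 and 0x140 in the PIQ.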
Stream Buffers
[Figure labels: L1 I-cache, L2 I-cache, Stream buffer, Head, Tail, FIFO]
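As a rough illustration of the FIFO behavior, here is a sketch of a conventional sequential stream buffer: on an L1 miss the head entry is checked, a hit pops the head and prefetches one more line at the tail, and a miss flushes the buffer and restarts it after the missing line. The class name, depth, and line size are assumptions, and prefetches are modeled as completing instantly.

```python
from collections import deque

class StreamBuffer:
    """Sequential FIFO stream buffer sitting behind the L1 i-cache (sketch)."""

    def __init__(self, depth=4, line_bytes=32):
        self.depth = depth
        self.line_bytes = line_bytes
        self.fifo = deque()      # head is fifo[0], tail is fifo[-1]
        self.next_line = None    # next sequential line to prefetch

    def _refill(self):
        # Prefetch sequential lines at the tail until the FIFO is full.
        while len(self.fifo) < self.depth:
            self.fifo.append(self.next_line)
            self.next_line += self.line_bytes

    def on_l1_miss(self, line_addr):
        """Return True if the head of the buffer supplies the missing line."""
        if self.fifo and self.fifo[0] == line_addr:
            self.fifo.popleft()   # line moves to L1
            self._refill()        # keep the stream running ahead
            return True
        # Head mismatch: flush and start a new stream after the missing line.
        self.fifo.clear()
        self.next_line = line_addr + self.line_bytes
        self._refill()
        return False
```

A sequential miss stream such as lines 0, 32, 64 misses once and then hits at the head; a jump to an unrelated line flushes the buffer and restarts the stream.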
Prefetching in the Fetch Directed Architecture
• Similar to stream buffers
• Addresses are given by the PIQ
Simulation Results
Conclusions
• Prefetching definitely helps
• The fetch directed architecture aids prefetching
• Optimal results require a sophisticated memory hierarchy