Explore lookahead pipeline styles with early evaluation and completion detection strategies for enhanced performance. Learn decoupled control in high-capacity pipelines to boost concurrency and cycle efficiency in complex stages.
Recap: Lectures 7 & 8: Lookahead Pipeline Styles
2 Strategies:
• Early Evaluation
• Early Done
Lookahead Pipelines: Strategy #1
Use non-neighbor communication:
• stage receives information from multiple later stages
• allows “early evaluation”
Benefit: stage gets head-start on next cycle
Lookahead Pipelines: Strategy #2
Use early completion detection:
• completion detector moved before stage (not after)
• stage indicates “early done” in parallel with computation
[Figure label: early completion detector]
Benefit: again, stage gets head-start on next cycle
Single-Rail Styles
[Figure: single-rail pipeline stages with data bits (bit 1 … bit n, bit 1 … bit m) and a matched delay on each request/done line; request/done indicate valid data]
Adapt dual-rail styles to single-rail:
• replace dual-rail function blocks by single-rail blocks
• replace completion detectors by matched delays
Example: LPsr2/2
Lecture 9: Timing Analysis, High-Capacity Pipelines
High-Capacity Pipeline: LPHC
[Figure: stages N, N+1, N+2; labels: stage controller, pc, eval, ack, delay]
Key Idea: Decouple control for pull-up and pull-down
• increases pipeline concurrency: initiates next cycle early
• once N+1 evaluates, it can enter the “isolate” (hold) phase
• stage N is then allowed to complete its entire next cycle!
Inside an LPHC stage
Decoupled control: pull-up and pull-down stacks are independently controllable.
[Figure: dynamic gate with precharge control (pc), “keeper”, pull-down stack between data inputs and data outputs, and evaluation control (eval)]
• pc asserted: precharge
• eval asserted: evaluate
• both de-asserted: enter “isolate” (hold) phase
Cycle of an LPHC Stage
[Figure: phase cycles of Stage N and Stage N+1, each going Eval → Isolate → Precharge]
• Eval: pc=1, eval=1
• Isolate: pc=1, eval=0
• Precharge: pc=0, eval=0
• Only a single backward synchronization arc:
  • once stage N+1 has completed Eval, N can perform entire next cycle!
  • why safe? N+1 enters isolate phase … key to greater concurrency
  • almost all existing approaches require 2 arcs
• One (natural) forward synchronization arc:
  • stage N+1 evaluates new data only after N has evaluated
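The three phases can be read directly off the (pc, eval) pair given on this slide. A minimal Python sketch of that decoding, assuming the signal values exactly as listed (which imply that pc is active-low: pc=0 precharges). The names `Phase`, `decode_phase` and `CYCLE` are illustrative and not from the lecture.

```python
from enum import Enum


class Phase(Enum):
    EVAL = "Eval"
    ISOLATE = "Isolate"
    PRECHARGE = "Precharge"


def decode_phase(pc: int, eval_: int) -> Phase:
    """Map the two control signals of one LPHC stage to its phase."""
    if pc == 1 and eval_ == 1:
        return Phase.EVAL        # pull-down stack enabled
    if pc == 1 and eval_ == 0:
        return Phase.ISOLATE     # both stacks off: output is held
    if pc == 0 and eval_ == 0:
        return Phase.PRECHARGE   # pull-up enabled
    # pc=0, eval=1 is not listed on the slide: it would enable the
    # pull-up and pull-down stacks at the same time.
    raise ValueError("pc=0, eval=1 is not a phase of the LPHC cycle")


# One full cycle of a stage, in the order given on the slide.
CYCLE = [(1, 1), (1, 0), (0, 0)]

if __name__ == "__main__":
    for pc, ev in CYCLE:
        print(pc, ev, decode_phase(pc, ev).value)
```

Only one signal changes between consecutive phases within the cycle (eval falls, then pc falls); the return to Eval raises both, matching the "pc+ eval+" start of evaluation in the controller specification.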
Formal Specification of Controller
[Figure: signal transition graph of the controller, with annotated transitions:]
• pc+, eval+ (Start evaluate)
• T+ (Evaluate of N+1 complete)
• S+ (Evaluate complete)
• eval- (Isolate)
• pc- (Start precharge)
• T- (Precharge of N+1 complete)
• S- (Precharge complete)
Problem: Specification too concurrent for direct synthesis
• desired precharge condition: N and N+1 have evaluated same data
• problem: this condition not uniquely captured by given signals!
• N may evaluate next data item, while N+1 stuck on current item!
Modified Specification of Controller
[Figure: modified signal transition graph, with transitions: pc+, eval+, T+ (Evaluate of N+1 complete), S+, eval-, T- (Precharge of N+1 complete), pc-, ok2pc+, S-, ok2pc-]
Solution: Add a state variable ok2pc
ok2pc records whether N+1 has “absorbed” N’s data item
• ok2pc resets immediately when N deletes item (N precharges)
• ok2pc is set when N+1 deletes item (N+1 precharges)
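The role of ok2pc can be illustrated with a small event-level model. This is a sketch under assumptions, not the lecture's controller: S and T are taken to be the "evaluate complete" indications of stages N and N+1 as in the specification, ok2pc is assumed to start set (N+1 initially precharged), and the class and method names are hypothetical.

```python
class StagePair:
    """Event-level model of stage N (S), stage N+1 (T) and ok2pc."""

    def __init__(self):
        self.S = False      # N has evaluated a data item
        self.T = False      # N+1 has evaluated a data item
        self.ok2pc = True   # assumption: N+1 starts out precharged

    def n_evaluates(self):
        self.S = True

    def n1_evaluates(self):
        self.T = True

    def n_precharges(self):
        self.S = False
        self.ok2pc = False  # reset when N deletes its item

    def n1_precharges(self):
        self.T = False
        self.ok2pc = True   # set when N+1 deletes its item

    def naive_condition(self):
        # Original specification: only S and T are visible.
        return self.S and self.T

    def may_precharge(self):
        # Modified specification: also require ok2pc.
        return self.S and self.T and self.ok2pc


def demo():
    p = StagePair()
    p.n_evaluates()     # N evaluates item A
    p.n1_evaluates()    # N+1 evaluates item A
    assert p.may_precharge()
    p.n_precharges()    # N deletes A
    p.n_evaluates()     # N evaluates item B; N+1 is still stuck on A
    return p.naive_condition(), p.may_precharge()


if __name__ == "__main__":
    print(demo())  # prints (True, False)
```

In the stuck case, S and T are both high although the two stages hold different items, so the naive condition would precharge N and lose item B. With ok2pc in the condition, N waits until N+1 has precharged and then evaluated the new item.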
Controller implementation
[Figure: gate-level controller with inputs S and T; gate labels: NAND3 (pc), aC (ok2pc), INV (eval)]
Controller implementation is very simple:
• each signal implemented using a single gate
• ok2pc typically off the critical path
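The exact wiring of the figure is not recoverable from the extracted slide text. The sketch below therefore assumes pc = NAND3(S, T, ok2pc) and eval = INV(S), which reproduces the pc/eval values of the three phases given on the "Cycle of an LPHC Stage" slide. The aC gate that produces ok2pc is left out, and ok2pc is treated as a plain input. Function names are illustrative.

```python
def nand3(a: int, b: int, c: int) -> int:
    return int(not (a and b and c))


def inv(a: int) -> int:
    return int(not a)


def controller_outputs(S: int, T: int, ok2pc: int) -> tuple:
    """Return (pc, eval) for one stage under the assumed wiring.

    S     : this stage (N) has evaluated
    T     : the next stage (N+1) has evaluated
    ok2pc : state variable, here supplied from outside
    """
    pc = nand3(S, T, ok2pc)   # assumed: precharge (pc=0) only when all three are high
    eval_ = inv(S)            # assumed: stop evaluating once N has evaluated
    return pc, eval_


if __name__ == "__main__":
    print(controller_outputs(0, 0, 1))  # (1, 1): Eval
    print(controller_outputs(1, 0, 1))  # (1, 0): Isolate
    print(controller_outputs(1, 1, 1))  # (0, 0): Precharge
```

Under this assumption each output is a single gate, as the slide states, and ok2pc only gates the start of precharge.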
Performance
[Figure: timing diagram over stages N, N+1, N+2 with numbered events: N evaluates, N+1 evaluates, N isolates, N precharges, N enables itself for next evaluation]
Cycle Time = [formula shown in the figure]