Early output logic and Anti-Tokens • Charlie Brej • APT Group • Manchester University
Overview • Synchronous Problems • Asynchronous Logic • Why? • How? • Solutions • Early Output • Anti-Tokens
Problems: Communication • Communication horizon • “For a 60 nanometer process a signal can reach only 5% of the die’s length in a clock cycle” [D. Matzke,1997] • Clock distributed using wave pipelining
Problems: Performance • Cycle time: • Unbalanced Stages • Clock overheads • Clock Skew/Jitter • Transistor Variability • Timing Assumption overheads • Signal Integrity • Worst – Average case performance • Real Computation
Clock! What is it good for? • No arguing with the clock • 9am - 5pm. No excuses!
Bundled-Data • When you finish, do the next task • Flexitime • [Figure: handshake signals labelled Request + Delay and Acknowledge]
How do you know when you are finished? • Synchronous: • Estimate • Global timing reference • Asynchronous (bundled-data) • Estimate • Local delay elements • Asynchronous (delay-insensitive) • When the data arrives • Intrinsic
Becoming Delay Insensitive • Dual-Rail • Two wires • 00 – NULL • 01 – Zero • 10 – One • (11 – Not used) • Four Phase handshake • Return to zero • [Figure: wires labelled R0, R1 and Ack]
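The dual-rail code and four-phase handshake listed on this slide can be sketched in a few lines of Python. This is a behavioural illustration only, not a circuit description. The names `encode`, `decode`, `is_complete` and `four_phase` are hypothetical, and the wire order `(r1, r0)` is an assumption chosen so that the slide's "01" means the R0 wire is high.

```python
from typing import List, Optional, Tuple

# One dual-rail bit is a pair of wires. The order (r1, r0) is an assumption
# made so that "01" on the slide reads as "R0 high".
Rail = Tuple[int, int]

NULL: Rail = (0, 0)  # 00: no data present
ZERO: Rail = (0, 1)  # 01: logic zero
ONE: Rail = (1, 0)   # 10: logic one


def encode(bit: Optional[int]) -> Rail:
    """Map None/0/1 onto the NULL/Zero/One code words."""
    if bit is None:
        return NULL
    return ONE if bit else ZERO


def decode(rail: Rail) -> Optional[int]:
    """Inverse of encode; the unused code 11 is rejected."""
    if rail == (1, 1):
        raise ValueError("11 is not used in this dual-rail code")
    if rail == NULL:
        return None
    return 1 if rail == ONE else 0


def is_complete(word: List[Rail]) -> bool:
    """Completion is intrinsic: a word has arrived when no bit is NULL."""
    return all(rail != NULL for rail in word)


def four_phase(bit: int) -> List[Tuple[Rail, int]]:
    """The (data, ack) states of one four-phase, return-to-zero transfer."""
    data = encode(bit)
    return [
        (data, 0),  # sender raises one rail: data is valid
        (data, 1),  # receiver raises Ack
        (NULL, 1),  # sender returns both rails to zero
        (NULL, 0),  # receiver lowers Ack: ready for the next bit
    ]
```

Because the data itself announces its arrival, `is_complete` needs no clock and no matched delay element, which is the point of the "Intrinsic" bullet on the previous slide.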
Early Output Logic • Dual-Rail interfaces • Output generated as early as possible • Two early output cases • If either input is ‘0’ then the output is ‘0’
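The rule "if either input is ‘0’ then the output is ‘0’" is the behaviour of an AND gate, so a minimal sketch of early output can be written for that case. The slide does not name the gate, so treating it as an AND is an assumption, as is the hypothetical function name `early_and`. `None` stands for a dual-rail NULL, that is, an input that has not arrived yet.

```python
from typing import Optional


def early_and(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """AND over values in {None, 0, 1}; None means the input is still NULL."""
    if a == 0 or b == 0:
        # The two early output cases: one input is 0, so the other is not
        # needed and may still be absent.
        return 0
    if a == 1 and b == 1:
        return 1
    # A 1 alone does not determine the result, so the output stays NULL.
    return None
```

For example, `early_and(0, None)` returns `0` without waiting for the second input, while `early_and(1, None)` returns `None` because the result still depends on the missing input.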
Bit level pipelining • Forward completed parts of the result • Pace work • Don’t stall parts unless you have to
Validity • Unnecessary late inputs • Must be acknowledged • Must wait until they arrive • Validity signal • Latch generated • Ready to be acknowledged • Result before all inputs present • Acknowledge after all inputs present
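The split between "result before all inputs present" and "acknowledge after all inputs present" can be sketched by returning two values from a gate model. This is an illustration under assumptions: the gate is taken to be an AND, `and_with_validity` is a hypothetical name, and validity is modelled simply as "every input has arrived" rather than as the latch-generated signal the slide mentions.

```python
from typing import Optional, Tuple


def and_with_validity(a: Optional[int], b: Optional[int]) -> Tuple[Optional[int], bool]:
    """Return (result, valid) for inputs in {None, 0, 1}; None means absent."""
    if a == 0 or b == 0:
        result = 0            # early result: a single 0 is enough
    elif a == 1 and b == 1:
        result = 1
    else:
        result = None         # not enough information yet
    # Validity is separate from the result: the inputs can only be
    # acknowledged once all of them are present, even the unnecessary ones.
    valid = a is not None and b is not None
    return result, valid
```

With one input at 0 and the other late, the call returns `(0, False)`: the result can be forwarded, but the stage still has to wait for the late input before it can acknowledge.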
Synchronisation Hurts • No need to wait before generating result • Need to wait for input in order to acknowledge it • Unnecessary stall
Anti-Tokens • Unnecessary late inputs • Stall the entire stage • Proactive approach • Send a ‘cancel’ signal backward to the source • Acknowledge before data arrives • Anti-Token latches • Assert validity early
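The proactive approach can be illustrated with a toy model in which a stage that has already produced its result sends a cancel backward on any input that has not arrived. Everything here is a sketch under assumptions: `Channel` and `fire_and_stage` are hypothetical names, the stage is taken to be an AND, and a token meeting an anti-token is assumed to cancel both, which the slides imply but do not spell out.

```python
from typing import Optional


class Channel:
    """One input channel of a stage; a toy model, not a circuit."""

    def __init__(self) -> None:
        self.token: Optional[int] = None  # data waiting to be consumed
        self.anti_token: bool = False     # True while a cancel is pending

    def send_token(self, value: int) -> str:
        """Forward-travelling data arrives on the channel."""
        if self.anti_token:
            self.anti_token = False       # assumed: the pair cancels out
            return "cancelled"
        self.token = value
        return "delivered"

    def send_anti_token(self) -> str:
        """Backward-travelling cancel is placed on the channel."""
        if self.token is not None:
            self.token = None             # assumed: the pair cancels out
            return "cancelled"
        self.anti_token = True
        return "pending"


def fire_and_stage(a: Channel, b: Channel) -> Optional[int]:
    """Early-output AND stage; returns None if it still has to wait."""
    if a.token == 0 or b.token == 0:
        result = 0
    elif a.token == 1 and b.token == 1:
        result = 1
    else:
        return None
    for ch in (a, b):
        if ch.token is None:
            ch.send_anti_token()          # acknowledge before data arrives
        else:
            ch.token = None               # ordinary acknowledge
    return result
```

In this model the stage never stalls on an unnecessary late input: after `a` delivers a 0, `fire_and_stage` returns 0 immediately and leaves an anti-token on `b`, so a later `b.send_token(1)` is cancelled on arrival instead of being processed.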
Anti-token generation • [Figure, two animation steps; labels: 0, 1, A, C]
Anti-token Propagation • [Figure, two animation steps; labels: 1, A, C]
Anti-token Token collisions • [Figure, two animation steps; labels: 1, A, ?]
Remove Unnecessary computation • Cycle time: • Unbalanced Stages • Clock overheads • Clock Skew/Jitter • Transistor Variability • Timing Assumption overheads • Signal Integrity • Worst – Average case performance • Unnecessary Computation/Delays • Real Computation
Summary • Asynchronous • Delay Insensitive • Safe • No timing assumptions • Average case performance • Remove unnecessary computation • Anti-tokens without mutual exclusion units