Explore the complexities of high-speed signal propagation, interconnect modeling, power distribution networks, and thermal management in advanced systems. Learn key concepts from industry-level examples such as Blue Gene/L and z196. Gain insights into clock distribution, extraction, and simulation techniques.
Course Information
• Instructor
  • CK Cheng, ckcheng+291@ucsd.edu, 858 534-6184
• Schedule
  • Lectures: 5:00-6:20PM, TTH, CSE 2217
• Textbooks
  • (H) High Speed Signal Propagation: Advanced Black Magic, Howard Johnson and Martin Graham
  • (D) Digital Systems Engineering, William J. Dally and John W. Poulton
• Content
  1. Structure of Interconnect and Packaging
  2. Electrical and Physical Scaling
  3. Interconnect Modeling: Wire and Transmission Line Models
  4. Interconnect Signaling
  5. Transmitters and Receivers
  6. Power Distribution Network
  7. Clock Distribution
  8. Extraction and Simulation
  9. Thermal Issues
System Example: Blue Gene/L 2005
Overall View
• 8x8 racks: 65,536 compute nodes
• 25 kW/rack
• 2 midplanes/rack
• 16 node cards/midplane
• 16 compute cards/node card
• 2 PUs/compute card
• 64x32x32 torus
• 1.4 Gb/s differential link, 700 MHz clock
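The node count on this slide can be cross-checked two ways: by multiplying down the packaging hierarchy and by multiplying the torus dimensions. The sketch below does only that arithmetic; the variable names are illustrative, and it assumes each PU on a compute card counts as one compute node, which is what makes the slide's total come out.

```python
# Packaging hierarchy as listed on the slide.
racks = 8 * 8
midplanes_per_rack = 2
node_cards_per_midplane = 16
compute_cards_per_node_card = 16
pus_per_compute_card = 2  # assumption: one PU == one compute node

nodes_from_packaging = (
    racks
    * midplanes_per_rack
    * node_cards_per_midplane
    * compute_cards_per_node_card
    * pus_per_compute_card
)

# Torus dimensions as listed on the slide.
torus_dims = (64, 32, 32)
nodes_from_torus = torus_dims[0] * torus_dims[1] * torus_dims[2]

nodes_per_rack = nodes_from_packaging // racks

# Link timing: one bit period at 1.4 Gb/s, and bits sent per 700 MHz clock cycle.
link_rate_bps = 1.4e9
clock_hz = 700e6
bit_time_ps = 1e12 / link_rate_bps
bits_per_clock = link_rate_bps / clock_hz

print(nodes_from_packaging, nodes_from_torus, nodes_per_rack)
print(f"bit time = {bit_time_ps:.0f} ps, {bits_per_clock:.0f} bits per clock")
```

Both products agree with the 65,536 figure, so the torus maps one-to-one onto the packaged nodes under the stated assumption.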
System Example: Blue Gene/L 2005
Compute card
• 14.3 W/ASIC node: power density 10.4 W/cm²
• 206 mm x 55 mm/compute card
• 14 layers: 6 signal, 8 power
System Example: Blue Gene/L 2005
Air cooling
• 25 kW / (0.91 x 0.91 m²) ≈ 3 W/cm²
• Air displacement 1.4 m³/s, average velocity 6.7 m/s
• Fan speed is optimized individually
• Plenum: θ, β, EMI screen
• Elliptical vane
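The cooling figures on this slide can be reproduced with a few lines of arithmetic. The sketch below recomputes the footprint power density and the flow cross-section implied by the stated volume flow and velocity. It also estimates the bulk air temperature rise, which is not on the slide: that step assumes all 25 kW leaves through the air stream and uses textbook room-temperature air properties (density 1.2 kg/m³, specific heat 1005 J/(kg·K)).

```python
# Figures taken from the slide.
rack_power_w = 25e3
rack_side_m = 0.91
air_flow_m3_s = 1.4
air_velocity_m_s = 6.7

# Power per unit of rack footprint, in W/cm^2.
footprint_cm2 = (rack_side_m * 100) ** 2
footprint_density_w_cm2 = rack_power_w / footprint_cm2

# Cross-section implied by volume flow / average velocity.
flow_area_m2 = air_flow_m3_s / air_velocity_m_s

# Assumed air properties (not from the slide).
air_density_kg_m3 = 1.2
air_cp_j_kg_k = 1005.0

# Energy balance: P = rho * Q * cp * dT, solved for dT.
air_temp_rise_k = rack_power_w / (air_density_kg_m3 * air_flow_m3_s * air_cp_j_kg_k)

print(f"footprint density = {footprint_density_w_cm2:.2f} W/cm^2")
print(f"implied flow area = {flow_area_m2:.3f} m^2")
print(f"estimated air temperature rise = {air_temp_rise_k:.1f} K")
```

The temperature rise is a bulk average under the stated assumptions; local hot spots and any heat leaving by other paths are not modeled.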
System Example: Blue Gene/L 2005
• Clock
  • Length matching, differential pairs with termination
• Interconnect
  • Pre-emphasis, on-chip termination
  • Vdd/Vss noise: 185-100 ps delay
• Midplane: reduce longest path between boards
  • 18 layers
  • 190-215 µm width trace at 1.0 ounce copper for 100 ohm differential pairs
  • 100 µm width trace at 0.5 ounce copper for short wires
System Example: z196
• z196 (2012): 45 nm tech, released 9/2010
  • 96 cores, 5.2 GHz, 770 GB memory/node
  • 3 kW/PU book, 4 PU books/backplane
• MCM
  • 1 MCM/PU book, 2 kA/MCM
  • 6 PCs, 2 cache/MCM
  • 96x96 mm², 103 layers, 7,356 pins/MCM
System Example: z196
Water cooling option
• Humidity and atmospheric pressure -> dew point + 6°C
• 3.25 gallons/minute for each processor module
• Lower temperatures -> lower processor power consumption
• No refrigeration compressors
• Air conditioning of the room: energy reduced by a factor of 3
• Save 4 kW/4 PU books
System Example: z196
Power Distribution at ±5% tolerance
• Locate power conversion close to the chip
• DCA -> 40-48 V
• Gearboxes -> 1.1 V, 17 power domains
• Feedback control
• Redundancy N+2 (N=2); V, I, T sensing for failure detection
• Previous version: 600 W copper losses, 5 ounces metal plane
• Now 400 W (1/3 on copper, 2/3 on power conversion)
• Deep trench capacitor: 25 times density, 15 uF -> 5.2 GHz on chip
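Two of the numbers on this slide lend themselves to a quick calculation: the voltage window implied by ±5% around 1.1 V, and the split of the 400 W loss figure. The sketch below is plain arithmetic with illustrative variable names. It assumes the ±5% tolerance applies to the 1.1 V gearbox output and that the 600 W and 400 W figures cover the same scope, neither of which the slide states explicitly.

```python
# Supply window, assuming +/-5% applies to the 1.1 V rail.
v_nominal = 1.1
tolerance = 0.05
v_min = v_nominal * (1 - tolerance)
v_max = v_nominal * (1 + tolerance)
window_mv = (v_max - v_min) * 1e3

# Loss split for the current design: 1/3 copper, 2/3 power conversion.
total_loss_now_w = 400.0
copper_loss_now_w = total_loss_now_w / 3
conversion_loss_now_w = 2 * total_loss_now_w / 3

# Previous design: copper losses alone, as given on the slide.
copper_loss_prev_w = 600.0
copper_reduction_w = copper_loss_prev_w - copper_loss_now_w

print(f"allowed rail range: {v_min:.3f} V to {v_max:.3f} V ({window_mv:.0f} mV wide)")
print(f"copper loss now: {copper_loss_now_w:.0f} W, conversion loss: {conversion_loss_now_w:.0f} W")
print(f"copper loss reduction vs. previous version: {copper_reduction_w:.0f} W")
```

Under these assumptions the total budget for DC drop plus switching noise is 110 mV end to end, which helps explain the emphasis on placing conversion close to the chip.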
System Example: z196
Power network impedance evaluation (10 mΩ)
• Set on-off sequence for clock tree to create stimulus pattern
• Measure voltage with probes
• Average 7,864 times, 2M samples for 2 ms interval at 1 GHz sample rate
• Z(f) = V(f)/I(f)
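The relation Z(f) = V(f)/I(f) can be illustrated on synthetic data. The sketch below is not the measurement flow used on the z196. It is a minimal example that computes one DFT bin of a voltage record and a current record in pure Python and takes their ratio. The record length, stimulus bin, and impedance value `z_true` are hypothetical choices made so the example runs quickly. The last lines check the slide's sampling figures (2 ms at 1 GHz).

```python
import cmath
import math


def dft_bin(samples, k):
    """Return the k-th DFT coefficient of a real-valued record."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * m / n) for m, x in enumerate(samples))


def impedance_at_bin(v_samples, i_samples, k):
    """Estimate Z at DFT bin k as V(f) / I(f)."""
    return dft_bin(v_samples, k) / dft_bin(i_samples, k)


# Synthetic stimulus: unit-amplitude current at bin k (hypothetical values).
n = 1000
k = 5
z_true = complex(0.010, 0.002)  # hypothetical impedance in ohms

i_samples = [math.cos(2 * math.pi * k * m / n) for m in range(n)]
# Voltage response of a linear network: scaled by |Z|, shifted by arg(Z).
v_samples = [
    abs(z_true) * math.cos(2 * math.pi * k * m / n + cmath.phase(z_true))
    for m in range(n)
]

z_est = impedance_at_bin(v_samples, i_samples, k)
print(f"estimated |Z| = {abs(z_est) * 1e3:.2f} mOhm")

# Sampling figures from the slide: 2 ms record at 1 GHz.
record_s = 2e-3
sample_rate_hz = 1e9
samples_per_record = round(record_s * sample_rate_hz)
freq_resolution_hz = 1 / record_s
print(samples_per_record, freq_resolution_hz)
```

In a real measurement the stimulus excites many frequencies and the records are noisy, which is presumably why the slide averages 7,864 acquisitions. A 2 ms record also sets the frequency resolution to 500 Hz.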