The Alpha 21364 Network Architecture

Shubhendu S. Mukherjee, Peter Bannon, Steven Lang, Aaron Spink, and David Webb. Alpha Development Group, Compaq. Hot Interconnects 9 (2001). Presented by John Ingalls, ECE 259 - March 22, 2010.

Presentation Transcript


  1. The Alpha 21364 Network Architecture
     Shubhendu S. Mukherjee, Peter Bannon, Steven Lang, Aaron Spink, and David Webb. Alpha Development Group, Compaq. Hot Interconnects 9 (2001).
     Presented by John Ingalls, ECE 259 - March 22, 2010.

  2. Summary of Features
     • Alpha 21364 is a 21264 core plus a 1.75 MB on-die L2 cache, 2-channel Rambus DRAM, an I/O controller, and a router, at 1.2 GHz on 180 nm.
     • Up to 128 processors in a system. Every processor can access the others’ memory and I/O.
     • Directory cache coherence protocol.
     • 2-D torus interconnection network with adaptive routing and a deadlock-free fallback.
     • Request packets are generally 3 flits in size; data packets are generally 18 flits. Flits carry ECC.

  3. Notable Features: Network Routing
     • The network is a 2-D torus.
     • Virtual cut-through routing: when a packet is blocked, its flits accumulate in the buffer of the router where it is blocked.
     • Adaptive routing: packets stay within the minimum rectangle between source and destination. The source picks either dimension to send on, and the algorithm then prefers to keep the packet on that dimension.
     • Fig. 3 (pg. 2).
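The minimum-rectangle rule can be illustrated with a small sketch. It is not the 21364's routing logic; the function names, the `(x, y)` coordinate convention, the `blocked` parameter, and the tie-break for nodes exactly half a ring apart are all assumptions made for illustration. It only shows which hops are "productive" on a torus and how a packet might prefer to stay on the dimension it is already traveling in.

```python
def ring_step(a, b, size):
    """Direction (+1, -1, or 0) of the shortest way from a to b on a ring.

    Assumption: when both ways are equally short, +1 is chosen.
    """
    forward = (b - a) % size
    if forward == 0:
        return 0
    return 1 if forward <= size - forward else -1


def productive_directions(src, dst, width, height):
    """Hops that keep a packet inside the minimum rectangle on a torus."""
    hops = []
    dx = ring_step(src[0], dst[0], width)
    dy = ring_step(src[1], dst[1], height)
    if dx:
        hops.append(("x", dx))
    if dy:
        hops.append(("y", dy))
    return hops


def choose_hop(src, dst, width, height, current_dim=None, blocked=()):
    """Pick a productive hop, preferring the dimension already in use.

    `blocked` is a hypothetical collection of hops whose output is busy.
    Returns None when the packet has arrived or every productive hop is blocked.
    """
    hops = [h for h in productive_directions(src, dst, width, height)
            if h not in blocked]
    for hop in hops:
        if hop[0] == current_dim:
            return hop
    return hops[0] if hops else None
```

For example, on a 4x4 torus a packet at (0, 0) bound for (3, 1) has two productive hops: one step in the negative x direction (across the wraparound link) and one step in the positive y direction. If it is currently traveling in y, `choose_hop` keeps it in y unless that hop is blocked.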

  4. Notable Features: Deadlock Avoidance
     • Avoiding coherence deadlock: separate virtual channels for requests and responses.
     • Preserving I/O consistency: packets of the same class (i.e. read or write) must use the same virtual channel, and thus the same route, so order within that class is retained.
     • 3 virtual channels per dimension per class: Adaptive, VC0, and VC1.
     • The adaptive channel carries the bulk of the traffic; VC0 and VC1 are a fixed-route, deadlock-free “drain” for blocked adaptive packets.

  5. Notable Features: Deadlock Avoidance
     • Fig. 5 (pg. 3).
     • VC0 and VC1 are mapped at boot time to prohibit cyclic dependencies.
     • Packets on VC0/VC1 can only turn when they are at a corner of the minimum rectangle; the adaptive virtual channel has no such restriction.
     • Packets can return to the adaptive channel if it is not congested.
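To make the idea of a two-channel, cycle-free drain concrete, here is a minimal sketch using the textbook "dateline" scheme on a single unidirectional ring. This is a stand-in chosen for illustration, not the boot-time VC0/VC1 mapping the slides refer to, whose details are not given here. The function names and the choice of the wraparound link as the dateline are assumptions.

```python
def escape_vc(src, cur):
    """Deadlock-free channel used on the link leaving node `cur`.

    Assumes a unidirectional ring 0 -> 1 -> ... -> N-1 -> 0 whose
    wraparound link (N-1 -> 0) is the dateline. A packet injected at
    `src` uses VC0 until it has crossed the dateline, then VC1.
    """
    # cur < src means the packet has already wrapped past node 0.
    return "VC1" if cur < src else "VC0"


def select_channel(adaptive_has_space, src, cur):
    """Prefer the adaptive channel; drain to VC0/VC1 only when it is full."""
    if adaptive_has_space:
        return "adaptive"
    return escape_vc(src, cur)


def escape_route(src, dst, size):
    """List of (node, channel) pairs for a packet that stays on the drain."""
    hops = []
    cur = src
    while cur != dst:
        hops.append((cur, escape_vc(src, cur)))
        cur = (cur + 1) % size
    return hops
```

Under this scheme no VC1 link is ever the wraparound link, and a packet leaving the wraparound link on VC0 always continues on VC1, so neither channel's dependency chain closes into a cycle around the ring.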

  6. Technical Details: Router Architecture
     • 13 cycles pin-to-pin (about 10.8 ns at 1.2 GHz), from any input to any output.
     • Pipeline clocked at 1.2 GHz, links at 800 MHz.
     • The link clock is sent with the outgoing packet.
     • ECC is recomputed at every hop. 1-bit errors are recoverable.
     • Arbitration: each input’s “local” arbiter nominates a buffered packet that is ready and not blocked to the “global” arbiters for possible dispatch. Each output’s global arbiter selects among the packets nominated by the input local arbiters.
     • Selection policy is Least-Recently-Selected. In addition, the Rotary Rule prioritizes older packets coming from the network, and there is a Coherence Dependence Priority rule.
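The Least-Recently-Selected policy can be sketched in a few lines. This is a software illustration only, not the router's hardware arbiter; the class name, the `grant` method, and the representation of requests as a set of port numbers are assumptions, and the Rotary Rule and Coherence Dependence Priority rule are not modeled.

```python
class LeastRecentlySelectedArbiter:
    """Grants the requesting port that was selected longest ago."""

    def __init__(self, n_ports):
        # Front of the list = least recently selected = highest priority.
        self.order = list(range(n_ports))

    def grant(self, requests):
        """Return the winning port from `requests`, or None if there are none."""
        for port in self.order:
            if port in requests:
                # The winner moves to the back and becomes lowest priority.
                self.order.remove(port)
                self.order.append(port)
                return port
        return None
```

With three ports, if ports 0 and 1 request repeatedly the grants alternate between them, and port 2 wins immediately the first time it requests because it has never been selected.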

  7. Good
     • This was built and shipped (albeit late), which immediately lends it credibility.
     • Simple introduction to interconnection networks: the 5-page limit makes the authors explain everything clearly and concisely.
     Bad
     • No evaluation of performance.
     • No comparison against competitors (“ours is better” would help sales).
     • Configuration around faulty routers is mentioned but never explained.
     • 5 pages isn’t enough to explain the edge cases.

  8. Conclusion / Further Questions
     • Keywords: 2-D torus. Adaptive routing with deadlock-free fixed-route virtual channels to prevent network deadlock. Separate virtual channels for requests and responses to prevent coherence deadlock.
     • This was the last major iteration of the Alpha architecture. Why? What competing product replaced it? How was that competitor better? How could the 21364 have been improved to stay competitive (features, performance)?
