High-Bandwidth Packet Switching on the Raw General-Purpose Architecture

Gleb Chuvpilo, Saman Amarasinghe • MIT LCS Computer Architecture Group • September 19, 2002

Talk at a Glance • Motivation • Architecture of Internet Routers • Raw Processor Overview • Raw Router Architecture • Switch Fabric Design • Distributed Scheduling Algorithm • Results and Analysis • Future Work and Conclusion

Motivation • Build a fast IP router on a general-purpose architecture. Why? • Flexibility → new protocols and services • Price → economies of scale

Architecture of Internet Routers [Figure: a network processor, four forwarding engines, four interfaces, and a switch fabric]

Switch Fabric

Click Modular Router

Raw Processor Overview • 16 MIPS-like tiles on a single die • 2 Megabytes of SRAM on-chip • Over a thousand signal I/O pins • Over 200 Gbps of external chip bandwidth • Scalable to thousands of tiles!

Raw Layout

Raw Communication Mechanisms • Two static networks • Two dynamic networks

Raw Static Networks • Destinations known at compile time • Message size known at compile time • Cycle-by-cycle switch schedule • Three-cycle nearest neighbor send-to-use latency • No processing overhead

Static Network: Send

Static Network: Receive

Raw Dynamic Networks • Unpredictable events • External asynchronous interrupts • Cache misses • 15- to 30-cycle nearest neighbor send-to-use latency (message header processing overhead)

Raw is Good for Streaming

Given: Four Networks… [Figure: networks numbered 1 to 4]

… and Sixteen Tiles:

Problem: Mapping? [Figure: dynamic communication → ? → static interconnect]

Solution: Rotating Crossbar [Figure: four inputs, In 0 to In 3, and four outputs, Out 0 to Out 3]

Rotating Crossbar Highlights • The idea of a Token Ring network → absolute fairness • Algorithm uses two static networks; dynamic networks are idle • All deadlock-free configurations are scheduled at compile time • Four headers and token location define a global configuration • Global configuration is computed in a distributed manner at run time
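The token idea above can be illustrated with a small sketch. This is not the Raw switch code: it is a simplified, centralized arbiter that assumes each header names one requested output port (or `None` for no packet), grants outputs in ring order starting at the token holder, and then passes the token on. The names `choose_config` and `update_token` are hypothetical, and the sketch ignores how packets are routed around the ring.

```python
from typing import List, Optional

NUM_PORTS = 4  # four inputs and four outputs, as on the crossbar slide


def choose_config(headers: List[Optional[int]], token: int) -> List[Optional[int]]:
    """Return grants[i]: the output granted to input i, or None if it must wait.

    Assumption: headers[i] is the output port requested by input i (None = idle).
    Inputs are served in ring order starting at the token holder, so the token
    holder always wins any conflict in this round.
    """
    grants: List[Optional[int]] = [None] * NUM_PORTS
    taken = set()
    for offset in range(NUM_PORTS):
        i = (token + offset) % NUM_PORTS
        out = headers[i]
        if out is not None and out not in taken:
            grants[i] = out
            taken.add(out)
    return grants


def update_token(token: int) -> int:
    """Pass the token to the next port on the ring."""
    return (token + 1) % NUM_PORTS


# Inputs 0 and 1 both request output 2; the token decides who goes first.
print(choose_config([2, 2, None, 0], token=0))  # [2, None, None, 0]
print(choose_config([2, 2, None, 0], token=1))  # [None, 2, None, 0]
```

Because the token advances by one port every round, an input that loses a conflict becomes the highest-priority requester within at most three more rounds, which is the fairness property the slide refers to.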

Rotating Crossbar Illustrated

Phases of the Algorithm [Figure: tile processor and switch processor timelines] • headers_request • headers • send_prev_config • choose_new_config • route_body • confirm • update_token

Configuration Space • Let's enumerate the number of configurations: SPACE = |Hdr0| × … × |Hdr3| × |Token|, where |Hdr0| = … = |Hdr3| = 5 and |Token| = 4 → therefore SPACE = 5^4 × 4 = 2,500 distinct configurations
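The count can be checked by brute-force enumeration. The sketch below assumes that the five possible header values are "no packet" plus one of four output ports; the slide only gives the count of 5, so that interpretation and the name `enumerate_configurations` are illustrative.

```python
from itertools import product

NUM_PORTS = 4
# Assumption: a header is either empty (None) or names one of four outputs.
HEADER_VALUES = [None, 0, 1, 2, 3]
TOKEN_POSITIONS = range(NUM_PORTS)


def enumerate_configurations():
    """List every (headers, token) pair: four headers and one token location."""
    return [
        (headers, token)
        for headers in product(HEADER_VALUES, repeat=NUM_PORTS)
        for token in TOKEN_POSITIONS
    ]


configs = enumerate_configurations()
print(len(configs))  # 2500
```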

So What?... • Each tile has 8,192 words of instruction memory, same for switch → 8,192/2,500 ≈ 3.3 instructions per configuration → not enough! → need to use off-chip memory → slow! → need to minimize SPACE

Minimization [Figure: crossbar processor ports: in, out, cwnext, cwprev, ccwnext, ccwprev]

Clients and Servers of a Crossbar Processor

Outcome of Minimization • We cut down the number of configurations by a factor of 78! Now there are only 32 entries! → the program can fit in the local instruction memory!
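The arithmetic behind these two slides can be reproduced in a few lines. All numbers come from the slides; the variable names are illustrative only.

```python
INSTRUCTION_WORDS = 8192   # instruction memory per tile (and per switch)
FULL_SPACE = 2500          # configurations before minimization
MINIMIZED_SPACE = 32       # entries after minimization

budget_before = INSTRUCTION_WORDS / FULL_SPACE        # about 3.3 words each
budget_after = INSTRUCTION_WORDS // MINIMIZED_SPACE   # whole words per entry
reduction = FULL_SPACE / MINIMIZED_SPACE              # about 78x

print(f"{budget_before:.1f} words per configuration before minimization")
print(f"{budget_after} words per entry after minimization")
print(f"reduction factor: {reduction:.0f}x")
```

With 32 entries, each entry has a budget of 256 instruction words, which is why the program fits in local instruction memory.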

Implementation • Raw Router was tested in a cycle-accurate simulator of the Raw processor • Raw prototype clock speed is assumed to be 250 MHz • The focus of research is on switch fabric, NOT on route lookup, etc.
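To put the cycle counts quoted earlier in wall-clock terms, the sketch below converts the static-network latency (3 cycles) and the dynamic-network latency (15 to 30 cycles) to nanoseconds at the assumed 250 MHz clock. The helper name `cycles_to_ns` is hypothetical.

```python
CLOCK_MHZ = 250  # assumed Raw prototype clock speed from the slide


def cycles_to_ns(cycles: int, clock_mhz: float = CLOCK_MHZ) -> float:
    """Convert a cycle count to nanoseconds at the given clock frequency."""
    return cycles * 1000 / clock_mhz


print(cycles_to_ns(3))   # static network, nearest neighbor: 12.0 ns
print(cycles_to_ns(15))  # dynamic network, lower bound: 60.0 ns
print(cycles_to_ns(30))  # dynamic network, upper bound: 120.0 ns
```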

Peak Throughput

Average Throughput

Future Work • Take advantage of dynamic networks • Implement IP route lookup • Add computation on data (encryption) • Add support for multicast traffic • Implement Quality of Service • Add virtual output queueing • Explore larger router configurations

Conclusion • Implemented a gigabit switch on Raw • Mapped dynamic communication to static interconnect • Can intermix switch fabric with computation • High-bandwidth I/O allows performance comparable to custom ASICs

Questions?
