An Introduction to Packet Switching Nick McKeown Assistant Professor of Electrical Engineering and Computer Science, Stanford University nickm@stanford.edu http://www.stanford.edu/~nickm
Sir William Preece, Chief of the British Postal System, 1876: “The Americans may have need of the telephone, but we do not. We have plenty of messenger boys.”
Outline
• Introduction: What is a packet-switch?
• The Memory Bandwidth Problem
• Input-Queued Switches: Reducing memory bandwidth requirements
• Combined Input-Output Queued Switches: Making input-queued switches useful
• Parallel Packet Switches: Further reducing memory bandwidth requirements
Introduction: What is a Packet Switch?
• Basic Architectural Components
• Some Example Packet Switches
• The Evolution of IP Routers
Basic Architectural Components
• Control: Admission Control, Congestion Control, Reservation, Routing
• Datapath (per-packet processing): Policing, Switching, Output Scheduling
Basic Architectural Components: Datapath (per-packet processing)
1. Forwarding Decision (using a Forwarding Table at each input)
2. Interconnect
3. Output Scheduling
Where high performance packet switches are used
• The Internet Core: Carrier Class Core Router, ATM Switch, Frame Relay Switch
• Edge Router
• Enterprise WAN access & Enterprise Campus Switch
Introduction: What is a Packet Switch?
• Some Example Packet Switches
ATM Switch
• Look up cell VCI/VPI in VC table.
• Replace old VCI/VPI with new.
• Forward cell to outgoing interface.
• Transmit cell onto link.
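The ATM steps above can be sketched as a table lookup followed by a label swap. This is only an illustration: the `Cell` type, the `VC_TABLE` contents, and the choice to key the table by input port as well as VPI/VCI are assumptions, not details taken from the slides.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    vpi: int
    vci: int
    payload: bytes


# Hypothetical VC table: (input port, VPI, VCI) -> (output port, new VPI, new VCI).
VC_TABLE = {
    (0, 1, 32): (2, 5, 77),
    (1, 1, 33): (0, 9, 40),
}


def switch_cell(in_port, cell, vc_table=VC_TABLE):
    """Look up the cell's VPI/VCI, swap in the new labels, pick the output port.

    Returns (out_port, relabelled_cell), or None when no VC has been set up.
    """
    entry = vc_table.get((in_port, cell.vpi, cell.vci))
    if entry is None:
        return None
    out_port, new_vpi, new_vci = entry
    return out_port, cell._replace(vpi=new_vpi, vci=new_vci)
```

The lookup is an exact match on a small label, which is one reason the per-cell work in an ATM switch is simple compared with an IP lookup.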
Ethernet Switch
• Look up frame DA in forwarding table.
• If known, forward to correct port.
• If unknown, broadcast to all ports.
• Learn SA of incoming frame.
• Forward frame to outgoing interface.
• Transmit frame onto link.
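A minimal sketch of the learning behaviour listed above, assuming MAC addresses are plain strings and ports are small integers. The class name `LearningSwitch` is hypothetical. Two details go beyond the slide and are assumptions: flooding skips the port the frame arrived on, and a frame whose destination sits on its arrival port is filtered.

```python
class LearningSwitch:
    def __init__(self, num_ports):
        self.num_ports = num_ports
        self.table = {}  # MAC address -> port on which it was last seen

    def receive(self, in_port, src, dst):
        """Return the list of ports the frame should be sent out of."""
        out_port = self.table.get(dst)   # look up DA
        self.table[src] = in_port        # learn SA of the incoming frame
        if out_port is None:
            # Unknown destination: flood (assumption: not back out of in_port).
            return [p for p in range(self.num_ports) if p != in_port]
        if out_port == in_port:
            return []                    # assumption: filter local traffic
        return [out_port]
```

For example, on a four-port switch the first frame from "aa" to "bb" is flooded, and the reply from "bb" to "aa" then goes out of a single port because "aa" has already been learned.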
IP Router
• Look up packet DA in forwarding table.
• If known, forward to correct port.
• If unknown, drop packet.
• Decrement TTL, update header checksum.
• Forward packet to outgoing interface.
• Transmit packet onto link.
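The IP router steps above can be sketched for a bare 20-byte IPv4 header as follows. The table contents and function names are hypothetical. The slide only says "look up"; the longest-prefix match used here is the usual way an IP lookup is done and is stated as an assumption, as is dropping packets whose TTL would reach zero. The checksum is recomputed from scratch to keep the sketch short; an incremental update is also possible.

```python
import ipaddress

# Hypothetical forwarding table (prefix -> output port); not from the slides.
FORWARDING_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): 1,
    ipaddress.ip_network("10.1.0.0/16"): 2,
}


def lookup(dst, table=FORWARDING_TABLE):
    """Longest-prefix match on the destination address; None if unknown."""
    matches = [net for net in table if dst in net]
    if not matches:
        return None
    return table[max(matches, key=lambda net: net.prefixlen)]


def header_checksum(header):
    """16-bit one's-complement checksum over an even-length header."""
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def forward(header, table=FORWARDING_TABLE):
    """Process an IPv4 header; return (out_port, new_header) or None to drop."""
    dst = ipaddress.ip_address(bytes(header[16:20]))
    out_port = lookup(dst, table)
    if out_port is None:
        return None  # unknown destination: drop packet
    if header[8] <= 1:
        return None  # assumption: also drop when the TTL would reach zero
    new = bytearray(header)
    new[8] -= 1                   # decrement TTL
    new[10:12] = b"\x00\x00"      # zero the checksum field, then recompute it
    new[10:12] = header_checksum(new).to_bytes(2, "big")
    return out_port, bytes(new)
```

Running `header_checksum` over a header that already carries a correct checksum yields 0, which gives a simple way to check the updated header.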
Introduction: What is a Packet Switch?
• The Evolution of IP Routers
First Generation Packet Switches
[Figure: a CPU with buffer memory and several line interfaces, each with DMA and MAC, attached to a shared backplane. Annotations: "Fixed length 'DMA' blocks or cells. Reassembled on egress linecard" and "Fixed length cells or variable length packets".]
Second Generation Packet Switches
[Figure: line cards, each with DMA, MAC and local buffer memory, alongside a CPU with buffer memory.]
Third Generation Packet Switches
[Figure: line cards (MAC, local buffer memory) and a CPU card (CPU, memory, line interface) connected by a switched backplane.]
Outline
• The Memory Bandwidth Problem
Two Basic Techniques
• Shared Memory: N+N = 2N operations per cell time
• Input-queued Crossbar: 1+1 = 2 operations per cell time
Shared Memory: The Ideal
Much work has proven and made possible:
• Fairness
• Delay Guarantees
• Delay Variation Control
• Loss Guarantees
• Statistical Guarantees
A Comparison: Memory speeds for 32x32 switch

             Shared-Memory                        Input-queued
Line Rate    Memory BW    Access Time Per Cell    Memory BW    Access Time
100 Mb/s     6.4 Gb/s     80 ns                   200 Mb/s     2.12 µs
1 Gb/s       64 Gb/s      8 ns                    2 Gb/s       212 ns
2.5 Gb/s     160 Gb/s     3.2 ns                  5 Gb/s       84.8 ns
10 Gb/s      640 Gb/s     0.8 ns                  20 Gb/s      21.2 ns
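The comparison follows from the operation counts: a shared memory must absorb N writes and N reads per cell time (bandwidth 2NR), while each input-queue memory sees one write and one read (bandwidth 2R). The sketch below reproduces the bandwidth figures for N = 32. The function names are hypothetical. The cell sizes are an inference rather than something the slide states: the shared-memory access times are consistent with a 512-bit (64-byte) cell, and the input-queued ones with a 424-bit (53-byte ATM) cell.

```python
def shared_memory_bw(n_ports, line_rate):
    """N writes + N reads per cell time: 2 * N * R bits per second."""
    return 2 * n_ports * line_rate


def input_queued_bw(line_rate):
    """One write + one read per cell time: 2 * R bits per second."""
    return 2 * line_rate


def access_time(cell_bits, memory_bw):
    """Time available per cell-sized memory operation, in seconds."""
    return cell_bits / memory_bw


for rate in (100e6, 1e9, 2.5e9, 10e9):
    sm = shared_memory_bw(32, rate)
    iq = input_queued_bw(rate)
    print(
        f"{rate / 1e9:5.1f} Gb/s line rate: "
        f"shared {sm / 1e9:6.1f} Gb/s, {access_time(512, sm) * 1e9:5.1f} ns | "
        f"input-queued {iq / 1e9:5.1f} Gb/s, {access_time(424, iq) * 1e9:7.1f} ns"
    )
```

The factor of N between the two bandwidth columns is the memory bandwidth problem that the rest of the talk sets out to reduce.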
Buffer Memory: How Fast Can I Make a Packet Buffer?
[Figure: a 5 ns SRAM buffer memory with a 64-byte wide bus in and a 64-byte wide bus out.]
Rough Estimate:
• 5 ns per memory operation.
• Two memory operations per packet (one write, one read).
• Therefore, maximum 51.2 Gb/s (64 bytes every 10 ns).
• In practice, closer to 40 Gb/s.
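The rough estimate can be checked with a few lines of arithmetic. The function name is hypothetical, and the sketch assumes each memory operation moves one full 64-byte bus word.

```python
def buffer_throughput(bus_bytes, op_time_s, ops_per_packet=2):
    """Peak packet throughput of a buffer memory, in bits per second.

    Each packet costs `ops_per_packet` memory operations (a write on
    arrival and a read on departure), each moving `bus_bytes` bytes.
    """
    return (bus_bytes * 8) / (ops_per_packet * op_time_s)


peak = buffer_throughput(bus_bytes=64, op_time_s=5e-9)
print(f"{peak / 1e9:.1f} Gb/s")
```

With a 64-byte bus and 5 ns per operation this gives the slide's 51.2 Gb/s ceiling; widening the bus or shortening the access time are the only two levers in this model.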
Buffer Memory: Is It Going to Get Better?
[Figure: two trends plotted against time: Specmarks, memory size and gate density; and memory bandwidth (to core).]
Progression
• Shared Memory
• Input Queued
• Combined Input and Output Queued
• Multi stage
• Parallel Packet Switches
[Figure: a Batcher sorter followed by a self-routing network, sorting eight cells by 3-bit output address (000 to 111).]
Outline
• Input-Queued Switches: Reducing memory bandwidth requirements
Input Queueing
• Memory b/w = 2R
[Figure: an input-queued switch with Data In, Data Out, and a Scheduler that supplies the configuration.]
Input Queueing: Head of Line Blocking
[Figure: delay versus load; the load axis is marked at 58.6% and at 100%.]
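The 58.6% mark matches the well-known saturation throughput of FIFO input queueing under uniform random traffic, which tends to 2 − √2 ≈ 0.586 as the number of ports grows. The simulation below is a rough sketch of that effect. It assumes every input is always backlogged, destinations are independent and uniform, and each output picks one contending input at random; the function name and parameters are hypothetical.

```python
import math
import random

HOL_LIMIT = 2 - math.sqrt(2)  # asymptotic saturation throughput, about 0.586


def hol_saturation_throughput(n_ports, n_slots, seed=0):
    """Fraction of output capacity used when FIFO inputs are saturated."""
    rng = random.Random(seed)
    # Destination of the head-of-line cell at each (always backlogged) input.
    hol = [rng.randrange(n_ports) for _ in range(n_ports)]
    served = 0
    for _ in range(n_slots):
        # Group inputs by the output their head-of-line cell wants.
        contenders = {}
        for inp, out in enumerate(hol):
            contenders.setdefault(out, []).append(inp)
        # Each requested output serves exactly one contender; the losers
        # stay blocked, along with every cell queued behind them.
        for inputs in contenders.values():
            winner = rng.choice(inputs)
            hol[winner] = rng.randrange(n_ports)  # next cell reaches the head
            served += 1
    return served / (n_ports * n_slots)


print(hol_saturation_throughput(n_ports=32, n_slots=2000), HOL_LIMIT)
```

For a finite number of ports the measured value sits slightly above the limit and approaches it as the switch grows.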
Input Queues: Virtual Output Queues
[Figure: delay versus load; the load axis is marked at 100%.]
Proof by Lyapunov function
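With virtual output queues, each input keeps a separate queue per output, so a cell waiting for a busy output no longer blocks the cells behind it. The sketch below shows the data structure and one scheduling step. The class and method names are hypothetical. The greedy longest-queue-first matching is an illustrative stand-in chosen for brevity: the slide does not name a scheduling algorithm, and this one is not claimed to be the algorithm covered by the Lyapunov proof.

```python
from collections import deque


class VOQSwitch:
    def __init__(self, n):
        self.n = n
        # voq[i][j] holds the cells at input i destined for output j.
        self.voq = [[deque() for _ in range(n)] for _ in range(n)]

    def enqueue(self, inp, out, cell):
        self.voq[inp][out].append(cell)

    def schedule(self):
        """Greedy matching, longest queue first. Returns {input: output}."""
        requests = sorted(
            (
                (len(self.voq[i][j]), i, j)
                for i in range(self.n)
                for j in range(self.n)
                if self.voq[i][j]
            ),
            reverse=True,
        )
        match, used_outputs = {}, set()
        for _, i, j in requests:
            if i not in match and j not in used_outputs:
                match[i] = j
                used_outputs.add(j)
        return match

    def transfer(self):
        """Move one cell per matched pair; returns {output: cell}."""
        return {j: self.voq[i][j].popleft() for i, j in self.schedule().items()}
```

If inputs 0 and 1 both hold a cell for output 0 and input 1 also holds a cell for output 1, both outputs are served in the same cell time, which a single FIFO per input could not guarantee.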
Outline
• Combined Input-Output Queued Switches: Making input-queued switches useful
The Speedup Problem
Find a compromise: 1 < Speedup << N
• to get the performance of a shared memory switch
• at close to the cost of an IQ switch
Some Early Approaches
Probabilistic Analyses
• assume traffic models (Bernoulli, Markov-modulated, non-uniform loading, “friendly correlated”)
• obtain mean throughput and delays, bounds on tails
• analyze different fabrics (crossbar, multistage, etc.)
Numerical Methods
• use actual and simulated traffic traces
• run different algorithms
• set the “speedup dial” at various values
The findings
Very tantalizing ...
• under different settings (traffic, loading, algorithm, etc.)
• and even for varying switch sizes
A speedup of between 2 and 5 was sufficient!
Using Speedup
The Ideal Solution
Combined Input-Output Queued Switch (ports 1 to N) = ? Output Queued Switch (ports 1 to N)
Interesting Result Theorem: For a switch with combined input and output queueing to exactly mimic an output queued switch, for all types of traffic, a speedup of 2-1/N is necessary and sufficient. Joint work with Balaji Prabhakar, Ashish Goel and Shang-tse Chuang.
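To see how the bound in the theorem behaves, the snippet below evaluates 2 − 1/N for a few switch sizes. The function name is hypothetical; exact fractions are used so the values are not blurred by rounding.

```python
from fractions import Fraction


def cioq_speedup(n_ports):
    """Speedup 2 - 1/N stated in the theorem, as an exact fraction."""
    return 2 - Fraction(1, n_ports)


for n in (2, 8, 32, 1024):
    print(n, cioq_speedup(n), float(cioq_speedup(n)))
```

The value stays below 2 however large the switch becomes, in contrast to the factor of N that an output queued switch of the same size requires.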
Outline
• Parallel Packet Switches: Further reducing memory bandwidth requirements
Optical Physical Layers… …are Going to Make Things “Worse”
DWDM:
• More λ’s per fiber → more ports per switch.
• # ports: 16, …, 1000’s.
Data rate:
• More b/s per λ → higher capacity.
• Data rates: 2.5 Gb/s, 10 Gb/s, 40 Gb/s, 160 Gb/s, …
Approach #1: Ping-pong Buffering
[Figure: two buffer memories, each on a 64-byte wide bus.]
Memory bandwidth doubled to ~80 Gb/s
Approach #2: Multiple Parallel Buffers (aka Banking, Interleaving)
[Figure: four buffer memories in parallel.]
The Fork Join Router
[Figure: N input ports and N output ports, each at rate R, connected through k parallel routers (Router 1 to Router k); the fork and join stages are bufferless.]
The Fork Join Router
• Advantages
  • k ↑ → memory bandwidth ↓
  • k ↑ → lookup/classification rate ↓
  • k ↑ → routing/classification table size ↓
• Problems
  • How to demultiplex prior to lookup/classification?
  • How does the system perform/behave?
  • Can we predict/guarantee performance?
A Parallel Packet Switch
[Figure: N input ports and N output ports, each at rate R, connected through k parallel output queued switches.]
Parallel Packet Switch: Questions
• Can it be work-conserving?
• Can it emulate a single big shared memory switch?
• Can it support delay guarantees, strict-priorities, WFQ, …?
Parallel Packet Switch: Work Conservation
[Figure: an input port and an output port, each at rate R, connected to each of the k layers by links of rate R/k; numbered cells are spread across the layers. Labels: Input Link Constraint, Output Link Constraint.]
Parallel Packet Switch: Work Conservation
[Figure: N input ports and N output ports, each at rate R, connected to k parallel output queued switches by internal links of rate S(R/k).]
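One way to read the input link constraint: an internal link running at S(R/k) needs k/S external cell times to carry one cell, so a demultiplexer cannot hand the same layer a second cell until that time has passed. The sketch below enforces only this constraint for a single input. The class name and the least-recently-used choice of layer are assumptions for illustration. It does not model the output link constraint or any destination-aware layer selection that a complete parallel packet switch would need.

```python
import math


class Demultiplexer:
    """Spreads cells from one external input (rate R) over k layers."""

    def __init__(self, k, speedup=1):
        self.k = k
        # External cell times for which an internal link at S*(R/k) stays busy.
        self.busy_slots = math.ceil(k / speedup)
        # First slot at which each internal link is free again.
        self.free_at = [0] * k

    def dispatch(self, slot):
        """Choose a layer for the cell arriving in `slot`; return its index."""
        candidates = [l for l in range(self.k) if self.free_at[l] <= slot]
        if not candidates:
            raise RuntimeError("input link constraint cannot be met")
        # Assumption: prefer the layer whose link has been idle the longest.
        layer = min(candidates, key=lambda l: self.free_at[l])
        self.free_at[layer] = slot + self.busy_slots
        return layer
```

With no speedup the only feasible pattern for back-to-back arrivals is a strict rotation over the k layers; a speedup above 1 leaves several free layers in each slot, which gives the demultiplexer room to account for other constraints.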