Packet Switching • Instructor: Rob Nash • Readings: Chapter 3, P&D
To (Inter)Connect Two Nets • We have a limited number of hosts so far • Also, a limited geographical distance • As broadcast can only take us so far • We can connect two distant nodes (or networks) via point-to-point connections • But we don’t service any nodes in between • We’d like to build a global network, so we must consider hosts that aren’t directly connected.
Motivation • “Nature seems […] to reach many of her ends by long circuitous routes.” – Rudolph Lotze • “Packets are able to reach many different ends by (sometimes) long circuitous routes” • But imagine this dilemma for a second: • How are packets able to navigate an unknown topology? • Ether is simple: send to everybody, but again doesn’t scale
Borrowing from Telephony • Your phone isn’t directly connected to all other phone users • Rather, you’re connected to a switch • An operator will provide the “directly connected” illusion by configuring a (temporary) link for use in the call • In the same vein, computer networks have packet switches • For use in forwarding/switching packets • Routing is the process of building a forwarding table (4)
Switch Categorizations • Very broadly defined here as either: • Connection-oriented: Like a telephone call, with temporary state stored at each switch • X.25 • ATM • Connectionless: Like the postal service, with even less recourse for problems (no RTS, etc.) • IP, UDP • Also, we’ll focus on two specific examples of switching • Ethernet & ATM
A Bit on Terminology • Forwarding is a table lookup • Given the input port and ID, what is the output port and outgoing ID? • Routing is the algorithm that builds the table • A distributed algorithm by nature of the domain • Should be fair • Consider offering a QoS • This has evolved over the history of networks • LAN Switching is an evolution of Ethernet Bridging with performance augmentations
Switched Networks • [Figure: a switch with T3 and STS-1 links on its input ports and output ports] • Switch Function: • Connects two or more network segments • Forwards packets from input port to output port • Selects a port based on address in packet header • CSS432: Switching and Forwarding
Switched Network Advantages • Covers a large geographic area (> 2500m in Ethernet) • Support large numbers of hosts (>1024 hosts in Ethernet) • Maintain performance (>two packets through a switch) • And for n input ports each with buffer b, we can provide n x b queuing simultaneously • Contrast this to Ethernet, where two hosts will compete for the line
Topologies • Point-to-Point • Ethernet MAC • Rings • A switch adds the star topology to our set • Also, the ability to interconnect any of the above networking technologies • As switches may be connected to hosts, or other switches
Switching in General • Switched networks are more scalable than shared-media networks • Directly due to their ability to support many hosts at full speed (limited by memory capacity) • And, we can use a switch to combine two disparate networks • A SONET STS-3 link and a few T3s • Each port runs the appropriate link layer protocol • Switching (or forwarding): receiving incoming packets on an input port and selecting the appropriate output port on which to forward the data
Questions on this Approach? • How does a switch make its decision? • This depends on the approach {connectionless, etc} • In general, look at the header of the packet for an identifier (could be a local id, could be an IP addr) • Use this to make your decision by looking up the ID in a table, and forward accordingly • We’ll start simply with the datagram approach
Identifiers • We can provide unique identifiers to each host on the network (e.g., an address) • We also will be interested in providing identifiers to label each input and output port in a switch
Datagrams • Each packet contains enough information to enable any switch to forward it • How? Just including the complete destination address in every packet. • Each switch will use the destination address as the key in the lookup • No connection state (thus no setup) • All packets are forwarded independently • Node failure and reroute is possible
Datagram Switching (P.4) • [Figure: Hosts A through H attached to Switches 1, 2, and 3, with switch ports numbered 0 to 3; the accompanying forwarding table is for Switch 2]
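In the datagram approach, forwarding is a single table lookup keyed by the full destination address. A minimal sketch follows, assuming a hypothetical forwarding table for Switch 2; the slide's own table did not survive, so the port numbers below are illustrative only, and `forward` is a name made up for this example.

```python
# Hypothetical forwarding table for Switch 2: destination host -> output port.
# The port numbers are illustrative assumptions, not recovered from the slide.
SWITCH2_TABLE = {
    "A": 3, "B": 0, "C": 3, "D": 3,
    "E": 2, "F": 1, "G": 0, "H": 0,
}

def forward(table, packet):
    """Return the output port for a packet, or None if the destination is unknown."""
    # Every datagram carries its complete destination address, so the switch
    # keeps no per-connection state: one independent lookup per packet.
    return table.get(packet["dst"])

packets = [{"dst": "E", "payload": b"hi"}, {"dst": "A", "payload": b"yo"}]
ports = [forward(SWITCH2_TABLE, p) for p in packets]
print(ports)
```

Because each lookup is independent, two packets to the same destination may take different paths if the table changes between them.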
Forwarding & Routing • In a simple and static environment, one network operator may know the topology • And, manually install this in switches in the network • In a distributed and dynamic environment, no one operator knows the complete topology • Multiple pathways, failing nodes, etc. • This harder problem is routing (Section 4.2) • For now: routing is an assumed background process, and forwarding is a simple lookup
Connectionless Minimalist • Hosts can send packets at any time (and to anywhere) • No setup or teardown • All switches can immediately forward this packet, assuming a correct routing table • Hosts don’t know (or care) about the health of the intermediary network or destination node • You could send a packet to a machine that just lost power • Or, you could send a packet through a network whose switches just lost power • Failures may not catastrophically affect communications if alternate routes exist around failed nodes (and the network updates its tables)
Virtual Circuit Switching • A connection-oriented approach • With a setup, communicate, and teardown phase • This may seem like TCP over IP, but as we’ll see, TCP is implemented on top of the connectionless approach • Setup: establishing connection state and path through the network • Each subsequent packet will follow this path • Forwarding tables use VCIs (Virtual Circuit Identifiers) that help uniquely identify connections at a local switch
Virtual Circuit Switching (1,4) • [Figure: Host A reaches Host B through Switches 1, 2, and 3, with switch ports numbered 0 to 3; the links along the circuit are labeled VCI = 5, VCI = 11, VCI = 7, and VCI = 4; VC tables are shown for Switch 1, Switch 2, and Switch 3] • Each switch maintains a VC table • The Input Port & VCI uniquely determine a connection
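The per-switch VC table can be sketched as a map from (input port, incoming VCI) to (output port, outgoing VCI). The VCIs 5, 11, 7, and 4 come from the figure; the port numbers, the `LINKS` wiring, and the names `VC_TABLES` and `trace` are assumptions made for illustration.

```python
# VC table per switch: (input port, incoming VCI) -> (output port, outgoing VCI).
# VCIs follow the figure's labels; the port numbers are assumed.
VC_TABLES = {
    "S1": {(2, 5): (1, 11)},
    "S2": {(3, 11): (2, 7)},
    "S3": {(0, 7): (1, 4)},
}

# Assumed wiring: (switch, output port) -> (next switch, its input port).
LINKS = {("S1", 1): ("S2", 3), ("S2", 2): ("S3", 0)}

def trace(switch, in_port, vci):
    """Follow one packet along the circuit, rewriting its VCI at every hop."""
    hops = []
    while True:
        out_port, out_vci = VC_TABLES[switch][(in_port, vci)]
        hops.append((switch, in_port, vci, out_port, out_vci))
        nxt = LINKS.get((switch, out_port))
        if nxt is None:              # no further switch: the packet reaches the host
            return hops, out_vci
        switch, in_port = nxt
        vci = out_vci                # the outgoing VCI is the next hop's incoming VCI

hops, final_vci = trace("S1", 2, 5)
for hop in hops:
    print(hop)
```

Note that a VCI only has meaning on one link; the same number can be reused elsewhere in the network without ambiguity.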
Configuring VCs • PVCs – “permanent Virtual Circuits”, which are long-lived (or network operator configured) table entries • Signaled: a host may set up or delete a VC dynamically and autonomously
A Note on Setup • Oracle: How do switches know what outgoing VCI they should use? • This data is literally downstream of the current switch! • Answer: We fill this data in “in reverse”, after we’ve built a path from A to B. • Then, a setup/connection packet from B to A is sent informing each upstream hop of the VCI it should use • We signal to set up (reserve a VCI entry) and signal to reclaim these resources when done
Implications for VC Switching • At least one RTT delay before any payload is communicated… • Why? • Setup packets differ from payload packets • Since setup contains the full GuID for the destination • So, per-packet overhead is reduced relative to the datagram approach • When we do get to send data, much network topology is known in advance • There is a receiver and route to that receiver, and the receiver is ready to accept data
X.25 • Resources are reserved in advance to avoid contention • SWP is used in between node pairs along the circuit • Flow control is used to prevent congestion, and new circuits are declined if a switch does not have enough resources
Intro to X.25 • Popular with telephony companies in the 80s • Physical medium: POTS links or ISDN • ISDN integrates speech and data on the same line • Pre-DSL • From Wiki: • “X.25 is today to a large extent replaced by less complex protocols, especially the Internet protocol (IP)”
Comparing the Approaches • We see the datagram approach is minimal and doesn’t reserve resources in advance • But, it also cannot make the same guarantee that X.25 can • We can implement a QoS concept using the connection model, as we set the service level per connection • QoS here: a performance or resource guarantee • My packets shouldn’t be delayed (queued) too long • My packets will always be accommodated at each switch
Virtual Circuits in Action • Frame Relay is used for VPN construction (4.1.8) • ATM is used to link telephony systems across wide areas in a point-to-point configuration
LAN Switches & Ethernet Bridges • Consider a pair of Ethernets you’d like to connect • We could just place a “repeater” terminal that collects all packets on one net and broadcasts them to the other • Shout louder! • This forms an extended LAN • The simplest version does no optimization • Note that a “bridge” here could be a host, but it meets our definition of a switch.
Switching Performance • Consider a shared-media example • Consider the star topology offered by switching • Note that each host has its own dedicated link • In the MAC example, link contention is an issue • In the switching example, I can send as much as the switch can forward (or buffer) on my own link
Bridges and Extended LANs • [Figure: a bridge joining two LANs; hosts A, B, C on Port 1 and hosts X, Y, Z on Port 2] • Connecting two or more LANs • Repeater • L1: Physical Layer • Limitations: <= 2500m and <= 1024 nodes • Bridge (or LAN switch) • L2: Link Layer • No physical limitations • Forwarding frames using MAC address • Static configuration + partial dynamic configuration (Spanning Tree Protocol) • Router • L3: Network Layer • Routing IP packets using IP address • Dynamic configuration
Learning Bridges • [Figure: a bridge joining two LANs; hosts A, B, C on Port 1 and hosts X, Y, Z on Port 2] • Based on datagram switching • Do not forward when unnecessary • Ex. A frame sent from A to B • Maintain forwarding table (Host → Port): A → 1, B → 1, C → 1, X → 2, Y → 2, Z → 2 • Learn table entries based on source address • Ex. An entry for A is registered upon receiving a frame from A • Ex. When receiving a frame from B, don’t forward to Port 2 • Table is an optimization; need not be complete • Entries are expired after a specific period of time
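The learning behavior above fits in a few lines. This is a sketch under assumptions: the `LearningBridge` class, its `ttl` ageing time, and the explicit `now` argument (used so the example is deterministic) are all invented for illustration.

```python
import time

class LearningBridge:
    def __init__(self, ports, ttl=300.0):
        self.ports = set(ports)
        self.ttl = ttl            # assumed ageing time, in seconds
        self.table = {}           # host address -> (port, time learned)

    def receive(self, src, dst, in_port, now=None):
        """Return the set of ports on which the frame should be forwarded."""
        now = time.monotonic() if now is None else now
        self.table[src] = (in_port, now)          # learn from the source address
        entry = self.table.get(dst)
        if entry is not None and now - entry[1] > self.ttl:
            del self.table[dst]                   # expired: treat as unknown
            entry = None
        if entry is None:
            return self.ports - {in_port}         # unknown destination: flood
        out_port = entry[0]
        # Destination is on the arrival segment: no need to forward at all.
        return set() if out_port == in_port else {out_port}

bridge = LearningBridge(ports=[1, 2])
r1 = bridge.receive("A", "B", in_port=1, now=0)     # B unknown: flood to port 2
r2 = bridge.receive("B", "A", in_port=1, now=1)     # A known on port 1: drop
r3 = bridge.receive("X", "A", in_port=2, now=2)     # A known on port 1: forward
r4 = bridge.receive("A", "X", in_port=1, now=1000)  # X's entry has aged out: flood
print(r1, r2, r3, r4)
```

Since an incomplete table only causes extra flooding, never misdelivery, the table really is just an optimization.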
Network Growth • How could a network come to have cycles in it? • Perhaps it’s a multi-site distributed net where no one administrator knows the complete topology • Introduced by accident? • More likely: introduced for redundancy! • However, Learning Bridges can fail if a cycle exists, so we need a strategy to address graph cycles.
Spanning Tree Protocol (Link) • Algorithm deactivates ports to remove cycles • The spanning tree determines which bridges to use, and which bridges should “sit out” • Note that a bridge may forward on some ports, but not others • Formalized in the IEEE 802.1 Specification • Bridges adopt this distributed algorithm (as we’ll see) • Concept: remove edges from your graph until no cycles exist (the tree is a subset of the graph) • Oddity: vertices in this graph are both hosts and switches
Spanning Further • When the network has settled, certain bridges will be designated to forward packets over their IO ports based on their distance to the root (or ID number if a tie) • Other bridges or ports will simply be disabled • Each bridge decides the ports over which it will and will not forward frames
Spanning Algorithm • Elect the smallest ID as the root • Roots always forward over all ports • Each bridge computes the distance between it and the root • Usually a per-hop count • Trades this information with its neighbors, keeping track of “best” paths • I.e., shortest hop count in this context • Bridges that offer the best paths become designated • Finally all bridges determine the root feeder, which is the only bridge that forwards to the root • Chosen so it is closest to the root
STP Overview • [Figure: extended LAN with LANs A through K joined by bridges B1 through B7; B1 is marked as the root, hop counts to the root (1 hop or 2 hops) are annotated, and ties are resolved as B5 < B7 and B4 < B6] • Each bridge has unique id (e.g., B1, B2, B3) • Select a bridge with smallest id as root • Select a bridge on each LAN closest to root as designated bridge (use id to break ties) • Each bridge forwards frames over each LAN if it is a designated bridge
STP Details (use p. 191) • [Figure: the same extended LAN (LANs A through K, bridges B1 through B7), annotated with example configuration messages (1, 0, 1), (1, 1, 2), and (1, 1, 5)] • Bridges exchange configuration messages (Y, d, X) • Y: the id of the root-to-be (the bridge the sender believes is the root) • d: #hops from X to Y • X: the sending bridge id • Initially, each bridge believes it is the root • When a bridge learns it is not the root, it stops generating configuration messages • In steady state, only the root generates messages • When a bridge learns it is not a designated bridge, it stops forwarding configuration messages • In steady state, only designated bridges forward configuration messages • If any bridge does not receive a configuration message after a period of time, it starts generating configuration messages claiming to be the root.
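How a bridge decides whether an incoming (Y, d, X) message is "better" than what it already knows can be sketched with tuple ordering: smaller root id first, then shorter distance, then smaller sender id. The `Config` and `Bridge` names are hypothetical, and the sketch tracks only one bridge's view, not per-port state or timers.

```python
from typing import NamedTuple

class Config(NamedTuple):
    root: int      # Y: id of the bridge believed to be the root
    dist: int      # d: hops from the sender to that root
    sender: int    # X: id of the sending bridge

class Bridge:
    def __init__(self, bridge_id):
        self.id = bridge_id
        self.best = Config(bridge_id, 0, bridge_id)   # initially claims to be root

    def is_root(self):
        return self.best.root == self.id

    def receive(self, msg):
        """Adopt msg if it beats the current best; return the message to relay, if any."""
        # Tuple comparison implements the rule: smaller root id wins, then
        # shorter distance, then smaller sender id as the tie-breaker.
        if msg < self.best:
            self.best = msg
            return Config(msg.root, msg.dist + 1, self.id)
        return None

b3 = Bridge(3)
out1 = b3.receive(Config(2, 0, 2))   # B2 claims root: better than B3's own claim
out2 = b3.receive(Config(1, 0, 1))   # B1 claims root: better still
out3 = b3.receive(Config(1, 1, 2))   # same root, but farther away: ignored
print(out1, out2, out3, b3.is_root())
```

A fuller version would keep the best message per port so the bridge can also decide which ports to stop forwarding on.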
Bridge Limitations • STP: • It won’t forward frames over alternative paths for the sake of: • Routing around a congested bridge • Routing along a shorter path like one from a node on B to another node on K • Scales linearly, and uses broadcast mechanism • Bridges in general: • Not scalable (“tens of”) • STP • Broadcast (forwarding all broadcast/multicast frames in the current practice) • Homogeneous networks only (uses network’s frame header) • Ethernet to Ethernet • Token ring to Token ring • ATM to ATM • Idea: Partition LANs using coloring/tiling to limit the number of network segments that will broadcast
Don’t Expect a Single LAN • “It is never safe to design network software under the assumption that it will run over a single Ethernet segment.” • “Bridges happen.” • Drop frames if congested (rare on Ethernet alone) • Frames could be reordered in an extended LAN • Not in a singular Ethernet segment
Switch Implementation & Perf • Many ways to build economy & high-end switches • More advanced fabrics are implemented in high-end (core) switches • The high level concepts overlap, however • One idea: Get a box and a few NICs (DMA) • Not a bad experimentation setup for new protocols • Or cross-protocol examination • Not so hot for performance • Another idea: Custom Hardware • A shared-memory switch • memory with dual ports • Crossbar switch • Switches that attempt to self-route (3.4-3.5, Batcher & Banyan)
Workstation as a Switch (33-34) • [Figure: a workstation with a CPU, main memory, and an I/O controller; NICs for LAN A, LAN B, and LAN C share the I/O bus] • Advantage: flexible because a workstation has a CPU. • Example • 33MHz 32bit I/O bus • 1Gbps one way from NIC to main memory (33MHz x 32 bits) • 500Mbps for a round trip between NIC and main memory (each packet crosses the bus twice) • Enough to support five 100Mbps Ethernets • What if packets are very small, like 64 bytes? • Suppose the workstation can process 500,000 packets per second (pps). • Throughput: 500,000 x 64 x 8 = 256Mbps
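The arithmetic on this slide can be checked directly. The variable names are made up for the example; the figures (33MHz, 32 bits, 100Mbps, 500,000 pps, 64 bytes) are the slide's own.

```python
BUS_CLOCK_HZ = 33_000_000
BUS_WIDTH_BITS = 32
ETHERNET_BPS = 100_000_000

# One transfer per clock cycle: roughly 1 Gbps in one direction.
one_way_bps = BUS_CLOCK_HZ * BUS_WIDTH_BITS

# Each switched packet crosses the bus twice (NIC -> memory -> NIC).
round_trip_bps = one_way_bps // 2

# How many 100 Mbps Ethernets that can keep busy.
ethernet_links = round_trip_bps // ETHERNET_BPS

# With small packets, per-packet processing becomes the limit instead.
PACKETS_PER_SECOND = 500_000
PACKET_BYTES = 64
small_packet_bps = PACKETS_PER_SECOND * PACKET_BYTES * 8

print(one_way_bps, round_trip_bps, ethernet_links, small_packet_bps)
```

So with 64-byte packets the packet rate, not the bus, caps throughput at about half of what the bus could carry.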
Shared Bus/Memory-Based Switch • [Figure: a control processor and input/output port pairs attached to a shared bus and shared memory, with DMA from port to port] • A simple design • Shared bus or memory becomes a bottleneck. (Max. 16 bus masters)
Crossbar Switch • Any input can be delivered to any output; without a collision, all inputs are delivered at once • All inputs may go to the same output, which causes a collision in the output buffer.
ATM is Cell Switching • Connection-oriented packet switching • Uses signaling (Protocol Q.2931) • WAN, but more recently LANs • Runs on various physical media • SONET • Shared media such as wireless • Shared media like Ethernet (with LANE) • Packets are called cells, which are fixed length (48 + 5 bytes)
A Quick Hardware Analogy • LAN packets vs. ATM cells • Consider also CISC vs. RISC • In this light, certain features of ATM shine • Observations for a short and simple approach: • It’s easier to build HW to do simple (short) jobs • The processing of data is simpler when fixed format • RISC ISA commonly has only a few instruction formats • Off topic: 802.5 & DEC-Intel-Xerox Ethernet standard • Meaning: Compatibility can be simpler with a common format • Simple and short data {frames, instructions} can often be “trained” or “pipelined”
Analogy Further • Observation: homogeneous packet length lends itself to homogeneous switching structures • Short and uniform structures can make the task of exploiting parallelism easier • Either at the hardware level • See simultaneous multithreading • Or along the protocol stack (simultaneous packet processing, self-organizing streams, etc.) • Uniformity at higher levels tends to promote uniform hardware designs • Since this is not custom, often cheaper to build this fast, scalable hardware
Finally… • Fixed length instructions help to align, fetch, prefetch, optimize, synchronize, reorder etc. • See the original 360 and Robert Tomasulo • Variable length instructions are more complex by design, • possibly requiring multiple cycles to fetch a longer instruction • And/or more trips across the bus to and from memory • All said and done, Ethernet LANs are just as convincing due to their speed, cost, success & adoption rate
ATM Features • Error detection is implemented at endpoints • End-to-end but not at each switch (i.e., at data link layer) • Congestion control • Admission control • If switches are completely reserved, decline connections
Framing in ATM • Fixed-size cells can make this easier • One Approach: use some SONET overhead to point to the start of the cell in the payload • Another Approach: CRC every 5 bytes • If you see no error, you’re likely at an ATM header • Repeat this approach looking for the same results every 53 bytes • See p.199 for the frame format
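The header-check approach to finding cell boundaries can be sketched as below. The slide only says "CRC every 5 bytes"; the specific check used here (CRC-8 with polynomial x^8 + x^2 + x + 1 over the first four header bytes, XORed with 0x55) is my understanding of ATM's header error control field and should be treated as an assumption, as should the `confirm` count and the function names.

```python
CELL_SIZE = 53   # 5-byte header + 48-byte payload

def hec(header4):
    """CRC-8 (x^8 + x^2 + x + 1) over four header bytes, XORed with 0x55."""
    crc = 0
    for byte in header4:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc ^ 0x55

def find_cell_start(stream, confirm=3):
    """Return the first offset where `confirm` cells in a row have a valid header check."""
    for offset in range(CELL_SIZE):
        positions = [offset + k * CELL_SIZE for k in range(confirm)]
        if positions[-1] + 5 > len(stream):
            break                                  # not enough data left to confirm
        # One match could be luck; matches every 53 bytes almost certainly are not.
        if all(hec(stream[p:p + 4]) == stream[p + 4] for p in positions):
            return offset
    return None

header = bytes([0, 0, 0, 0x10])
cell = header + bytes([hec(header)]) + bytes(48)
stream = bytes(7) + cell * 4        # 7 bytes of leading noise, then 4 cells
start = find_cell_start(stream)
print(start)
```

The 0x55 offset means an all-zero run never looks like a valid header, which keeps idle or zeroed payload from producing false matches.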
ATM & LAN Bridging • Not exhaustive • ATM offers QoS features • ATM offers flow control, LANs are “best effort” • ATM is conservative resource-wise • Connectionless protocols are minimalist • ATM can guarantee resources ahead of time • Useful esp. for voice-grade guarantees • Fixed length vs. variable length packets • No broadcast (natively) vs. only broadcast
Segmentation & Reassembly • Layers were built on top of ATM to support other styles of networks and services • AAL 1-2 is for voice grade guaranteed bit rates • AAL 3-4 is for packet data over ATM • This requires S&R, since MTU for Ethernet >> 53B
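A bare-bones view of segmentation and reassembly: cut the packet into 48-byte payloads, pad the last one, and glue them back together at the far end. This is a simplified sketch, not any particular AAL; real adaptation layers carry length and sequencing information in their own headers or trailers, whereas here the original length is simply passed alongside (an assumption), and `segment` and `reassemble` are made-up names.

```python
import math

CELL_PAYLOAD = 48   # bytes of payload per 53-byte cell

def segment(packet):
    """Split a packet into 48-byte cell payloads, zero-padding the last one."""
    n_cells = math.ceil(len(packet) / CELL_PAYLOAD)
    padded = packet.ljust(n_cells * CELL_PAYLOAD, b"\x00")
    return [padded[i:i + CELL_PAYLOAD] for i in range(0, len(padded), CELL_PAYLOAD)]

def reassemble(cells, original_length):
    """Concatenate payloads and strip the padding using the known length."""
    return b"".join(cells)[:original_length]

packet = bytes(range(256)) * 5 + bytes(220)   # a 1500-byte Ethernet-sized packet
cells = segment(packet)
restored = reassemble(cells, len(packet))
print(len(cells), restored == packet)
```

A 1500-byte packet needs 32 cells, so the 5-byte header per cell adds 160 bytes of overhead before any padding is counted.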
Switch Congestion • When packets are being discarded frequently due to lack of resources • Over some interval t: arrivalRate x t > sendRate x t + bufferSpace