Lecture 24: Interconnection Networks • Topics: communication latency, centralized and decentralized switches (Sections 8.1 – 8.5)
RAID Summary • RAID 1-5 can tolerate a single fault: mirroring (RAID 1) has a 100% overhead, while parity (RAID 3, 4, 5) has modest overhead • Multiple faults can be tolerated by having multiple check functions; each additional check can cost an additional disk (RAID 6) • RAID 6 and RAID 2 (memory-style ECC) are not commercially employed
I/O Performance • Throughput (bandwidth) and response time (latency) are the key performance metrics for I/O • The description of the hardware characterizes maximum throughput and average response time (usually with no queueing delays) • The description of the workload characterizes the “real” throughput; corresponding to this throughput is an average response time
Throughput vs. Response Time • As load increases, throughput increases (as utilization is high); simultaneously, response times also go up as the probability of having to wait for service goes up: there is a trade-off between throughput and response time • In systems involving human interaction, there are three relevant delays: data entry time, system response time, and think time; studies have shown that improvements in response time result in improvements in think time, giving better response time and much better throughput • Most benchmark suites try to determine throughput while placing a restriction on response times
Estimating Response Time • Queueing theory provides results that can characterize some random processes • Little’s Law: Mean number of tasks in system = Arrival rate × Mean response time • For workloads whose arrivals follow a Poisson distribution, $P(k \text{ tasks arrive in time interval } t) = e^{-a} a^k / k!$, where $a$ is the mean number of arrivals in the interval (arrival rate × $t$), and the following two results hold • $\text{Time}_{queue} = \text{Time}_{server} \times \text{server utilization} / (1 - \text{server utilization})$ • $\text{Length}_{queue} = \text{server utilization}^2 / (1 - \text{server utilization})$
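The queueing results above can be checked numerically. The following is a minimal sketch, assuming a single server whose utilization is arrival rate × average service time; the function names and the example workload (40 requests/s, 20 ms service time) are illustrative and not taken from the lecture.

```python
import math

def poisson_probability(k, rate, interval):
    """P(k tasks arrive in the interval), with a = rate * interval."""
    a = rate * interval
    return math.exp(-a) * a**k / math.factorial(k)

def queue_metrics(arrival_rate, service_time):
    """Return (utilization, time in queue, queue length) for one server."""
    utilization = arrival_rate * service_time
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    # Time_queue = Time_server x utilization / (1 - utilization)
    time_queue = service_time * utilization / (1 - utilization)
    # Length_queue = utilization^2 / (1 - utilization)
    length_queue = utilization**2 / (1 - utilization)
    return utilization, time_queue, length_queue

# Hypothetical workload: 40 requests/s, 20 ms average service time.
utilization, time_queue, length_queue = queue_metrics(40.0, 0.020)
print(f"utilization={utilization:.2f}, "
      f"time in queue={time_queue * 1000:.1f} ms, "
      f"queue length={length_queue:.2f}")
```

With these numbers the two queue formulas agree with Little's Law applied to the queue alone: the queue length equals the arrival rate times the time spent in the queue.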
Interconnect Types • [Figure: Bus, SAN, LAN, and WAN placed along a logarithmic axis (1, 10, 100, 1000, 10000) showing the number of autonomous systems connected]
Communication • Applications typically send messages through the OS • Steps to send a message: • copy data into an OS buffer • OS attaches header and trailer • OS sends transfer command to network interface hw • Steps to receive a message: • copy data from network interface hw into OS buffer • make sure data is correct, send acknowledgment • copy data into user space and signal the application • If sender does not see an ack in time, it resends the data
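The send, receive, and resend steps above can be sketched as a small simulation. This is only an illustration under stated assumptions: the fixed `HEADER` bytes, the CRC-32 trailer, the `flaky_channel_factory` helper, and treating a missing acknowledgment as a timeout are all hypothetical choices, not details from the lecture.

```python
import zlib

HEADER = b"HDR"  # hypothetical header; real headers carry addresses, lengths, etc.

def os_send(user_data):
    """Sender side: copy data into an OS buffer, attach header and trailer."""
    os_buffer = bytes(user_data)                        # copy into OS buffer
    trailer = zlib.crc32(os_buffer).to_bytes(4, "big")  # checksum as trailer
    return HEADER + os_buffer + trailer                 # handed to the network hw

def os_receive(packet):
    """Receiver side: verify the packet; return (ack, data for user space)."""
    os_buffer = bytes(packet)                           # copy from network hw
    if len(os_buffer) < len(HEADER) + 4 or not os_buffer.startswith(HEADER):
        return False, None
    payload, trailer = os_buffer[len(HEADER):-4], os_buffer[-4:]
    if zlib.crc32(payload).to_bytes(4, "big") != trailer:
        return False, None                              # corrupt: no ack is sent
    return True, payload                                # ack, copy to user space

def send_reliably(user_data, channel, max_attempts=5):
    """Resend until an ack arrives; a missing ack stands in for a timeout."""
    for attempt in range(1, max_attempts + 1):
        ack, delivered = os_receive(channel(os_send(user_data)))
        if ack:
            return attempt, delivered
    raise TimeoutError("no acknowledgment received")

def flaky_channel_factory(corrupt_first_n):
    """Hypothetical channel that flips one bit in the first n packets."""
    state = {"sent": 0}
    def channel(packet):
        state["sent"] += 1
        if state["sent"] <= corrupt_first_n:
            damaged = bytearray(packet)
            damaged[len(HEADER)] ^= 0x01                # flip one payload bit
            return bytes(damaged)
        return packet
    return channel

attempts, delivered = send_reliably(b"hello", flaky_channel_factory(2))
print(attempts, delivered)
```

Here the first two transmissions are corrupted, so the receiver withholds the acknowledgment and the sender resends until the third attempt succeeds.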
Latency • Sender overhead: the processing delays to send the application data to the network interface hardware • Time of flight: latency for the first bit to travel from sender to receiver: function of distance and speed of light • Transmission time: time between arrival of first and last bit: function of message size and network bandwidth • Transport latency: sum of time of flight and transmission time • Receiver overhead: processing delays to send data from hw to application: usually longer than sender overhead
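The components above add up to the end-to-end latency of a message. A minimal sketch of that sum follows; the function name, the parameter names, and the default assumption that signals travel at about two thirds of the speed of light are illustrative, not values from the lecture.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in vacuum

def total_latency(sender_overhead_s, receiver_overhead_s, distance_m,
                  message_bits, bandwidth_bits_per_s,
                  propagation_fraction=2 / 3):
    """Sum sender overhead, transport latency, and receiver overhead (seconds).

    propagation_fraction is an assumed ratio of signal speed to the speed
    of light in the medium.
    """
    time_of_flight = distance_m / (propagation_fraction * SPEED_OF_LIGHT_M_PER_S)
    transmission_time = message_bits / bandwidth_bits_per_s
    transport_latency = time_of_flight + transmission_time
    return sender_overhead_s + transport_latency + receiver_overhead_s

# Hypothetical example: 1000-byte message, 1 Gbit/s link, 100 m of cable,
# 50 us sender overhead, 80 us receiver overhead.
latency = total_latency(50e-6, 80e-6, 100.0, 8 * 1000, 1e9)
print(f"{latency * 1e6:.2f} us")
```

For short distances like this one, the overheads dominate the total; time of flight only becomes significant over long links.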
Topologies • Internet topologies are not very regular; they grew incrementally • Supercomputers have regular interconnect topologies and trade off cost for high bandwidth • Nodes can be connected with • centralized switch: all nodes have input and output wires going to a centralized chip that internally handles all routing • decentralized switch: each node is connected to a switch that routes data to one of a few neighbors
Centralized Crossbar Switch • [Figures: nodes P0 to P7 attached to a central crossbar switch; a second diagram of the same nodes P0 to P7]
Crossbar Properties • Assuming each node has one input and one output, a crossbar can provide maximum bandwidth: N messages can be sent as long as there are N unique sources and N unique destinations • Maximum overhead: $WN^2$ internal switches, where W is data width and N is number of nodes • To reduce overhead, use smaller switches as building blocks: trade off overhead for lower effective bandwidth
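The two crossbar properties above, conflict-free delivery for unique sources and destinations and a switch count that grows with the square of N, can be expressed directly. This is a minimal sketch; the function names and the 32-bit, 8-node example are hypothetical.

```python
def crossbar_can_route(messages):
    """Return True if all (source, destination) pairs can be sent at once.

    A crossbar with one input and one output per node only needs the
    sources to be unique and the destinations to be unique.
    """
    sources = [source for source, _ in messages]
    destinations = [destination for _, destination in messages]
    return (len(set(sources)) == len(sources)
            and len(set(destinations)) == len(destinations))

def crossbar_switch_count(width_bits, num_nodes):
    """Internal switches in a crossbar: W * N^2."""
    return width_bits * num_nodes**2

# Any permutation of 8 nodes is routable in one step.
permutation = [(i, (i + 3) % 8) for i in range(8)]
print(crossbar_can_route(permutation))       # all sources/destinations unique
print(crossbar_can_route([(0, 5), (1, 5)]))  # two messages want output 5
print(crossbar_switch_count(32, 8))
```

Doubling the number of nodes quadruples the switch count, which is the overhead that motivates building the network from smaller switches.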
Switch with Omega Network • [Figure: omega network connecting inputs P0 to P7 (labeled 000 to 111) to outputs 000 to 111]
Omega Network Properties • The switch complexity is now O(N log N) • Contention increases: P0 → P5 and P1 → P7 cannot happen concurrently (this was possible in a crossbar) • To deal with contention, can increase the number of levels (redundant paths): by mirroring the network, we can route from P0 to P5 via N intermediate nodes, while increasing complexity by a factor of 2
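Contention in a multistage network can be illustrated with destination-tag routing. The sketch below assumes the textbook omega wiring (a perfect shuffle before each stage of 2x2 switches, with each switch choosing its output from one destination bit, most significant bit first). The lecture's figure may be wired differently, so which specific pairs collide can differ from the slide's example; under this assumed wiring, P0 → P5 collides with P4 → P7 rather than with P1 → P7. The function names are hypothetical.

```python
def omega_path(source, destination, num_nodes=8):
    """Return the output line used after each stage (num_nodes a power of 2)."""
    stages = num_nodes.bit_length() - 1          # log2(N) stages
    mask = num_nodes - 1
    position = source
    path = []
    for stage in range(stages):
        dest_bit = (destination >> (stages - 1 - stage)) & 1
        # Shuffle (rotate left), then the switch overwrites the low bit
        # with the destination bit: net effect is shift left, append bit.
        position = ((position << 1) & mask) | dest_bit
        path.append(position)
    return path

def conflicts(message_a, message_b, num_nodes=8):
    """Two messages conflict if they need the same line after some stage."""
    path_a = omega_path(*message_a, num_nodes)
    path_b = omega_path(*message_b, num_nodes)
    return any(a == b for a, b in zip(path_a, path_b))

print(omega_path(0, 5))           # lines used by P0 -> P5
print(omega_path(4, 7))           # lines used by P4 -> P7
print(conflicts((0, 5), (4, 7)))  # both need the same line after stage 1
```

A crossbar would deliver both messages at once, since their sources and destinations are distinct; the omega network saves switches at the price of such collisions.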
Tree Network • Complexity is O(N) • Can yield low latencies when communicating with neighbors • Can build a fat tree by having multiple incoming and outgoing links • [Figure: tree network with leaf nodes P0 to P7]
Bisection Bandwidth • Split N nodes into two groups of N/2 nodes such that the bandwidth between these two groups is minimum: that is the bisection bandwidth • Why is it relevant: if traffic is completely random, the probability of a message going across the two halves is ½; if all nodes send a message, the bisection bandwidth will have to be N/2 • The concept of bisection bandwidth confirms that the tree network is not suited for random traffic patterns, but for localized traffic patterns
Distributed Switches: Ring • Each node is connected to a 3x3 switch that routes messages between the node and its two neighbors • Effectively a repeated bus: multiple messages in transit • Disadvantage: bisection bandwidth of 2 and N/2 hops on average
Distributed Switch Options • Performance can be increased by throwing more hardware at the problem: fully-connected switches, where every switch is connected to every other switch, have $N^2$ wiring complexity and $N^2/4$ bisection bandwidth • Most commercial designs adopt a point between the two extremes (ring and fully-connected): • Grid: each node connects with its N, E, W, S neighbors • Torus: connections wrap around • Hypercube: links between nodes whose binary names differ in a single bit
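The bisection figures quoted for these topologies can be reproduced by brute force on a small network. The sketch below assumes every link has unit bandwidth and counts the links crossing the worst-case even split; the helper names are hypothetical, and the exhaustive search is only practical for small N.

```python
from itertools import combinations

def ring_links(n):
    """Each node links to its successor; the last wraps around to node 0."""
    return {frozenset((i, (i + 1) % n)) for i in range(n)}

def fully_connected_links(n):
    """Every node links to every other node."""
    return {frozenset(pair) for pair in combinations(range(n), 2)}

def hypercube_links(n):
    """Link nodes whose binary names differ in a single bit (n a power of 2)."""
    dimensions = n.bit_length() - 1
    return {frozenset((i, i ^ (1 << bit)))
            for i in range(n) for bit in range(dimensions)}

def bisection_width(n, links):
    """Minimum number of links crossing any split into two halves of n/2."""
    best = None
    for half in combinations(range(n), n // 2):
        half = set(half)
        crossing = sum(1 for link in links if len(link & half) == 1)
        if best is None or crossing < best:
            best = crossing
    return best

N = 8
ring = bisection_width(N, ring_links(N))
full = bisection_width(N, fully_connected_links(N))
cube = bisection_width(N, hypercube_links(N))
print(ring, full, cube)
```

For N = 8 the ring and the fully-connected network land at the two extremes (2 and N²/4 = 16), with the hypercube in between at N/2 = 4.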
Topology Examples • [Figure: example hypercube, grid, and torus topologies]