Advanced Computer Architecture CSE 8383 • April 4, 2006 • Session 21
Contents • Message Passing Systems (Chapters 5 & 7) • Communication Patterns • Client/Server Systems • Clusters
Message Passing Mechanisms • Message Format • Message: an arbitrary number of fixed-length packets • Packet: the basic unit, containing the destination address; a sequence number is needed so the packets can be reassembled in order • A packet can be further divided into flits (flow control digits) • Routing and sequence information occupy the header flit
Message, Packets, Flits [Figure: a message divided into packets; each packet divided into flits, with destination and sequence fields followed by data flits]
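The message format above can be sketched in a few lines of Python. This is a minimal illustration, not a real network stack: the names `Packet`, `packetize`, `to_flits`, and `reassemble` are hypothetical, the last packet is left short rather than padded to the fixed length, and the header flit assumes the destination and sequence number each fit in one byte.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    dest: int        # destination address carried by every packet
    seq: int         # sequence number used to reassemble the message
    payload: bytes   # slice of the message (the last slice may be shorter)


def packetize(message: bytes, dest: int, packet_size: int) -> List[Packet]:
    """Split a message into packets of at most packet_size payload bytes."""
    return [
        Packet(dest, seq, message[i:i + packet_size])
        for seq, i in enumerate(range(0, len(message), packet_size))
    ]


def to_flits(packet: Packet, flit_size: int) -> List[bytes]:
    """Header flit carries routing and sequence info; data flits follow."""
    header = bytes([packet.dest, packet.seq])  # assumption: both fit in a byte
    data = [packet.payload[i:i + flit_size]
            for i in range(0, len(packet.payload), flit_size)]
    return [header] + data


def reassemble(packets: List[Packet]) -> bytes:
    """Sequence numbers restore the order even if packets arrive shuffled."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.seq))
```

For example, `packetize(b"abcdefghij", dest=3, packet_size=4)` yields three packets, and `reassemble` recovers the original bytes regardless of arrival order.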
Store and Forward Routing • Packets are the basic units of information flow • Each node uses a packet buffer • A packet is transferred from S to D through a sequence of intermediate nodes • Channel and buffer must be available
Wormhole Routing • Flits are the basic units of information flow • Each node uses a flit buffer • Flits are transferred from S to D through a sequence of intermediate routers in order (Pipeline) • Can be visualized as a railroad train • Flits from different packets cannot be mixed up
Latency Analysis • L: packet length (in bits) • W: channel bandwidth (bits/sec) • D: distance (number of hops) • F: flit length (in bits) • Store and forward: $T_{SF} = D \cdot L/W$ • Wormhole: $T_{WH} = L/W + D \cdot F/W \approx L/W$ if $L \gg F$ (independent of D)
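To make the two formulas concrete, the following sketch evaluates them for one set of values. The function names and the numbers (a 1024-bit packet, 8-bit flits, a 1024 bit/sec channel, 10 hops) are illustrative assumptions, chosen only so the arithmetic is easy to follow.

```python
def store_and_forward_latency(L: float, W: float, D: int) -> float:
    """T_SF = D * L / W: the whole packet is retransmitted at every hop."""
    return D * L / W


def wormhole_latency(L: float, W: float, D: int, F: float) -> float:
    """T_WH = L / W + D * F / W: only the header flit pays the per-hop cost."""
    return L / W + D * F / W


# Hypothetical values, not taken from the slides.
L, F, W, D = 1024, 8, 1024.0, 10
t_sf = store_and_forward_latency(L, W, D)   # 10 * 1024 / 1024
t_wh = wormhole_latency(L, W, D, F)         # 1024/1024 + 10 * 8/1024
print(f"store-and-forward: {t_sf} s, wormhole: {t_wh} s")
```

With these values the store-and-forward latency grows linearly with D, while the wormhole latency stays close to L/W because the D * F/W term is small when L is much larger than F.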
Communication Patterns • Point to Point 1 - 1 • Multicast 1 - n • Broadcast 1 - all • Conference n - n
Potential Routing Problems Deadlock: • When each of two messages holds the resources the other needs in order to move, both messages are blocked (a cyclic dependency on resources) • A straightforward but inefficient solution is rerouting • Another solution is to avoid deadlock by imposing a strictly monotonic order on network resources • A channel dependency graph (CDG) is a technique for developing a deadlock-free routing algorithm
A 4-node network and its CDGs [Figure: (a) a 4-node network with nodes 0 to 3 and channels c1 to c8; (b) its channel dependency graph (CDG); (c) CDG for a deadlock-free version of the network]
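A routing algorithm is deadlock-free when its channel dependency graph has no cycle, so checking a CDG reduces to cycle detection. The sketch below uses a depth-first search; the function name `has_cycle` and both example graphs are hypothetical and are not the graphs from the figure.

```python
from typing import Dict, List


def has_cycle(cdg: Dict[str, List[str]]) -> bool:
    """Return True if the channel dependency graph contains a cycle."""
    WHITE, GRAY, BLACK = 0, 1, 2          # unvisited, on DFS stack, finished
    color = {c: WHITE for c in cdg}

    def visit(c: str) -> bool:
        color[c] = GRAY
        for nxt in cdg.get(c, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:             # back edge: cyclic dependency
                return True
            if state == WHITE and visit(nxt):
                return True
        color[c] = BLACK
        return False

    return any(color[c] == WHITE and visit(c) for c in list(cdg))


# Hypothetical unidirectional ring: each channel waits on the next one.
ring = {"c1": ["c2"], "c2": ["c3"], "c3": ["c4"], "c4": ["c1"]}

# Hypothetical ordered version: dependencies only go to higher-numbered
# channels, so a strictly monotonic order exists and no cycle can form.
ordered = {"c1": ["c2"], "c2": ["c3"], "c3": ["c4"], "c4": []}

print(has_cycle(ring), has_cycle(ordered))
```

The second graph illustrates the avoidance idea: if every dependency points from a lower-numbered channel to a higher-numbered one, the graph cannot contain a cycle.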
Livelock: • A message goes around the network and never reaches its destination • It results from using adaptive routing algorithms with dynamic injection, where nodes inject their messages into the network at arbitrary times • Policies to avoid livelock are based on assigning a priority to each message injected into the network: • Messages are routed according to their priorities • Once a message is injected, only a finite number of messages will be injected with higher or equal priority
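One way to satisfy the priority condition above is to use the injection time as the priority, so that older messages always win. This is an assumed policy chosen for illustration, not one named in the slides, and the class `AgePriorityArbiter` is hypothetical.

```python
import heapq
import itertools


class AgePriorityArbiter:
    """Routes the oldest injected message first (age-based priority)."""

    def __init__(self) -> None:
        self._clock = itertools.count()   # strictly increasing injection stamps
        self._heap = []

    def inject(self, msg: str) -> int:
        stamp = next(self._clock)
        # A smaller stamp means an older message and a higher priority.
        heapq.heappush(self._heap, (stamp, msg))
        return stamp

    def next_to_route(self) -> str:
        return heapq.heappop(self._heap)[1]
```

Because stamps only increase, a message with stamp k can be outranked by at most k earlier messages, which is the finite bound the policy requires.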
Starvation: • A node suffers from starvation if it has a message to inject into the network but is never allowed to do so. • The simplest policy to avoid starvation is to allow each node to have an injection queue that competes with the queues of the incoming links to the same node. • The main disadvantage is that a node with a high message injection rate can slow down all the other nodes in the network.
Routing Efficiency • Two Parameters • Channel Traffic (number of channels used to deliver the message involved) • Communication Latency (distance)
Multicast on a mesh (5 unicasts) Traffic ? Latency ?
Multicast on a mesh (multicast pattern 1) Traffic ? Latency ?
Multicast on a mesh (multicast pattern 2) Traffic ? Latency ?
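The two efficiency parameters can be computed for a small case. The sketch below assumes XY (dimension-order) routing on a 2D mesh and takes latency to be the longest source-to-destination distance in hops; the source and destination coordinates are hypothetical and are not the ones in the figures.

```python
def xy_path(src, dst):
    """Channels, as (from_node, to_node) pairs, used by XY routing."""
    (x, y), (dx, dy) = src, dst
    channels = []
    while x != dx:                         # travel along X first
        nx = x + (1 if dx > x else -1)
        channels.append(((x, y), (nx, y)))
        x = nx
    while y != dy:                         # then along Y
        ny = y + (1 if dy > y else -1)
        channels.append(((x, y), (x, ny)))
        y = ny
    return channels


def separate_unicasts_cost(src, dests):
    """Each destination gets its own copy: shared channels are counted again."""
    paths = [xy_path(src, d) for d in dests]
    return sum(len(p) for p in paths), max(len(p) for p in paths)


def shared_path_cost(src, dests):
    """One copy per channel: channels shared by several paths count once."""
    paths = [xy_path(src, d) for d in dests]
    traffic = len({ch for p in paths for ch in p})
    return traffic, max(len(p) for p in paths)


src, dests = (0, 0), [(2, 0), (2, 1), (2, 2)]
print(separate_unicasts_cost(src, dests))   # (traffic, latency)
print(shared_path_cost(src, dests))
```

In this example the separate unicasts use 2 + 3 + 4 = 9 channel traversals, while a multicast that shares the common path uses 4 channels, both with a latency of 4 hops.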
Broadcast (tree structure) [Figure: broadcast tree with nodes labeled 1 to 4]
Message Passing in PVM (Revisit) [Figure: a sending task and a receiving task, each with a user application, a library, and a daemon; the steps of a message transfer are numbered 1 to 8]
Client/Server Systems [Figure: clients and servers, with server threads, connected by an interconnection network]
A Client Server Framework for Parallel Applications [Figure: a client, a master (supervisor), and slaves (workers) Server 1 through Server n, connected by an interconnection network]
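The master/worker structure of this framework can be imitated on a single machine. In the sketch below, threads from Python's `concurrent.futures` stand in for the worker servers, and the task (a sum of squares) and the function names `master` and `worker` are assumptions made purely for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def worker(chunk: List[int]) -> int:
    """Slave (worker): computes a partial result for its share of the data."""
    return sum(x * x for x in chunk)


def master(data: List[int], n_workers: int) -> int:
    """Master (supervisor): splits the job, farms it out, combines results."""
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(worker, chunks))
    return sum(partials)


# The client submits one request; the master hides the workers behind it.
print(master(list(range(10)), n_workers=3))
```

In a real cluster the chunks would travel over the interconnection network to separate servers; the division of roles stays the same.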
Clusters [Figure: cluster architecture with a programming environment and tools layer, middleware, and an interconnection network joining nodes, each with its own OS and P, C, M, and I/O components]
Interconnection Networks in Clusters
Interconnection Network | Data Rate | Switching | Routing
Ethernet | 10 Mbit/sec | Packet | Table-based
Fast Ethernet | 100 Mbit/sec | Packet | Table-based
Gigabit Ethernet | 1 Gbit/sec | Packet | Table-based
Myrinet | 1.28 Gbit/sec | Wormhole | Source-path
Quadrics | 7.2 Gbyte/sec | Wormhole | Source-path
Source-Path versus Table Based [Figure: two 8-port switches (Port 0 to Port 7); in source-path routing the packet carries the port sequence 5, 0, 6; in table-based routing the switch looks up Dest-id 6 in a routing table]
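The difference between the two schemes is where the routing decision lives: in the packet header or in the switch. The sketch below shows one switch hop under each scheme; the function names and the table contents are hypothetical.

```python
from typing import Dict, List, Tuple


def source_path_route(header_ports: List[int]) -> Tuple[int, List[int]]:
    """Source-path: the sender precomputed the path, so the switch takes the
    first port number from the header and forwards the rest unchanged."""
    return header_ports[0], header_ports[1:]


def table_based_route(routing_table: Dict[int, int], dest_id: int) -> int:
    """Table-based: the switch looks up the destination id in its own table."""
    return routing_table[dest_id]


# Source-path: the header lists one output port per switch on the way.
out_port, remaining = source_path_route([5, 0, 6])

# Table-based: hypothetical table mapping destination ids to output ports.
table = {6: 5, 2: 1}
print(out_port, remaining, table_based_route(table, 6))
```

Source-path routing keeps the switches simple at the cost of a longer header, while table-based routing keeps packets short but requires every switch to hold and maintain a table.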