460 likes | 801 Views
Deadlock and Livelock. A. Jantsch / I. Sander / Z. Lu zhonghai@kth.se. Deadlock. Deadlock can occur in an interconnection network, when a group of agents (usually packets) cannot make progress, because they are waiting on each other to release resource (buffers, channels)
E N D
Deadlock and Livelock A. Jantsch / I. Sander / Z. Lu zhonghai@kth.se
Deadlock • Deadlock can occur in an interconnection network, when a group of agents (usually packets) cannot make progress, because they are waiting on each other to release resource (buffers, channels) • If a sequence of waiting agents form a cycle the network is deadlocked SoC Architecture
Deadlock Example • Connection A holds channels u and v and wants to acquire channel w • Connection B holds channels w and x and wants to acquire channel u • Since neither connection A nor connection B will release their channels there is a deadlock in the network Circuit Switched Network SoC Architecture
Deadlock • Deadlock paralyzes the network, which can have catastrophic consequences • Two possible solutions • Avoid deadlocks • Recover from deadlocks • Almost all networks today use deadlock avoidance SoC Architecture
Agents and Resources • Depending on the type of connections different agents and resources are involved SoC Architecture
Wait-For and Hold-Relations • Agents and resources are related by wait-for and hold relations • Agent A • Holds resources u, v • Waits for resource w • Agent B • Holds resources w, x • Waits for resource w SoC Architecture
Wait-For and Hold-Relations • If an agent holds a resource than the resource can be viewed as “waiting” for the agent to release it • Thus each hold relation induces a wait-for relation in the opposite direction holds A u waits for A u SoC Architecture
Wait-For and Hold-Relations • Replacing the holds with wait-for in the opposite direction, the lower figure is generated • The arrows in the figure reveal a cycle that shows that the configuration is deadlocked SoC Architecture
Cycle in the wait-for graph means deadlock • Deadlock occurs, if • Agents hold and do not release a resource, while waiting for another • A cycle exists between waiting agents, such that there exists a set of agents A0, A1, …, An-1, where the agent Ai holds resource Ri while waiting for resource R(i+1 mod n) for i = 0, 1, …, n-1 SoC Architecture
Resource Dependences • Whenever it is possible for an agent holding Ri to wait on Ri+1, we say that a resource dependence exists from Ri to Ri+1, and denote it as Ri ≻Ri+1 SoC Architecture
Resource Dependences • A cycle in the resource dependence graph indicates that it is possible for a deadlock to occur • A cycle in this graph is a necessary, but not sufficient condition for deadlock u ≻ ≻ x v ≻ ≻ w Resource Dependence Graph SoC Architecture
Deadlock with packet-buffer flow control • Again the four node network is taken as example, but this time packet-buffer flow controlwith a single packet buffer per node is used • Agents are packets • Shared resources are buffers • A packet holds a buffer i while acquiring the next buffer (i+1) mod 4 • The resource dependence graph says that deadlock might occur B0 B1 B2 B3 Resource Dependence Graph SoC Architecture
Deadlock with packet-buffer flow control Wait-for Graph • The upper configuration is deadlocked • The lower one is not! Packet 3 can acquire buffer 0 P0 P1 P2 P3 B0 B1 B2 B3 Deadlock Configuration! P0 P1 P2 P3 B0 B1 B2 B3 Configuration not deadlocked SoC Architecture
Deadlock with flit-buffer flow control • Again the four node network is taken as example, but this time flit-buffer flow controlwith two virtual channels per physical channel is used • Agents are packets • Shared resources are virtual channels • The resource dependence graph says that deadlock might occur Resource Dependence Graph SoC Architecture
Deadlock with flit-buffer flow control • The example shows a deadlocked configuration • Packet P0 holds virtual channel u0 and v0 and tries to acquire w0 • Packet P1 holds virtual channel w0 and x0 and tries to acquire u0 • Though there are free virtual channel resources there is a cycle in the network SoC Architecture
Deadlock Avoidance • Deadlock can be avoided by eliminating cycles in the resource dependence graph • This can be done by imposing a partial order on the resources and then insisting, that agents take these resource in ascending order • Then there is no possibility for a cycle, since in any cycle at least one agent that holds a higher-number resource must wait on a lower-numbered resource, but this is not allowed by the ordering relation SoC Architecture
Deadlock Avoidance Techniques • Resource allocation • Distance classes • Dateline classes • Restrict physcial routes • Dimension-order routing • The turn model for k-ary-n-mesh networks • Topology-dependent SoC Architecture
Distance Classes • Resources are grouped into numbered classes and • Restrict allocation of resources so that they can be only acquired in ascending order SoC Architecture
Distance Classes Example • A packet at distance i from its source node needs to allocate a resource from class i • At the source, we inject packets into resource class 0. • Uphill-only resource allocation rule: At each hop, the packet acquires a resource of from the next higher class. Each node contains 5 buffer classes from bottom to up. As packets A and B progress, their buffer classes increase. SoC Architecture
Distance Classes Example • Using distance classes the resource dependence graph can look like this • There are no cycles => deadlock cannot occur! A four-node ring network using buffer classes based on distance. Each node i has 4 classes, with buffer Bji handling traffic at node i that has taken j hops toward its destination. SoC Architecture
Distance Classes • Distance classes provide a very general way to order resource in any topology • Distance Classes are very costly, since they require a number of buffers (or virtual channels) proportional to the diameter of the network • However, for some topologies the cost can be reduced significantly because of specific topology properties SoC Architecture
Dateline Classes • For a ring the number of needed classes can be much reduced • Each node has only two buffers • Class “0” buffer: B0i • Class “1” buffer: B1i • Packets enter the ring in node B0i • When they cross the dateline, they are placed into buffer B1i until they reach their destination • Result is an acyclic graph => Deadlock cannot occur! SoC Architecture
Restricted Physical Routes • Dividing the network into different classes allows to create a deadlock free network, but can be very costly to the large number of resources needed • An alternative is to restrict the routing function with the objective to generate a dependence graph that is acyclic SoC Architecture
Dimension Order Routing • Dimension Order Routing guarantees deadlock-freedom in k-ary n-meshes • Within the first dimension (here x) a packet traveling in +x/-x direction can only wait on a channel on the +x/-x, +y, and -y direction • In the second dimension a packet traveling on the +y/-y direction can only wait on a channel on the +y/-y direction • These relations can be used to number the channels, so that every packet follows increasingly numbered channels Enumeration of a 3x3 mesh in dimension-order routing Right-going channels are numbered first, then left, up and down SoC Architecture
The Turn Model • A more general model for Mesh-networks is the “Turn Model” • The eight possible turns in a 2D-Mesh can be combined to create 2 abstract cycles • In order to avoid deadlock at least one turn must be removed for each cycle . Counterclockwise (left turns) Clockwise (right turns) SoC Architecture
Deadlock Situation • Four packets travelling in different directions try to trun left and wind up in a cirular wait. • If any one of the packets had not turned, deadlock would have been avoided. • The algorithm should not prohibit more turns than necessary. Otherwise, its adaptiveness would be reduced. Source: “The turn model for Adaptive Routing” by J. Glass and M. NI SoC Architecture
Dimension Order Routing • Only the following turns are allowed (x-y routing) in dimension order routing (x-direction is routed first) Counterclockwise (left turns) Clockwise (right turns) SoC Architecture
The Turn Model • Assume we remove the N-W (+y,-x) turn in the counter-clockwise graph • In the clockwise direction either the S-W (-y,-x), N-E(+y, +x) or E-S (+x, -y) turn can be eliminated in order to yield a deadlock-free network, resulting in 3 deadlock-free algorithms. Counterclockwise North (+y) West (-x) East (+x) South (-y) Clockwise West-First North-Last Negative-Last S-W (-y,-x) N-E(+y, +x) E-S (+x, -y) SoC Architecture
The Three Deadlock-free Algorithms • West-first: Because all turns to the west are prohibited. A packet must start out in that direction in order to travel west. • North-last: Because all turns when travelling north are prohibited, a packet should only travel north when that is the last direction it needs to travel. • Negative-first:Because all turns from a positive direction to a negative direction are prohibited, a packet must start out in a negative direction in order to travel in a negative direction. Counterclockwise North (+y) West (-x) East (+x) South (-y) Clockwise West-First North-Last Negative-Last S-W (-y,-x) N-E(+y, +x) E-S (+x, -y) SoC Architecture
The Turn Model West First 1 • In the West(-x)-First model a packet has first to make all its west hops (1) • Packets shall be routed up (clockwise) or down (counterclockwise) (2) • Packet shall then be routed to the east (+x) (3) • Packets shall be routed down (clockwise) or up (counterclockwise) if needed (4) • This scheme shall be reflected by the numbering! 3 4 2 2 4 3 1 Channel ordering induced by the west-first turn model. SoC Architecture
The Turn ModelNot all turns are equal • Assume we remove the N-W (+y,-x) turn in the counter-clockwise graph • Eleminating turn W-N (-x, +y) is not deadlock free. Counterclockwise North (+y) West (-x) East (+x) South (-y) Clockwise West-First North-Last Negative-Last Disallowed S-W (-y,-x) N-E(+y, +x) E-S (+x, -y) W-N (-x, +y) SoC Architecture
Deadlock withAdaptive Routing • Adaptive routing networks may be deadlock free even in the presence of dependence cycles • There must be a non-zero probability for a packet to escape a dependence cycle • Deadlock free networks can be achieved more efficiently because fewer restrictions (on resource usage, on routes) are required SoC Architecture
Deadlock Recovery • Deadlock recovery needs less resources than deadlock avoidance, if it occurs rarely • A deadlock must be first detected • A cycle in the “wait-for” graph indicates a deadlock • Detection is often done by means of timeout counters • … and then the deadlock must be resolved • Either packets or connections are removed from the network • Packets that are deadlocked can enter an “escape buffer” that is used to resolve the deadlock • Worst case timing is often unbounded SoC Architecture
Livelock • In livelock packets continue to move through the network, but do not make progress to their destination • Lifelock occurs if packets are allowed to take non-minimal routes through a network • It can be avoided by limiting the number of times a packet can be misrouted • It occurs in dropping flow control, if a packet always gets dropped even after re-enter. SoC Architecture
Livelock • There are two main techniques to avoid livelock: • Deterministic Avoidance • A state is added to a packet to ensure its progress • Misroute count or age of packet • Packet with higher age or misroute count wins arbitration • Probabilistic Avoidance • If it can be guaranteed that the probability of packet delivery approaches one for infinite time there is a guarantee to avoid livelock • Network can be considered livelock free, if there is a non-zero probability of a packet moving towards its destination (and the sum of these probabilities must approach one for infinite time) SoC Architecture
Summary • Deadlock means no agent can move forward due to cyclic dependence on resources • Cycles in the resource dependency graph means deadlock is possible • Cycles in the wait-for graph means deadlock • Deadlocks can be avoided • by ordering resources • by restricting routes • Deadlock detection with timouts • Livelock in adaptive, non-minimal routing SoC Architecture