Distributed Partial Information Management (DPIM) for Survivable Networks Dahai Xu
Content • Basic Concepts of Protection & Restoration • Previous Work on Shared Path Protection • Proposed DPIM Schemes • what partial info to maintain and how? • how a connection is routed under distributed control and with partial info? • how distributed signaling is done and bandwidth (BW) allocated/deallocated? • A heuristic based on Potential Backup Cost
Protection • Path Protection • Link Protection • Advantages & Disadvantages
Path Protection • Use more than one path to guarantee that data is delivered successfully • Dedicated Path Protection • Shared Path Protection
Dedicated Path Protection • 1+1 Protection • Point-to-Point Protection & Mesh Network Protection
Shared Path Protection • 1:N Protection • 1:1 Protection
Link Protection • Use an alternate path if the link fails • Dedicated Link Protection: not practical • Shared Link Protection: practical • May fail when a node fails
Advantages & Disadvantages of Protection • Simple • Quick: does not require much extra processing time • Usually can recover only from a single link fault • Inefficient use of resources
Restoration • Path Restoration • Route can be computed after the failure • Link Restoration • Path is discovered at the end nodes of the failed link • More practical than path restoration • Advantages & Disadvantages of Restoration • Usually can recover from multiple element faults • More efficient use of resources • Complex • Slow: requires extra processing time to set up the path and reserve resources
Comparison between Protection & Restoration • Characteristic: Protection -- resources are reserved before the failure and may go unused; Restoration -- resources are reserved and used after the failure • Route: Protection -- predetermined; Restoration -- can be dynamically computed • Resource Efficiency: Protection -- Low; Restoration -- High
Comparison between Protection & Restoration (Cont'd) • Time used: Protection -- Short; Restoration -- Long • Reliability: Protection -- mainly for a single fault; Restoration -- can survive multiple faults • Implementation: Protection -- Simple; Restoration -- Complex
Offline Routing • Arrange a set of traffic flows • Integer Linear Programming (ILP) to get optimal results • Heuristic Algorithms • Relaxation of ILP • Simulated Annealing: a stochastic hill-climbing heuristic search method that explores a larger area of the search space without being trapped in local optima • Genetic Algorithm: evolves the current population of “good solutions” toward optimality by using carefully designed crossover and mutation operators • Tabu search
Online Routing of Bandwidth-Guaranteed Paths • Online routing of a bandwidth-guaranteed path with a simultaneous protection path • Metrics • Unlimited Link Capacity • Bandwidth Consumption • Limited Link Capacity • Connection drop/block probability • Profit / Revenue
Assumption • Two connections whose active paths are completely link-disjoint can share backup bandwidth (BBW). • The objective of the algorithm is to exploit this BBW sharing, e.g., to reduce the total amount of bandwidth (TBW) consumed by the connections.
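The sharing assumption can be illustrated with a small sketch. It assumes a single-link-failure model, so the BBW needed on a link is the largest amount any one failed active link would divert onto it. The function name `backup_bandwidth`, the tuple layout, and the link ids are made up for this example and do not come from the slides.

```python
from collections import defaultdict

def backup_bandwidth(connections, shared=True):
    """Return the backup bandwidth (BBW) to reserve on each link.

    connections: list of (active_links, backup_links, w) tuples, where the
    two link collections hold hashable link ids and w is the bandwidth.
    """
    dedicated = defaultdict(float)   # BBW if nothing is shared
    # per_failure[b][a] = bandwidth rerouted onto link b when link a fails
    per_failure = defaultdict(lambda: defaultdict(float))
    for active_links, backup_links, w in connections:
        for b in backup_links:
            dedicated[b] += w
            for a in active_links:
                per_failure[b][a] += w
    if not shared:
        return dict(dedicated)
    # Only one link fails at a time, so reserve for the worst single failure.
    return {b: max(load.values()) for b, load in per_failure.items()}

# Two connections with link-disjoint active paths and a common backup path.
conns = [
    (["s-a", "a-d"], ["s-c", "c-d"], 2),
    (["s-b", "b-d"], ["s-c", "c-d"], 3),
]
with_sharing = backup_bandwidth(conns, shared=True)
without_sharing = backup_bandwidth(conns, shared=False)
```

Here the two connections need 3 units on each backup link with sharing and 5 units without, because no single link failure can hit both active paths.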
Information for Routing • The amount of BBW sharing depends on the information available to the routing algorithm. • Three important cases to be considered. • No Information on how existing connections are routed • Complete Per-flow/Aggregate Information • Partial Aggregate Information
No Sharing (NS) • Only the residual (available) bandwidth on each link is known • Residual bandwidth = Link capacity - Reserved active bandwidth (ABW) - Reserved backup bandwidth (BBW) • Can be obtained from OSPF or IS-IS extensions • Only the total used bandwidth is known (active + backup) • Cannot share BBW, and thus wastes resources.
Sharing with Complete Information (SCI) • Knows the routes of the active and backup paths of all current connections. • May be too much information to maintain: O(LQ), where L is the average path length and Q is the number of existing connections. • Permits the best sharing and provides a performance upper bound
Partial Information for Routing • Know some aggregated information of each link • Two schemes • SPI (Sharing with Partial Information): Centralized control, knows BBW and ABW on each/every link • DPIM (Distributed Partial Information Management): Distributed control, each ingress edge (source) node decides the routes.
No Sharing (NS) • Remove links with residual bandwidth Re < w • Determine two link-disjoint paths for active/backup • Formulation: • standard network flow problem • each link has unit cost and unit capacity • s supplies two units, d demands two units • a minimum cost flow algorithm can be used
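The NS formulation above can be sketched as a two-unit minimum cost flow solved by successive shortest paths. This is one possible solver, not necessarily the one used by the author. It assumes directed links given as a dict of residual bandwidths; `two_disjoint_paths` and the sample topology are illustrative names only.

```python
def two_disjoint_paths(links, s, d, w):
    """links: {(u, v): residual_bw} for directed links.
    Returns two link-disjoint node paths from s to d, or None."""
    # Remove links whose residual bandwidth cannot carry w.
    usable = [e for e, r in links.items() if r >= w]
    # Residual arcs [head, capacity, cost]; arc i ^ 1 is the reverse of arc i.
    arcs, adj = [], {}
    for u, v in usable:
        adj.setdefault(u, []).append(len(arcs))
        arcs.append([v, 1, 1])      # unit capacity, unit cost
        adj.setdefault(v, []).append(len(arcs))
        arcs.append([u, 0, -1])
    for _ in range(2):              # s supplies two units of flow
        dist, prev = {s: 0}, {}
        for _ in range(len(adj)):   # Bellman-Ford copes with negative arcs
            changed = False
            for u in list(dist):
                for i in adj.get(u, []):
                    v, cap, cost = arcs[i]
                    if cap > 0 and dist[u] + cost < dist.get(v, float("inf")):
                        dist[v], prev[v] = dist[u] + cost, i
                        changed = True
            if not changed:
                break
        if d not in dist:
            return None             # fewer than two disjoint paths exist
        v = d
        while v != s:               # push one unit along the path found
            i = prev[v]
            arcs[i][1] -= 1
            arcs[i ^ 1][1] += 1
            v = arcs[i ^ 1][0]
    # A saturated forward arc carries flow; walk the flow from s to d twice.
    flow = {}
    for k, (u, v) in enumerate(usable):
        if arcs[2 * k][1] == 0:
            flow.setdefault(u, []).append(v)
    paths = []
    for _ in range(2):
        path = [s]
        while path[-1] != d:
            path.append(flow[path[-1]].pop())
        paths.append(path)
    return paths

links = {("s", "a"): 10, ("a", "d"): 10, ("s", "b"): 10,
         ("b", "d"): 10, ("s", "d"): 1}
pair = two_disjoint_paths(links, "s", "d", 5)
```

With w = 5 the direct link s-d is removed, and the two returned paths are s-a-d and s-b-d (in either order). Either one can serve as the active path and the other as the backup.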
Linear Programming for SCI (I) • For new request (s, d, w), the least cost of using a on AP and b on BP • The cost of using e on BP (1)
Linear Programming for SCI (II) • Objective • Constraints
SPI • In SCI, can be calculated from per-flow information. Per-flow information must be maintained, which is not scalable. • In SPI, is not known, only is known • Same objective and constraints as in SCI • Further improvement to be discussed in DPIM
Survivable Routing (SR) • Distributed control with complete but aggregated information. • Every edge node essentially maintains a matrix of for all links a and b • Uses the active path first (APF) heuristic instead of ILP formulation • Remove links whose Re<w (temporarily) • Find a shortest path as AP • Put back temporarily removed links, remove AP links, calculate backup cost using Eq. (1) • Find a shortest (cheapest) path as BP
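The APF steps listed above can be sketched as follows. The backup cost of Eq. (1) is not reproduced in this transcript, so the sketch takes it as a caller-supplied function `backup_cost(ap_links, link)`, a hypothetical stand-in that returns a non-negative cost or infinity. Links are assumed to be directed, and all names and sample data are illustrative.

```python
import heapq

INF = float("inf")

def shortest_path(graph, s, d):
    """Dijkstra over graph = {u: {v: cost}} with non-negative costs."""
    dist, prev, heap = {s: 0}, {}, [(0, s)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist[u]:
            continue                      # stale heap entry
        for v, c in graph.get(u, {}).items():
            if du + c < dist.get(v, INF):
                dist[v], prev[v] = du + c, u
                heapq.heappush(heap, (du + c, v))
    if d not in dist:
        return None
    path = [d]
    while path[-1] != s:
        path.append(prev[path[-1]])
    return path[::-1]

def active_path_first(residual, s, d, w, backup_cost):
    """residual: {(u, v): Re}. Returns (AP, BP) as node lists, or None."""
    # Temporarily remove links with Re < w and take a min-hop path as AP.
    g = {}
    for (u, v), re in residual.items():
        if re >= w:
            g.setdefault(u, {})[v] = 1
    ap = shortest_path(g, s, d)
    if ap is None:
        return None
    ap_links = set(zip(ap, ap[1:]))
    # Put the removed links back, drop the AP links, price the rest.
    g = {}
    for link in residual:
        if link in ap_links:
            continue
        cost = backup_cost(ap_links, link)
        if cost != INF:
            g.setdefault(link[0], {})[link[1]] = cost
    bp = shortest_path(g, s, d)
    return (ap, bp) if bp is not None else None

# Toy data: links s-c and c-d have little residual bandwidth, but we pretend
# they already hold enough shareable BBW, so backing up on them costs 0.
residual = {("s", "a"): 10, ("a", "d"): 10, ("s", "b"): 10, ("b", "e"): 10,
            ("e", "d"): 10, ("s", "c"): 1, ("c", "d"): 1}
shareable = {("s", "c"), ("c", "d")}

def toy_backup_cost(ap_links, link):
    return 0 if link in shareable else 4

result = active_path_first(residual, "s", "d", 4, toy_backup_cost)
```

In this toy case the AP is s-a-d and the BP is s-c-d: the low-residual links are excluded for the AP but become the cheapest choice for the BP once they are put back.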
Successive SR (SSR) • After is updated as a result of setting up a new connection, some existing BPs may change (route and the amount of additional BBW reserved) • Such changes may in turn trigger changes to other existing BPs until an equilibrium state is reached • Achieve a better BBW sharing, but with a high signaling and control overhead
RAFT • RAFT: Resource Aggregation for Fault Tolerance • Each node maintains a fault management table (FMT), which lists the AP or BP flows on each link e. The FMT must be updated each time a request initiates or terminates • The AP and BP routes are node-disjoint and are found first with a shortest-path algorithm • A request is accepted only if the required bandwidth is available on all the links of its AP and BP; otherwise it is rejected.
Doshi’s • Each node maintains a link capacity control table (LCCT) for each local link • Source nodes use a Content-lock mechanism to avoid deadlock among multiple demands. • BP route search: distributed breadth-first search (BFS) over a residual network • In BFS, it first queries the residual spare capacity in the LCCT and uses a link only if it has sufficient capacity • If a route is found, the source node stores it as the restoration route for the demand. • If no BP route is found, the capacity optimization procedure is activated, which changes previous BP routes
Su’s • Each node maintains “bucket”-based link state (equivalent to ) • The amount of link state is proportional to the number of failures/links, not the number of lightpaths • AP and BP are optimized separately. APs are assumed to use minimum-hop paths; BPs are optimized to reduce the wavelength redundancy • The “width” of link l with respect to a failure event k* is defined as the normalized difference between the maximum bucket height and the bucket corresponding to link failure k*, which indicates the sharing capacity of the link.
Su’s (Cont'd) • The Bellman-Ford algorithm is used to identify the widest path between the end nodes of the protected link, i.e., the path that offers the most sharing. • If there is more than one such candidate path, the one that traverses the least number of links with width 0 is selected
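A rough sketch of the width metric and the widest-path search described above. The slides do not state the normalization, so dividing by the maximum bucket height is an assumption, as are the names `link_width` and `widest_path` and the dict-based data layout. The caller is expected to leave the protected link itself out of `widths`.

```python
def link_width(buckets, k):
    """Width of a link w.r.t. failure k.

    buckets: {failure_id: bucket height} for this link.
    Assumed normalization: (max height - height for k) / max height.
    """
    top = max(buckets.values(), default=0)
    if top == 0:
        return 0.0                      # nothing reserved, nothing to share
    return (top - buckets.get(k, 0)) / top

def widest_path(widths, src, dst):
    """Bellman-Ford style search for the path with the largest bottleneck.

    widths: {(u, v): width} for directed links.
    Ties are broken by the fewest links of width 0.
    Returns (path, bottleneck) or None.
    """
    nodes = {n for link in widths for n in link}
    # Label = (bottleneck width, minus the number of zero-width links used).
    best = {src: (float("inf"), 0)}
    prev = {}
    for _ in range(len(nodes) - 1):
        changed = False
        for (u, v), wd in widths.items():
            if u not in best:
                continue
            b, z = best[u]
            cand = (min(b, wd), z - (wd == 0))
            if cand > best.get(v, (-1.0, 0)):
                best[v], prev[v] = cand, u
                changed = True
        if not changed:
            break
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], best[dst][0]

widths = {("u", "a"): 0.5, ("a", "v"): 0.8, ("u", "b"): 0.9, ("b", "v"): 0.2}
path, bottleneck = widest_path(widths, "u", "v")
```

For these sample widths the search prefers u-a-v (bottleneck 0.5) over u-b-v (bottleneck 0.2).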
DPIM-SAM • Distributed Partial Information Management • Edge node maintains (and exchanges) non-local information: for each link e. (O(E) information) • Each node also maintains profiles of ABW and BBW for each local link e. (O(E) information)
Path Determination • This estimated BBW may not be minimal • Using ILP, or APF to find AP and BP • DPIM-M-A: APF with Minimal BBW Allocation
Distributed Signaling • Minimal BBW Allocation • Maintaining Partial Information on AP and BP • Send AP Set-up packet containing BP to the nodes along AP, each node having an outgoing link e in AP updates • Similar way to update
Connection Release • Can’t be done efficiently in SPI • AP Tear-Down and BBW Deallocation. Update PBe and release bw.
Performance Evaluation • Traffic Types • Incremental traffic (Established connection lasts forever) • Dynamic traffic (with connection durations) • Performance Metrics • Unlimited Link Capacity • Bandwidth Saving (Ratio): upper bound 50% • Limited Link Capacity • Connection drop/block probability • Total Earning (Ratio) : Earning Rate matrix (independent of traffic load)
Simulation Results • Average Bandwidth Saving Ratio • Total Earning Ratio
Active Path First with Potential Backup Cost (APF-PBC) • Challenges • Integer Linear Programming (ILP) based approaches are notoriously time consuming • Guarantee minimal allocation of TBW for each request, but do not guarantee an optimal result for all requests. • Active path first (APF) can only achieve sub-optimal results: • Does not consider the potential cost along the BP when selecting the AP
Main idea of APF-PBC • Also uses Active Path First • In selecting Active Path, Each capable link a will be assigned a cost • We use as the potential backup cost (and try to minimize TBW). • Intuition: PBC increases with w and • Can apply to SCI and DPIM-SAM (which determine backup cost and BP differently)
Potential Backup Cost - Derivation • is derived based on the statistical analysis of experimental data (SCI-ILP for the 15-node network, infinite link capacity) • challenge: we do not know which link b will be used to back up link a, let alone Bb and • solution: guess the (weighted average) value of Bb (call it x) and (call it s)
Derivation based on statistical analysis of Bb • Distribution of Bb/M • (w,s,M) is the expected value of a(w) when s is fixed. • Guess the distribution of and calculate the weighted average value of (w,s,M) over all s to obtain a(w)
Graph of (w,s,M) & approximation • Integral (curves) from adaptive Lobatto quadrature • Approximation (line-fitting Y=c1X+c2)
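The line-fitting step Y = c1X + c2 can be sketched with an ordinary least-squares fit. The slides do not say which fitting method was used, so least squares is an assumption here, and the sample points are synthetic values chosen only to show the call, not data from the study.

```python
def fit_line(xs, ys):
    """Least-squares fit of Y = c1 * X + c2; returns (c1, c2)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx              # slope
    c2 = mean_y - c1 * mean_x   # intercept
    return c1, c2

# Synthetic points lying exactly on Y = 2X + 1.
c1, c2 = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

In practice the ys would be the values of the integral curve sampled at the chosen X positions.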
Approximation of a(w) • Distribution of • Effect of constants c and on performance of APF-PBC