320 likes | 548 Views
Fault-Tolerant Routing: A Genetic Algorithm and CJC. Arjun Rao CS 717 November 18, 2004. Next Paper. [1] Loh, Peter K.K., “Artificial Intelligence Search Techniques as Fault-Tolerant Routing Strategies”
E N D
Fault-Tolerant Routing: A Genetic Algorithm and CJC Arjun Rao CS 717 November 18, 2004
Next Paper • [1] Loh, Peter K.K., “Artificial Intelligence Search Techniques as Fault-Tolerant Routing Strategies” • [2] Loh, Shaw., “A Genetic-Based Fault-Tolerant Routing Strategy for Multiprocessor Networks”
Our Little Problem… • AI search techniques topology- and fault-type independent… • …but non-minimal routes utilized • Follow-up work shows how genetic algorithms (combined with heuristics) can find minimal routes in presence of network faults
Genetic Algorithms: Overview • Optimization strategy • Population of potential solutions evolve over series of generations • Each element of population is chromosome; each unit of chromosome is gene • Chromosomes undergo crossover and mutation • Most fit chromosomes selected for next generation, based upon fitness function
Abstract Model • Same as before (including definitions of S and G) • Pure abstraction suffers from same caveats as before • Basic idea: Instead of AI search for adaptive route, optimize over population of routes to find best
Message Packets • Simplified version:
Chromosome • Route Chromosome • Node on route Gene in chromosome • Length of route Size of chromosome • Chromosome size directly reflects routing performance! • Distance traversed basis of fitness
Mutation and Crossover • Mutation: Swap and/or shift • Normal crossover destroys routes, messes with source and destination; problem w/ different lengths • Use one-point random crossover
Fitness Function • F = (Dmax – Droute) / Dmax + • Dmax:Maximum distance between source and destination • Droute: Distance traveled by specific route • : Predefined value to ensure non-zero fitness • Higher value More fit
Selection Scheme • Roulette Wheel • Sum of fitness values * random value from [0,1] • Select chromosomes until fitness sum greater than product • Tournament Selection • Most fit chromosomes selected • Stochastic Remainder • Probabilities used to select route • Which scheme has best performance selecting optimal route?
Performance Testing • RW, TS, SR tested in concert with RR and HCR (previous algorithm was for RR) • 25-node mesh network • Varying fault percentages • Ten tests, each over 20 generations • Variations in mutation and crossover rates
Performance Testing (cont.) • Randomness of RR bad for rerouting • Unsuccessful routes increase with number of generations • SR + HCR performed best (prevented premature convergence, maintained good diversity)
Next (and Final) Paper • [3] Loh, Schröder, Hsu., “Fault-Tolerant Routing on Complete Josephus Cubes” (not AI/GA-related but interesting nevertheless)
What is a Josephus Cube? • New network topology • Better embedding, communications performance than hypercube topologies • Can be used for processors, node clusters w/ optical channel architectures • In clusters, fault tolerance sacrificed for scalability • Symmetry (which is good for routing) somewhat sacrificed as well
Complete Josephus Cube (CJC) • Augmented (i.e. more links) Josephus Cube • Better fault-tolerance • Routing efficiency maintained! • Properties • Uniform node degree of log2N + 2 • N = 2r, r is order of network • Smaller diameter • Better symmetry • Guaranteed message delivery (w/ up to r + 1 faults) • No deadlocks/livelocks • We examine CJC as node cluster topology
Defining the CJC • CJC(N) is an undirected graph G = (V, E) • Nodes labeled using function J: • J(1) = 1 • J(2i) = 2J(i) – 1, i 1 • J(2i + 1) = 2J(i) + 1, i 1
Defining the CJC (cont.) • V = {u | J(2r) u J(2r + 1 – 1)} • Three types of edges: H, J, C • E = EH EJ EC • (x,y) EH iff H(x,y) = 1, where H is the Hamming distance • (x,y) EJ iff y = x XOR 6 • (x,y) EC iff y = (x), where finds 2’s complement
Data Vectors • Route vector Rv(u) = (Tr+1Tr…T1T0) • Information on paths already traversed • Tr+1 = 1 if |u v| > CEIL(r / 2) else 0 • Tr = 1 if v = u(J), where u(J) is 2’s complement of lowest order bits • Tk = (uk+1 vk+1), 0 k < r
Data Vectors (cont.) • Input link vector Iv(w) = (Lr+1Lr…L1L0) • Li = 0 if message arrives on link i else 1 • Fault status vector Fv(w) = (Sr+1Sr…S1S0) • Sj = 0 if node on link j or link j itself faulty else 1 • Navigation vector Nv(u) = Rv(u) Iv(u) Fv(w) • Designates optimal (or best-possible) paths from current node
CJC-FTROUTE() Algorithm • Extract route vector from packet header • Update current node’s fault status vector • If route vector all 0’s, destination reached • (Re)Initialize input link vector • Compute navigation vector Nv(u) • If enabled C link not faulty, set Rvr+1(u) to 0, take 2’s complement of Rv(u), and route packet • Else, do the same with the J link (link r), except complement Rv0(u) and Rv1(u) only • If both C and J links faulty and/or not enabled, find an H link for routing (or misrouting), set Rvr+1(u) and Rvr(u) to 0, and either route or discard
Routing With Bounded Faults • Theorem: Routing between a node pair (x, y) is optimal if FT < (r + 2) which includes faulty J and C incident links at x • Proof: Let fifaults be encountered at node wi on path of length p. |R(wi)| - fioptimal links left to y; routing from wito wi+1 optimal if link or |R(wi)| - fi> 0. Generalizing to whole path:
Maximum Path Distance • Define FTR(u, v) = Set of edges of path from u to v • Corollary: FTR guarantees path distance of route upper-bounded by H(x, y) if FT < (r + 2) • Corollary: FTR guarantees max distance r in order r cluster • Proof: Follows from above corollary with H(x, y) r
Final Set of Theorems • For sub-optimal routing, FTR guarantees max path distance of 2r – H(x, y) + 2 only if FT < (r + 2) (including disabled C and J links at x) • FTR guarantees deadlock-free routes • FTR guarantees livelock-free routes • Proof of latter two utilizes notion of virtual networks in CJC