Lecture XVII: Distributed Systems Algorithms Inspired by Biology CMPT 431 Dr. Alexandra Fedorova
Problem Statement • Load balancing in telecommunication networks • Calls originate at end nodes and are destined for end nodes • Calls are routed through intermediate switching stations or nodes • Each node has a certain capacity – it can support only a limited number of calls routed through it • There are many possible routes for each call • Routing tables determine the route • If a call is routed via a congested node, it must be dropped • Goal: construct routing tables that minimize the number of dropped calls under changing load conditions
Potential Solutions • Central controller: knows about the entire system, updates routing tables at nodes • Nodes must communicate with the controller • The controller is a single point of failure • Use shortest-path routing • Determine the shortest path from each source to each destination • Construct routing tables to reflect shortest path routes (this can be done because network topology does not change) • This will occupy the fewest nodes for each call, but will not necessarily result in routing along the least congested path • Mobile agents • Software agents (worms) move from node to node. Update routing tables based on their observations of the network
Structure of the Paper • Schoonderwoerd et al. Ant-based load balancing in telecommunications networks • Present a new solution – a new kind of distributed mobile agent • Behaviour inspired by that observed in colonies of ants • Evaluate • A simulated network • Measure the rate of dropped calls • Compare with • A different kind of mobile agent • Static routing table
Inspired by Nature • Ants are silly animals that accomplish sophisticated results as a team • Regulating nest temperature to within 1 °C • Forming bridges • Raiding particular areas for food • Building and protecting their nest • Cooperating in carrying large items • Finding the shortest routes from the nest to a food source • Mobile agents: we want them to be silly (i.e., simple), but to accomplish sophisticated things (load balancing in the communications network)
How Ants Cooperate • Stigmergy – indirect communication through the environment • Ants produce specific actions in response to local environmental stimuli • These actions in turn affect the environment • The modified environmental stimuli affect the actions of the ants that come to that location • Sematectonic stigmergy • An ant produces an environmental change, e.g., deposits a ball of mud • This causes other ants to repeat the action, e.g., deposit another ball of mud • Sign-based stigmergy • Ants deposit pheromones (smelly substances) that cause other ants to behave differently in response to the presence of the pheromones
Example: Laying a Trail (cont.) • Ants lay pheromones as they travel along a trail • A trail’s strength is determined by the amount of pheromones on the trail • The amount of pheromones depends on: • The rate at which pheromones are laid • The amount of pheromones laid – how many ants laid them • How much time has passed since the pheromones were last laid (pheromones evaporate over time) • If many ants follow the same trail, the total amount of pheromones is high, so the trail’s strength is high: • The rate of deposit is high • Pheromone laying is recent
Example: Laying a Trail (cont.) • [Figure: two groups of ants, one started on the left and one started on the right] • The shorter path has more pheromones
ABC: Ant-Based Control • Routing tables are replaced with pheromone tables • Each node in the network has a pheromone table for every other node • Each table has an entry for each neighbour, indicating the probability of using that neighbour as the next hop • Laying pheromones corresponds to updating these probabilities
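A minimal sketch of the pheromone-table layout described on this slide. The four-node topology, the name `make_pheromone_tables`, and the uniform starting probabilities are assumptions made for illustration; the slides do not say how the tables are initialised.

```python
# Hypothetical four-node ring; node numbers and links are illustrative only.
NEIGHBOURS = {
    1: [2, 4],
    2: [1, 3],
    3: [2, 4],
    4: [1, 3],
}


def make_pheromone_tables(neighbours):
    """Build tables[node][destination][neighbour] = next-hop probability."""
    tables = {}
    for node, adjacent in neighbours.items():
        tables[node] = {}
        for destination in neighbours:
            if destination == node:
                continue  # a node keeps no table for itself
            # Assumed starting point: every neighbour is equally likely.
            tables[node][destination] = {n: 1.0 / len(adjacent) for n in adjacent}
    return tables


tables = make_pheromone_tables(NEIGHBOURS)
# tables[1][3] is the table at node 1 for destination 3: one entry per neighbour.
```

Each node holds n-1 tables (one per other node), and each table has one probability per neighbour, so the entries of a single table always sum to 1.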
Updating Pheromone Tables • At every time step ants can be launched from any node in the network • The destination node is random • Ants move from node to node, selecting the next node according to the pheromone tables for their destination node • At each node they update the pheromone table corresponding to their source node • They increase the probability associated with the neighbour from which they arrived
Updating Pheromone Tables (cont.) • [Figure: four nodes, with source node 3, destination node 2, the ant currently at node 1 after arriving from node 4] • Update the routing table at node 1 for node 3: increase by Δp the probability of taking 4 as the next hop
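The update in this example can be sketched as follows. The slides only say that one probability is increased by Δp; dividing every entry by (1 + Δp) afterwards, so that the table still sums to 1, is an assumption of this sketch, as are the name `lay_pheromone` and the starting values.

```python
def lay_pheromone(row, came_from, delta_p):
    """Reinforce `came_from` in one pheromone table and renormalise.

    row maps neighbour -> probability of using it as the next hop.
    """
    updated = {}
    for neighbour, p in row.items():
        if neighbour == came_from:
            updated[neighbour] = (p + delta_p) / (1.0 + delta_p)
        else:
            # Assumed renormalisation: other entries shrink proportionally.
            updated[neighbour] = p / (1.0 + delta_p)
    return updated


# Table at node 1 for node 3; the ant from source 3 arrived via node 4.
row_at_1_for_3 = {2: 0.5, 4: 0.5}
row_at_1_for_3 = lay_pheromone(row_at_1_for_3, came_from=4, delta_p=0.1)
```

After the update, calls and ants at node 1 that are heading for node 3 are more likely to be sent through node 4, the direction the ant came from.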
Ageing and Delaying Ants • Recall the system’s objectives: • Find routes that are short; avoid routes that are congested • This is accomplished by ageing and delaying ants • Ageing ants: • Age: the number of time steps the ant has travelled • Δp (the amount by which you increase the probability) reduces progressively with the age of the ant • This biases the system to “trust” ants who use shorter trails • Delaying ants: • Delay ants at nodes that are congested • Degree of delay correlated with the degree of congestion • This increases the age of ants travelling through congested nodes, so their pheromones have a smaller influence on pheromone tables • Delays updates to pheromone tables leading to congested nodes
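One way to express ageing and delaying as functions. The functional forms and every constant below (`scale`, `floor`, `max_delay`, `decay`) are illustrative assumptions, not values given in the slides; the only properties taken from the slide are that Δp falls with age and that the delay grows with congestion.

```python
import math


def delta_p(age, scale=0.08, floor=0.005):
    """Pheromone increment that shrinks as the ant gets older (assumed form)."""
    return scale / max(age, 1) + floor


def delay_steps(spare_capacity, max_delay=80, decay=0.075):
    """Time steps an ant waits at a node (assumed form).

    The less spare capacity the node has, the longer the ant is held up,
    which in turn raises its age and weakens its later updates.
    """
    return int(max_delay * math.exp(-decay * spare_capacity))
```

With these placeholder constants, an ant waits the full `max_delay` at a node with no spare capacity and only a few steps at a node with plenty of spare capacity.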
Routing Calls in ABC Network • Route call to destination D • At the current node, look up the pheromone table for node D • Choose the neighbour corresponding to the highest probability in the table • Use that node as the next hop • The call is placed if the route is not congested, otherwise the call is dropped
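A sketch of the call-routing rule, under assumptions: `route_call`, the `spare` map of free capacity per node, the `max_hops` loop guard, and the small example tables are all hypothetical.

```python
def route_call(tables, spare, source, destination, max_hops=20):
    """Follow the highest-probability next hop at every node.

    Returns the route as a list of nodes, or None if the call is dropped.
    """
    route = [source]
    node = source
    while node != destination:
        if len(route) > max_hops:
            return None  # assumed guard: treat a routing loop as a dropped call
        row = tables[node][destination]
        node = max(row, key=row.get)  # neighbour with the highest probability
        route.append(node)
    if any(spare[n] <= 0 for n in route):
        return None  # a node on the route is congested: the call is dropped
    return route


# Hypothetical tables for destination 3 only: tables[node][destination][neighbour].
example_tables = {
    1: {3: {2: 0.7, 4: 0.3}},
    2: {3: {1: 0.1, 3: 0.9}},
    4: {3: {1: 0.2, 3: 0.8}},
}
ok_route = route_call(example_tables, {1: 5, 2: 1, 3: 5, 4: 5}, 1, 3)
dropped = route_call(example_tables, {1: 5, 2: 0, 3: 5, 4: 5}, 1, 3)
```

Unlike ants, calls are routed deterministically, so in the second case the call is dropped even though the route through node 4 has capacity; it is the ants' job to shift the probabilities away from node 2.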
Potential Problems • Blocking problem • An available route is suddenly blocked • It may take a while to find a new route • Shortcut problem • A better route becomes available • It may take a while to adapt to the new route
Solving Blocking And Shortcut Problems • Add a noise factor to the ants' movement protocol • With probability f an ant chooses a random path • This ensures that • Seemingly useless routes are still used occasionally (so they can be rediscovered if they suddenly become good) • A new route is discovered more rapidly (if it becomes available)
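The noise factor can be sketched as a wrapper around the ant's next-hop choice. The function name `choose_next_hop` and the `rng` parameter are assumptions; `noise` plays the role of f from the slide.

```python
import random


def choose_next_hop(row, noise, rng=random):
    """Pick an ant's next hop from one pheromone table.

    With probability `noise` the choice is uniformly random; otherwise it
    follows the probabilities stored in the table.
    """
    neighbours = list(row)
    if rng.random() < noise:
        return rng.choice(neighbours)  # exploration: ignore the pheromones
    weights = [row[n] for n in neighbours]
    return rng.choices(neighbours, weights=weights, k=1)[0]


rng = random.Random(0)
row = {2: 1.0, 4: 0.0}
without_noise = {choose_next_hop(row, 0.0, rng) for _ in range(200)}
with_noise = {choose_next_hop(row, 1.0, rng) for _ in range(200)}
```

Without noise, a neighbour whose probability has decayed to zero is never tried again; with noise, it is still visited now and then, which is what lets a blocked or newly available route be noticed.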
ABC: Putting it All Together • Ants are regularly launched from every part of the system, with random destinations • Ants walk according to the probabilities in the pheromone tables for their destination • Ants update the probabilities in the pheromone table for their source location • They increase the probability of selecting their previous node on the path as the next hop (to their source node) • The increase in probability is a decreasing function of the ant’s age • The ants are delayed on parts of the system that are congested
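The pieces above can be combined into a rough sketch of a single ant's journey. Everything specific here is an assumption: the four-node ring, the uniform initial tables, the constants in the Δp formula, the renormalisation by (1 + Δp), and the noise value. Congestion delays are left out to keep the sketch short; they would simply add extra steps to `age`.

```python
import random

# Hypothetical four-node ring used only for this illustration.
NEIGHBOURS = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [1, 3]}


def make_tables(neighbours):
    """tables[node][destination][neighbour] = next-hop probability (uniform start)."""
    tables = {}
    for node, adjacent in neighbours.items():
        tables[node] = {}
        for destination in neighbours:
            if destination != node:
                tables[node][destination] = {n: 1.0 / len(adjacent) for n in adjacent}
    return tables


def run_ant(tables, source, destination, rng, noise=0.05, max_steps=1000):
    """Walk one ant from source to destination, laying pheromone as it goes."""
    node, age = source, 0
    while node != destination and age < max_steps:
        row = tables[node][destination]
        options = list(row)
        if rng.random() < noise:
            next_node = rng.choice(options)  # noise: random exploration
        else:
            weights = [row[n] for n in options]
            next_node = rng.choices(options, weights=weights, k=1)[0]
        previous, node = node, next_node
        age += 1
        if node == source:
            continue  # the source keeps no table for itself
        dp = 0.08 / age + 0.005  # assumed form: older ants lay less pheromone
        back = tables[node][source]  # table at this node for the ant's source
        for n in back:
            boosted = back[n] + dp if n == previous else back[n]
            back[n] = boosted / (1.0 + dp)  # keep the table summing to 1
    return age


rng = random.Random(42)
tables = make_tables(NEIGHBOURS)
nodes = sorted(NEIGHBOURS)
ages = []
for _ in range(200):
    source, destination = rng.sample(nodes, 2)
    ages.append(run_ant(tables, source, destination, rng))
```

Note that an ant reads the tables for its destination but writes to the tables for its source, so it improves routes for traffic travelling in the opposite direction.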
Other Mobile Agents • Mobile software agent • Load management agent • Parent agent • Travels from node to node • Updates routing table to find the least congested route • Two variations: • Largest minimum capacity (LMC) • Minimum sum of squared utilizations (MSSU)
Network Simulation • A software simulator • Node representation: • A node ID • A capacity – the number of simultaneous calls that the node can handle (40) • Probability of being the end node (source or destination of a call) • Spare capacity • Routing table with n-1 entries, one for every other node • [Figure: nodes A, B, C and D, with the routing table at node C]
Network Simulation (cont.) • Calls are generated by a traffic generator • Call parameters: source node, destination node, call duration (170 time steps average) • Call is routed using routing tables, spare capacity of intermediate nodes is reduced • If there is no spare capacity on the route, the call will fail
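The capacity bookkeeping described on these two slides could look roughly like this. The class and function names are hypothetical; only the default capacity of 40 comes from the slides.

```python
class Node:
    """A simulated switching node with a fixed call capacity."""

    def __init__(self, node_id, capacity=40):
        self.node_id = node_id
        self.capacity = capacity  # simultaneous calls the node can handle
        self.spare = capacity     # capacity not yet taken by active calls


def place_call(nodes, route):
    """Reserve capacity on every node of the route; return False if the call fails."""
    if any(nodes[n].spare <= 0 for n in route):
        return False  # no spare capacity somewhere on the route
    for n in route:
        nodes[n].spare -= 1
    return True


def release_call(nodes, route):
    """Free the reserved capacity once the call's duration has elapsed."""
    for n in route:
        nodes[n].spare += 1


# Tiny example with capacity 1 so that the second call fails.
nodes = {i: Node(i, capacity=1) for i in (1, 2, 3)}
first = place_call(nodes, [1, 2, 3])
second = place_call(nodes, [1, 2])
release_call(nodes, [1, 2, 3])
third = place_call(nodes, [1, 2])
```

A traffic generator would call `place_call` when a call starts and schedule `release_call` for the time step at which the call ends.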
Experimental Setup • Call probability set: a particular distribution of calls • Adaptation period: run a load balancing mechanism • Test period: measure network performance for the number of dropped calls
Results: Percentage of Dropped Calls • What do these numbers indicate? • Which load balancing method performed the best?
Results (cont.) • Percentage of failed calls after stopping load balancing (call probabilities remain unchanged) • What does this tell us about the system?
Summary • In general, ants performed better than the other mobile agents • The ABC system stores information not only about good current routes, but also about good recent alternative routes • This allows it to adapt quickly to changes in network conditions • Ants consume fewer network resources than mobile agents (ants don’t need to store information about all the nodes visited) • Ants can work concurrently without affecting each other; only one mobile agent can be active at once • The failure of an ant does not hurt the system, because other ants will update the pheromone tables; the failure of a mobile agent affects the launching of future agents, so the failure has to be detected