Large-scale adaptive systems

Large-scale adaptive systems Lecture 4: Biology-inspired algorithms. Dr. Stefan Dulman s.o.dulman@tudelft.nl

Review previous lecture • Basic gossip algorithm • Applications • Firefly synchronization • Cooperation in selfish environments • Random sampling • Formation creation Large-scale adaptive systems, 2010

Related work • This lecture is based on the following material: • Design Patterns from Biology for Distributed Computing, Ozalp Babaoglu et al., ACM Transactions on Autonomous and Adaptive Systems, 2006

Design focus • Large-scale networks • Wide area computer networks • Mobile ad hoc networks, etc. • Characteristics: dynamic, unreliable, large-scale • “Nice” properties of biological systems • Self-organization of large numbers of components • Robustness to failure of individual components • Adaptivity to changing conditions • Lack of reliance on explicit central coordination http://www.artfutura.org/02/05jansen_en.html

Design patterns • “recurring solution to a standard problem” • Schmidt et al. 1996 • From extensive experience in solving classes of problems • Biology offers a large number of problems & solutions • Surviving organisms already include solutions • Same solution found in different species • Large scale distributed environments are a common characteristic • Study and use them as design patterns! Rolf Pfeifer and Christian Scheier – Understanding intelligence (MIT Press, 1999)

Common context • Design pattern • Name, context, problem, solution, example, design rationale • Context • Defined by the system model (participants, interactions, …) • Family of patterns • Biology vs. large-scale networks • Share commonalities in the communication structure • Allows “importing” of design patterns • Nodes, neighbors, asynchronous messages • Unreliable communication, topologies

Evaluation • Living systems exhibit “nice properties” • Insensitivity • More specifically • Evaluation based on “figures of merit” and interaction with environment • Properties: • Scalability, robustness, adaptivity

Lecture 4: Overview • Introduction • Mechanisms • Plain diffusion – aggregation • Replication – searching • Stigmergy – routing • Chemotaxis – load balancing • Summary

Plain diffusion pattern • Problems • Assume: each node holds a numeric value • P1 – bring the network to the state in which all nodes hold the same value (the mean of the initial values) • P2 – create a gradient on links indicating the differences in values • Solution • Mass-conserving communication (see previous lecture) • For each link, each node sends a portion of its own value • Averaging occurs naturally over time • The flow on each link creates the gradient
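A minimal sketch of the solution above, assuming synchronous rounds on an undirected graph given as an edge list. The name `diffusion_step` and the rate `alpha` are illustrative and do not come from the lecture; `alpha` is assumed small enough (alpha times the maximum node degree below 1) for the iteration to settle.

```python
def diffusion_step(values, edges, alpha=0.2):
    """One synchronous diffusion round.

    values: dict node -> float, edges: list of (u, v) pairs (undirected).
    alpha is a hypothetical diffusion rate; keep alpha * max_degree < 1.
    Returns the new values and the flow observed on every link.
    """
    new_values = dict(values)
    flows = {}
    for u, v in edges:
        # Mass moves from the larger value to the smaller one.
        flow = alpha * (values[u] - values[v])
        flows[(u, v)] = flow          # P2: the per-link flow is the gradient
        new_values[u] -= flow         # whatever leaves u ...
        new_values[v] += flow         # ... arrives at v, so mass is conserved
    return new_values, flows


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    state = {0: 8.0, 1: 0.0, 2: 0.0, 3: 0.0}
    for _ in range(50):
        state, link_flows = diffusion_step(state, ring)
    print(state)  # every node approaches the mean of the initial values
```

Because every unit subtracted from one endpoint is added to the other, the sum of all values never changes, which is why the common value reached under P1 is the mean.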

Plain diffusion pattern • Design rationale • Diffusion process found in biological systems • Equalizing concentrations of chemicals, heat, electrical potentials • Concrete notions vs. abstract notions… • Very efficient in the context of both problems • Converging to a mean value (if mass is preserved) • Creating gradients (when mass is not preserved) • Possible mapping • Node – portion of space • Neighbor – topology of the space (2d or 3d grid) • Message – the actual material sent around (nonnegative number)

Algorithm • PushSum – described in previous lecture • Main idea: keep the sum of all exchanged mass in a system constant
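As a reminder of the mass-conservation idea, here is a small push-sum sketch. It assumes synchronous rounds and a uniformly random neighbour choice; the function name `push_sum`, the round count and the seed are illustrative choices, not details taken from the lecture.

```python
import random


def push_sum(values, neighbors, rounds=100, seed=0):
    """Gossip-based averaging that keeps the total mass constant.

    values: dict node -> float, neighbors: dict node -> list of nodes.
    Each node holds a pair (s, w); its estimate of the average is s / w.
    Returns (estimates, total_s, total_w).
    """
    rng = random.Random(seed)
    s = dict(values)
    w = {node: 1.0 for node in values}
    for _ in range(rounds):
        next_s = {node: 0.0 for node in values}
        next_w = {node: 0.0 for node in values}
        for node in values:
            target = rng.choice(neighbors[node])
            half_s, half_w = s[node] / 2, w[node] / 2
            # Keep one half, push the other half to a random neighbour.
            next_s[node] += half_s
            next_w[node] += half_w
            next_s[target] += half_s
            next_w[target] += half_w
        s, w = next_s, next_w
    estimates = {node: s[node] / w[node] for node in values}
    return estimates, sum(s.values()), sum(w.values())


if __name__ == "__main__":
    nodes = list(range(8))
    full = {n: [m for m in nodes if m != n] for n in nodes}
    est, total_s, total_w = push_sum({n: float(n) for n in nodes}, full)
    print(est[0], total_s, total_w)
```

The two totals returned at the end are the invariants: they equal the initial sum of values and the number of nodes regardless of which neighbours were chosen.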

Convergence speed vs. scaling

Effects of errors

Effects of directed links • NetLogo simulation • Models Library: • Networks: • Diffusion on a Directed Network

Replication pattern • Problem: • Assume each node holds a specific value • P1: the value of a given node must reach the whole network (update) • P2: all nodes must hold the maximum of all values (max) • P3: find a node holding a specific value (query) • Solution • Flooding-like mechanism combined with filters on each node • Numerous optimizations possible: • Topology, number of times information is received • Timing of received information • …
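A possible sketch for problem P2 (max): flooding combined with a per-node filter that forwards a value only when it is news to the receiver. The message queue stands in for asynchronous delivery, and `flood_max` is an illustrative name; the graph is assumed to be connected.

```python
from collections import deque


def flood_max(values, neighbors):
    """Spread the maximum value by filtered flooding.

    values: dict node -> comparable value, neighbors: dict node -> list.
    Returns the final value at each node and the number of messages handled.
    """
    current = dict(values)
    # Initially every node announces its own value to all neighbours.
    queue = deque((n, nb, current[n]) for n in values for nb in neighbors[n])
    messages = 0
    while queue:
        sender, receiver, value = queue.popleft()
        messages += 1
        if value > current[receiver]:      # filter: drop anything not larger
            current[receiver] = value
            for nb in neighbors[receiver]:
                if nb != sender:           # no need to echo back to the sender
                    queue.append((receiver, nb, value))
    return current, messages


if __name__ == "__main__":
    line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    final, count = flood_max({0: 3, 1: 9, 2: 1, 3: 4}, line)
    print(final, count)
```

The filter is what keeps the replication bounded: a node's value only ever increases, so each node re-forwards a finite number of times.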

Replication pattern • Design rationale • Replication is largely available in nature • Ex: growth processes, epidemic spreading, signal propagation in certain neural networks, proliferation processes in immune systems • Possible mapping • Node – potential host of a virus • Neighbor – physical proximity, social contact • Message – the virus information; may suffer mutations

Example: searching • MANETs are equivalent to distributed databases • Querying implies replication of the query at each node • Launching a query costs resources • Goals: high efficiency and low overhead • Structured overlay networks • Distributed hash tables – position of the node links it to information • Unstructured overlay networks • No relationship between a node and its content • Easy to maintain, highly robust to failures and churn • Searching mechanisms • Flooding (unbridled replication), k-random walkers, proliferation

Proliferation (controlled replication) • Mechanism is inspired by the humoral immune system • B cells generate antibodies when stimulated by an antigen • Proliferation increases the number of antibodies chasing antigens • Idea: • Antigens = items to be found • Antibodies = queries • Algorithms • Random walk (RW) • Query is forwarded to a random neighbor • Proliferation (P) • Proliferation controlling function – computes the number of copies • The number of copies grows with the correlation between node content and query • Restricted versions (RRW, RP)

Simulation model • Target: peer-to-peer network, Erdős–Rényi model • 10000 nodes, average node degree = 4 • Data distribution • Files = collection of keywords (2000 in total) • Each node holds a (probabilistic) number of keywords • D = {(δ1, n1), (δ2, n2), · · · } – data at each node • Query distribution • Query is a (probabilistic) set of keywords Q={q1, q2, · · · } • Definitions • Number of hits – how many keywords from a query are on a node • Proliferation controlling function – function of the correlation Q, D
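The definitions above can be illustrated with a small sketch. The slides do not give the form of the proliferation controlling function, so `num_copies` below is a hypothetical choice: one copy when nothing matches (random-walk behaviour), growing linearly to one copy per neighbour when every query keyword is present. All function names are illustrative, and the query is assumed to be non-empty.

```python
import random


def hits(query, node_keywords):
    """Number of query keywords present on a node."""
    return len(set(query) & set(node_keywords))


def num_copies(query, node_keywords, degree):
    """Hypothetical proliferation controlling function (an assumption)."""
    if degree == 0:
        return 0
    h = hits(query, node_keywords)
    # Integer arithmetic: 1 copy for no hits, `degree` copies for a full match.
    return 1 + (h * (degree - 1)) // len(query)


def proliferation_search(query, data, neighbors, start, steps, seed=0):
    """Forward query packets for `steps` time steps; return coverage and hits."""
    rng = random.Random(seed)
    packets = [start]
    visited = {start}
    total_hits = hits(query, data[start])
    for _ in range(steps):
        next_packets = []
        for node in packets:
            k = num_copies(query, data[node], len(neighbors[node]))
            for target in rng.sample(neighbors[node], k):
                next_packets.append(target)
                if target not in visited:
                    visited.add(target)
                    total_hits += hits(query, data[target])
        packets = next_packets
    return visited, total_hits


if __name__ == "__main__":
    star = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
    data = {"c": ["a", "b"], "x": ["a"], "y": [], "z": ["b"]}
    print(proliferation_search(["a", "b"], data, star, "c", steps=1))
```

Setting `num_copies` to always return 1 turns the same loop into the plain random walk (RW) that proliferation is compared against.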

Network coverage parameters • Details • Simulations run 1000 times, average values presented on the graph • Fairness criterion applied • Parameters of the random walk are tuned so that its characteristics match those of proliferation

Network coverage & search efficiency • See the article, pages 17-19

Search efficiency parameters • Setup • 100 individual searches repeated 100 times • One set of 100 experiments is called a generation • Each search is allowed to go for 50 time steps • Average hit rate computed

Network coverage & search efficiency • See the article, pages 17-19

Discussion • Immune-system-inspired technique • More efficient than random walk and flooding • More cost-effective at covering large portions of the network • Other applications feasible: broadcasting and multicasting, etc. • Ongoing work – improved flooding procedure

Stigmergy pattern • Problem • P1: find the shortest path between two nodes in a network • P2: redistribute items across the nodes such that similar elements cluster on the same nodes • Solution • Use of the stigmergy principle applied to local variables • Local decision policy is based on these variables • Periodic updates that locally reinforce good decisions • Distributed reinforcement learning processes • Difference with nature • Nature: mobile agents passing and marking a passive environment • Engineering: active nodes (environment) communicate passive messages

Stigmergy pattern • Design rationale • Huge range of distributed self-organizing behaviors • Insects (nest building, labor division, path finding) to humans • Possible mapping (ant colony) • Node – idealized portion of space where pheromone is deposited • Neighbor – probability of ants to move between nodes • Messages – the ants themselves (figures: WEBSOM, word cloud)

Routing in MANETs • Challenges • Directing data flows from source to destination (S-D) while maximizing network performance • Dynamic topologies, churn, etc. • Possible solution • The usage of routing tables (discovery, maintenance) • Use stigmergy to maintain the routing tables • Example: AntHocNet [Di Caro 2005] • Several mechanisms at work • Stigmergy – adaptively discover routing paths • Diffusion – learning of the best paths • Replication – in different control phases of the protocol

Inspiration from biology • Stigmergy via the use of pheromone • Several paths discovered between source and destination • Ants drop pheromone on the paths • Positive feedback controller • Step 1: Shortest path is traversed by ants more frequently • Increased pheromone level • Step 2: Increased pheromone level attracts more ants • Increased pheromone level • Result: almost all ants take the shortest path • Similar ideas used in the context of Ant Colony Optimization
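The positive feedback loop can be illustrated with a deterministic, expected-value model rather than individual random ants. Everything numeric here is an assumption made for illustration: the evaporation rate, the number of ants per round, and the deposit rule (pheromone proportional to traffic divided by path length).

```python
def simulate_pheromone(lengths, rounds=200, evaporation=0.1, ants=100):
    """Expected pheromone shares on parallel S-D paths after `rounds` rounds.

    lengths: list of path lengths. Ants pick a path with probability
    proportional to its pheromone; shorter paths are completed more often
    per unit time, modelled as a deposit of (traffic / length).
    """
    pheromone = [1.0] * len(lengths)          # start unbiased
    for _ in range(rounds):
        total = sum(pheromone)
        shares = [p / total for p in pheromone]
        for i, length in enumerate(lengths):
            deposit = ants * shares[i] / length          # step 1
            pheromone[i] = (1 - evaporation) * pheromone[i] + deposit
        # step 2 happens implicitly: next round's shares use the new levels
    total = sum(pheromone)
    return [p / total for p in pheromone]


if __name__ == "__main__":
    print(simulate_pheromone([2.0, 4.0]))  # share of the short path dominates
```

In this model the ratio between the short and the long path grows every round, which is the "almost all ants take the shortest path" outcome; evaporation is what lets stale paths fade.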

Effects of directed links • NetLogo simulation • AC-based routing

AntHocNet algorithm • AntHocNet • Reactive protocol • Nodes gather information only about the S-D pairs they are interested in • Proactive protocol • Nodes continuously check, repair and improve the routes • Protocol at a glance • Nodes periodically send messages (ants) to sample and reinforce good paths • Routing info is kept in pheromone tables (followed/updated) • Diffusion is used to exchange pheromone tables (speed-up) • Failures are dealt with locally or end-to-end

Components • Routing tables as stigmergic variables • Goodness of path: combines number of hops, end-to-end delay, channel quality • Rows in the table: goodness of the path S-D via neighbor i • Reactive path setup • Assume S does not have D in its routing table • An ant is generated at S • It is broadcast by nodes that do not have D in their table • It is unicast by nodes that have D in their table • Each ant stores the path it takes • The first ant to reach D is sent back • Paths are reinforced in the tables: delay, hop count, RSSI • A timer at S detects a failed setup and restarts the process

Components • Proactive path maintenance and exploration • Basic idea • Proactive ants are sent around periodically to test the entries in the tables • Drawback: discovering new paths takes time and energy • Another idea • Nodes periodically broadcast the destinations and pheromone levels in their tables • Diffusion spreads the information around • Bootstrapped pheromone info – “virtual” pheromone • Similar to the Bellman-Ford routing mechanism • Fast and efficient, but not very accurate; may contain routing loops • Proactive ants check bootstrapped paths and mark them active
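The Bellman-Ford-like bootstrapping step might look as follows. This is a sketch under stated assumptions: pheromone is treated as the inverse of a path cost, so a neighbour's advertised value is combined with the cost of the hop to that neighbour. The function names and the exact combination rule are illustrative, not quoted from the slides.

```python
def bootstrapped_pheromone(advertised, link_cost):
    """Virtual pheromone for reaching a destination via one neighbour.

    advertised: neighbour's best pheromone to the destination (> 0).
    link_cost: cost of the hop to that neighbour (> 0).
    Assumption: pheromone = 1 / cost, so costs add along the path.
    """
    return 1.0 / (1.0 / advertised + link_cost)


def update_virtual_table(virtual, neighbor, advertisement, link_cost):
    """Merge one periodic broadcast into the virtual pheromone table.

    virtual: dict destination -> dict neighbour -> virtual pheromone.
    advertisement: dict destination -> pheromone reported by `neighbor`.
    Entries stay 'virtual' until a proactive ant has verified the path.
    """
    for destination, pheromone in advertisement.items():
        virtual.setdefault(destination, {})[neighbor] = bootstrapped_pheromone(
            pheromone, link_cost
        )
    return virtual


if __name__ == "__main__":
    table = update_virtual_table({}, "j", {"D": 0.5}, link_cost=2.0)
    print(table)  # {'D': {'j': 0.25}}
```

Keeping these values in a separate table reflects the caveat on the slide: the estimates are cheap but may describe loops, so they only guide where proactive ants are sent.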

Components • Stochastic data routing • Assume: • Node receives a piece of data to route to D • Node knows several paths to D • Node picks the path for each piece of data probabilistically • The probability is a function of the goodness • Data may be sent several times • Automatic load balancing is achieved in the network
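A sketch of the probabilistic forwarding rule, assuming the probability of a next hop is its goodness raised to a power `beta` and then normalised. The exponent and the function names are illustrative assumptions; a larger `beta` concentrates data on the best paths, while `beta = 1` spreads it in direct proportion to goodness.

```python
import random


def next_hop_probabilities(goodness, beta=2.0):
    """Map each candidate next hop to its selection probability.

    goodness: dict next_hop -> positive goodness value for destination D.
    beta: hypothetical exponent controlling how greedy the choice is.
    """
    weights = {hop: g ** beta for hop, g in goodness.items()}
    total = sum(weights.values())
    return {hop: w / total for hop, w in weights.items()}


def choose_next_hop(goodness, rng, beta=2.0):
    """Draw one next hop for a data packet according to the probabilities."""
    probabilities = next_hop_probabilities(goodness, beta)
    hops = list(probabilities)
    return rng.choices(hops, weights=[probabilities[h] for h in hops], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(1)
    table = {"a": 3.0, "b": 1.0}
    print(next_hop_probabilities(table))
    print([choose_next_hop(table, rng) for _ in range(10)])
```

Because weaker paths still receive a share of the packets, traffic is spread over several routes, which is where the automatic load balancing comes from.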

Simulation setup • Assumptions • 100 nodes deployed in a 2400m x 800m area • Radio communication range approx. 250m • Experiments run for 900 seconds • Constant bit rate data generation (20 src, 4 packets/sec) • IEEE 802.11b DCF protocol • Random waypoint model

Results (1)

Discussion • AntHocNet • Shows very good scalability and performance compared to the state of the art • Robust behavior in different environments • Shows that ideas from biology can lead to good results • But… • Engineering problems are not identical to the ant colony in nature • Dynamics in mobility lead to a specially tailored “goodness” • Stochastic routing and exploration tailored to needs • Resource constraints lead to employment of diffusion • Combination of mechanisms can lead to powerful solutions

Composite design patterns • Chemotaxis • Chemo – signal diffusion, Taxis – follow the gradient • The context is extended with the presence of plain diffusion • Problem: find the shortest path from a node to a region where concentration is maximal • Solution: follow the maximal gradient • Design rationale: micro-organisms follow gradients (patterns) • Signal needs to propagate faster than organisms can move • Reaction-diffusion processes
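Gradient following itself is a purely local rule, sketched below under the assumption that each node can read the current concentration at its neighbours. The sketch climbs toward higher concentration; descending works the same way with the comparison reversed. `follow_gradient` is an illustrative name.

```python
def follow_gradient(start, concentration, neighbors):
    """Walk from `start`, always stepping to the neighbour with the
    highest concentration, until no neighbour is higher (a local peak).

    concentration: dict node -> float (assumed already diffused).
    neighbors: dict node -> list of nodes.
    Returns the list of visited nodes.
    """
    path = [start]
    node = start
    while True:
        best = max(neighbors[node], key=lambda n: concentration[n], default=node)
        if concentration[best] <= concentration[node]:
            return path               # no uphill neighbour left
        node = best
        path.append(node)


if __name__ == "__main__":
    line = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    signal = {0: 0.1, 1: 0.4, 2: 0.7, 3: 1.0}
    print(follow_gradient(0, signal, line))  # [0, 1, 2, 3]
```

The walk only terminates at the global peak if the diffused signal has no other local maxima, which is one reason the signal must spread faster than the walker moves.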

Load balancing application • Idea • Use a diffusing signal to direct load to achieve an even distribution • Straightforward approach • Each node having more load than capacity asks its neighbors to take a fraction of the load • Diffusion spreads load over the area • Inefficient – diffusion is slow, traffic may be offloaded to already loaded areas, etc. • Novel approach • Use two diffusions in parallel • First: diffuse a signal denoting local load (one number) • Second: diffuse the traffic itself against the gradient

Algorithms • Two algorithms presented in the paper • “version-6” – the diffusion constant is fixed • “version-10” – the diffusion constant adapts to the degree of the nodes • They are quite similar • Care is taken to prevent negative load • Signal is diffused “normally” – negative values can happen • Load is transferred using gradients from signals • Assumption: signal diffuses much faster than load
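The sketch below illustrates the two coupled diffusions in general terms. It is neither “version-6” nor “version-10”: the rates, the number of signal rounds per load step, and all names are assumptions chosen for illustration. The signal is a smoothed copy of the load, and load then moves down the signal gradient, with a cap that prevents negative load.

```python
def smooth_signal(load, edges, rate=0.1, rounds=2):
    """Fast diffusion: start from the local load and diffuse it a few rounds."""
    signal = dict(load)
    for _ in range(rounds):
        nxt = dict(signal)
        for u, v in edges:
            flow = rate * (signal[u] - signal[v])
            nxt[u] -= flow
            nxt[v] += flow
        signal = nxt
    return signal


def move_load(load, signal, edges, rate=0.1):
    """Slow diffusion: shift load from high-signal to low-signal nodes."""
    new_load = dict(load)
    for u, v in edges:
        diff = signal[u] - signal[v]
        src, dst = (u, v) if diff > 0 else (v, u)
        # Cap the transfer so that no node ends up with negative load.
        amount = min(rate * abs(diff), new_load[src])
        new_load[src] -= amount
        new_load[dst] += amount
    return new_load


def balance(load, edges, steps=200):
    """Alternate signal diffusion and load transfer for `steps` steps."""
    for _ in range(steps):
        signal = smooth_signal(load, edges)
        load = move_load(load, signal, edges)
    return load


if __name__ == "__main__":
    ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(balance({0: 8.0, 1: 0.0, 2: 0.0, 3: 0.0}, ring))
```

Running several signal rounds per load step is the sketch's stand-in for the assumption that the signal diffuses much faster than the load; smoothing the signal too aggressively, however, hides short-range load differences and slows balancing.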

Simulation setup • Setup • Graph with 10000 nodes • Power-law connectivity – max node degree = 2200 • 10000 units of load • At each moment in time, one load unit is placed on a random node

Results

Discussion • Combination of two mechanisms • Fast diffusion of signals • Slow diffusion of traffic • Equilibrium must be established to achieve convergence • A differential-equation model allows it to be computed • Attention needs to be paid to “secondary” parameters • Initial distribution of the load (see v10 on previous graph)

Lecture 4: Summary • Mechanisms • Plain diffusion – aggregation • Replication – searching • Stigmergy – routing • Chemotaxis – load balancing
