On Fault Tolerance in Wireless Ad Hoc Networks Seth Gilbert Nancy Lynch Celebration, 2008
Nancy Lynch [timeline: late 1980's (?), 1994, 1997, 2002-2008]
[Timeline figure, 1980-2008]
• FLP: Impossibility of distributed consensus with one faulty process
• DLS: Consensus in the Presence of Partial Synchrony
• LT: An Introduction to Input / Output Automata
• Themes: consistency, fault tolerance, replication, timing, simulation relations, invariant-based arguments, formal methods
• Increasingly complex, increasingly dynamic: group communication / membership, publish / subscribe, peer-to-peer systems, wireless ad hoc networks
The Virtual Infrastructure Project
Papers:
• GeoQuorums: Implementing Atomic Memory in Mobile Ad Hoc Networks, DGLSW, DISC'03, DC'05
• Virtual Mobile Nodes for Mobile Ad Hoc Networks, DGLSSW, DISC'03
• Consensus and Collision Detectors in Wireless Ad Hoc Networks, CDGNN, PODC'05, DC'08
• Timed Virtual Stationary Automata for Mobile Networks, DGLLN, Allerton'05, OPODIS'05
• Autonomous Virtual Mobile Nodes, DGSSW, DIALM-POMC'05
• A Middleware Framework for Robust Applications in Wireless Ad Hoc Networks, CDGN, Allerton'05
• Reconciling the Theory and Practice of Unreliable Wireless Broadcast, CDGLNN, ADSN'05
• Self-Stabilizing Mobile Node Location Management and Message Routing, DLLN, SSS'05
• Motion Coordination Using Virtual Nodes, LMN, CDC'05
• The Virtual Node Layer: A Programming Abstraction for Wireless Sensor Networks, BGLNNS, WWWSNA'07
• A Virtual Node-Based Tracking Algorithm for Mobile Networks, NL, ICDCS'07
• Self-Stabilization and Virtual Node Layer Emulations, NL, SSS'07
• Secret Swarm Unit: Reactive k-Secret Sharing, DLY, IndoCrypt'07
• Virtual Infrastructure for Collision-Prone Wireless Networks, CGL, PODC'08
Theses:
• Virtual Infrastructure for Wireless Ad Hoc Networks, G, PhD 2007
• Air Traffic Control Using Virtual Stationary Automata, B, MEng 2007
• Simulation and Evaluation of the Reactive Virtual Node Layer, S, MEng 2008
• Virtual Stationary Timed Automata for Mobile Networks, N, PhD 2008
In Progress:
• Self-Stabilizing Robot Formations over Unreliable Networks, GLMN
• Using Virtual Infrastructure to Adapt Wireline Protocols to MANET, W
• Virtual Infrastructure Routing for Mobile Ad Hoc Networks, DN
Wireless Ad Hoc Networks. Scenarios: • Sensor networks • Social networks • Coordinated applications
Wireless Ad Hoc Networks. Scenarios: • Sensor networks (environmental monitoring, intrusion detection, border monitoring, fire detection) • Social networks • Coordinated applications
Wireless Ad Hoc Networks. Scenarios: • Sensor networks • Social networks (messaging, conferences / events, HikingNet, TrafficNet) • Coordinated applications
Wireless Ad Hoc Networks. Scenarios: • Sensor networks • Social networks • Coordinated applications (emergency response & military, firefighting, police response, terrorism)
Wireless Ad Hoc Networks. Scenarios: • Sensor networks • Social networks • Coordinated applications
Wireless ad hoc networks are really hard to use. • Unreliable communication • Unknown availability • Noise • Lost Messages • Collisions • Unknown topology • Fault prone • Dynamic • Unknown participants
Fixed Infrastructure • Deploy: • Base stations • Cell towers • Servers • Problems: • Too expensive • Not feasible
Network Layers [stack diagram]: Applications → Services → Middleware → Wireless Ad Hoc Network
Network Layers [stack diagram]: Applications → Routing / Tracking services → Virtual Infrastructure → Wireless Ad Hoc Network
Building Virtual Infrastructure • Each participant is a replica. • Replicas execute a consistency protocol. • Leader / backup: the leader sends & receives messages for the virtual node.
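A minimal sketch of this replica-based emulation, in Python (the language of the custom simulator used later in the talk). The class and callback names are hypothetical and failure handling is elided; the point is only that every replica applies the same deterministic transition, while the leader alone speaks for the virtual node.

```python
# Hypothetical sketch of leader-based virtual node emulation.
# Each nearby participant keeps a replica of the virtual node's state;
# only the current leader sends and receives on the virtual node's behalf.

class VirtualNodeReplica:
    def __init__(self, device_id, app_step, send):
        self.device_id = device_id    # physical device hosting this replica
        self.app_step = app_step      # deterministic application transition
        self.send = send              # wireless broadcast primitive
        self.state = {}               # replicated virtual node state
        self.is_leader = False        # set by the leader-election protocol

    def on_message(self, msg):
        # Every replica applies the same transition in the same order,
        # so replicas stay consistent while they see the same messages.
        self.state, outgoing = self.app_step(self.state, msg)
        if self.is_leader:
            # Only the leader emits the virtual node's outgoing messages.
            for out in outgoing:
                self.send(out)

# Example: a virtual node that counts the messages it has seen.
def counter_step(state, msg):
    count = state.get("count", 0) + 1
    return {"count": count}, [f"ack {count}"]

replica = VirtualNodeReplica("dev1", counter_step, send=print)
replica.is_leader = True
replica.on_message("hello")   # prints "ack 1"
```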
Today’s Questions • What is virtual infrastructure? • What can you do with it? • Dynamic distributed coordination. • Air traffic control • Does it really work? • Two simulation studies: routing and address allocation.
Dynamic Distributed Coordination • Challenging problem: • Highly dynamic environment • Unreliable network • Safety-critical applications • Ideal for Virtual Infrastructure solution: • Static overlay • Simpler, verifiable algorithms • Fate-sharing
Dynamic Distributed Coordination • Note: • Number of (non-failed) robots unknown. • Location of other robots unknown. • Pattern may change over time.
Dynamic Distributed Coordination. In each round: (1) All robots stop. (2) All robots send location info. (3) Coordinators exchange info. (4) Coordinators calculate. (5) Coordinators send out targets. (6) Robots move to target.
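A rough Python skeleton of one such round from a regional coordinator's point of view. The helper and variable names are assumptions, and the target calculation is only a placeholder for the rules on the next slides.

```python
# Skeleton of one coordination round, as seen by a regional coordinator.
# assign_targets is a placeholder; the real redistribution rules follow.

def assign_targets(locations, curve_points):
    # Placeholder: hand out this region's curve points round-robin.
    return {rid: curve_points[i % len(curve_points)]
            for i, (rid, _pos) in enumerate(locations)}

def coordination_round(robot_reports, curve_points):
    # Steps 1-2: robots stop and report (id, position) to the coordinator.
    locations = list(robot_reports)
    # Step 3: coordinators exchange summaries with neighboring regions (elided).
    # Step 4: the coordinator computes a target point for each robot.
    targets = assign_targets(locations, curve_points)
    # Steps 5-6: targets are sent back and each robot moves toward its target.
    return targets

# Example: three robots, two curve points owned by this region.
reports = [("r1", (0, 0)), ("r2", (1, 1)), ("r3", (2, 0))]
print(coordination_round(reports, [(0, 5), (5, 5)]))
```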
Dynamic Distributed Coordination Rule 1: If only 1 robot, keep it.
Dynamic Distributed Coordination Rule 2: If not on the curve and no neighbors on the curve: distribute evenly all but one.
Dynamic Distributed Coordination Rule 3: If not on the curve: distribute among less populated neighbors on the curve.
Dynamic Distributed Coordination Rule 4: If on the curve: distribute among less dense neighbors on the curve.
Dynamic Distributed Coordination Rule 5: Distribute robots evenly on the curve in each region.
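The five rules above can be read as a per-round decision procedure for each region's coordinator. The sketch below is a rough, simplified rendering under assumed interfaces (`on_curve`, `density`); it is not the exact algorithm from the papers.

```python
# Simplified rendering of Rules 1-4: decide how many robots this region
# sends to each neighbor this round. on_curve and density are assumed to be
# supplied by the virtual infrastructure layer. Rule 5 (even spacing of the
# robots that stay) is handled separately when targets are assigned.

def redistribute(region, num_robots, neighbors, on_curve, density):
    out = {}                                   # neighbor -> robots to send
    if num_robots <= 1:
        return out                             # Rule 1: a lone robot stays put

    curve_nbrs = [n for n in neighbors if on_curve(n)]
    movable = num_robots - 1                   # always keep at least one robot

    if not on_curve(region):
        if not curve_nbrs:
            # Rule 2: off the curve, no curve neighbors:
            # spread all but one robot evenly over all neighbors.
            for i, n in enumerate(neighbors):
                out[n] = movable // len(neighbors) + (1 if i < movable % len(neighbors) else 0)
        else:
            # Rule 3: off the curve: send robots to the less populated
            # neighbors that are on the curve (least dense first).
            targets = sorted(curve_nbrs, key=density)
            for i in range(movable):
                n = targets[i % len(targets)]
                out[n] = out.get(n, 0) + 1
    else:
        # Rule 4: on the curve: shed one robot toward each curve neighbor
        # that is strictly less dense than this region (simplified).
        for n in curve_nbrs:
            if density(n) < density(region) and sum(out.values()) < movable:
                out[n] = 1
    return out

# Example: a region off the curve with 4 robots and two curve neighbors.
print(redistribute("A", 4, ["B", "C"],
                   on_curve=lambda r: r != "A",
                   density={"A": 4, "B": 1, "C": 2}.get))
```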
Dynamic Distributed Coordination Step 1: Eventually, robots cease moving from regions “off the curve” to regions “on the curve”. Step 2: If neighbor g is the most dense neighbor of u after time t, then u is less dense than g after time t+1. Step 3: Eventually, robots remain always in the same region.
Dynamic Distributed Coordination. What happens when something goes wrong? Too many lost messages or too much churn lead to INCONSISTENT REPLICAS. Option 1: Design for the very, very worst case. Option 2: Design a system that can recover from faults.
Emulating Virtual Infrastructure Leader Election: • Heartbeats, timeouts • Resolve leader competitions Replica Consistency: • Leader sends “checksums” of the state. • If out-of-synch, then re-join.
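A minimal sketch of these two mechanisms under a simple timeout model; the constants, class, and method names are invented for illustration and are not taken from the actual emulator.

```python
# Sketch of the emulator's fault handling: heartbeat/timeout leader election
# plus checksum-based replica consistency. All names and constants here are
# illustrative assumptions.

import hashlib
import json

HEARTBEAT_TIMEOUT = 3.0   # seconds of silence before contesting leadership

def state_checksum(state):
    # Digest of the replicated state; the leader broadcasts this periodically.
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

class Replica:
    def __init__(self, device_id):
        self.device_id = device_id
        self.state = {}
        self.leader = None
        self.last_heartbeat = 0.0

    def on_heartbeat(self, leader_id, checksum, now):
        self.leader = leader_id
        self.last_heartbeat = now
        if checksum != state_checksum(self.state):
            self.rejoin()                       # out of sync: re-join

    def maybe_contest_leadership(self, now):
        # If the leader has been silent too long, volunteer to lead;
        # competing volunteers are resolved by the election protocol (elided).
        if now - self.last_heartbeat > HEARTBEAT_TIMEOUT:
            self.leader = self.device_id

    def rejoin(self):
        self.state = {}                         # request a fresh state transfer (elided)
```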
Building Virtual Infrastructure Assume that: • A is a self-stabilizing algorithm. • A is designed for the virtual infrastructure abstraction. • A is executed with the emulator. • The system begins in an arbitrary (corrupt) state. Then if the system is eventually well-behaved: • From some point on, the state of A is as if it had really executed on a fixed infrastructure.
Dynamic Distributed Coordination • Coordination algorithm is self-stabilizing. • In each round, all state is recalculated. • Underlying virtual infrastructure emulation is self-stabilizing. • Implications: • Converges to changing curve. • Recovers from network instability, lost messages, etc.
Dynamic Distributed Coordination Tina Nolte Virtual Stationary Timed Automata for Mobile Networks PhD 2008
Dynamic Distributed Coordination Free Flight • No flight plan, no control towers! • Each pilot chooses a route independently. • More efficient: • Adapt to wind currents. • Avoid turbulence / bad weather.
Dynamic Distributed Coordination Goal: Free Flight • Each pilot chooses a route independently. • More efficient: • Adapt to wind currents. • Avoid turbulence / bad weather. In the USA, minimum separation: 3 miles lateral distance OR 1000 feet altitude
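As a small worked example, the stated separation minima can be checked with a simple predicate; the coordinate representation below is an illustrative assumption.

```python
# Check the stated US separation minima: 3 miles lateral OR 1000 feet vertical.
# Aircraft positions are assumed to be (x_miles, y_miles, altitude_feet).

import math

MIN_LATERAL_MILES = 3.0
MIN_VERTICAL_FEET = 1000.0

def separated(a, b):
    lateral = math.hypot(a[0] - b[0], a[1] - b[1])
    vertical = abs(a[2] - b[2])
    return lateral >= MIN_LATERAL_MILES or vertical >= MIN_VERTICAL_FEET

# Only 2 miles apart laterally, but 2000 feet apart vertically -> legal.
print(separated((0.0, 0.0, 31000.0), (2.0, 0.0, 33000.0)))   # True
# 2 miles apart laterally and only 500 feet apart vertically -> violation.
print(separated((0.0, 0.0, 31000.0), (2.0, 0.0, 31500.0)))   # False
```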
Dynamic Distributed Coordination Matthew D. Brown Air Traffic Control Using Virtual Stationary Automata MEng, 2008
Today’s Questions • What is virtual infrastructure? • What can you do with it? • Dynamic distributed coordination. • Does it really work? • Two simulation studies.
Simulating Virtual Infrastructure
Study #1: • Routing / Geocast • Custom-built simulator (Python) • Simple communication model
Study #2: • Address allocation (i.e., DHCP) • ns2 simulator • 802.11 MAC layer
GeoCast [figure: Source → Destination]
Location Service [figure: the Target geocasts its location to the regions hash(id, 1) and hash(id, 2)]
Location Service [figure: the Source geocasts a location query to hash(id, 1) / hash(id, 2) to find the Target]
Routing: a two-step process. (1) Look up the destination's location. (2) Geocast the message to the destination's region.
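A rough sketch of the two steps in Python, tying them to the hashed home regions from the location-service slides; the hashing scheme and the `geocast` / `lookup_location` primitives are assumptions, not the project's actual API.

```python
# Two-step routing sketch: (1) look up the destination's current region via
# its hashed home regions, (2) geocast the payload into that region.
# home_region, geocast, and lookup_location are illustrative stand-ins.

import hashlib

def home_region(dest_id, k, num_regions):
    # Deterministically map (id, k) to a region index, in the spirit of
    # the hash(id, 1), hash(id, 2) regions on the location-service slides.
    digest = hashlib.sha256(f"{dest_id}:{k}".encode()).hexdigest()
    return int(digest, 16) % num_regions

def route(msg, dest_id, num_regions, geocast, lookup_location):
    # Step 1: query one of the destination's home regions for its location.
    region = home_region(dest_id, 1, num_regions)
    dest_region = lookup_location(region, dest_id)   # geocast query + reply (elided)
    # Step 2: geocast the message into the destination's current region.
    geocast(dest_region, msg)

# Example with stubbed-out network primitives.
route("hello", "nodeA", 64,
      geocast=lambda region, m: print("geocast", repr(m), "to region", region),
      lookup_location=lambda region, d: 42)
```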
Simulation Setup [figure: 400 m × 400 m field, 250 m] • Number of devices: 25 / 50 / 100 • Velocity: 0-20 meters / second • Mobility model: random waypoint, pause time 100-900 s • Simulation time: 1000 seconds
Simulation Setup [figure: 400 m × 400 m field, 250 m] • GeoCast: 10 send/receive pairs, 1 msg every 5 secs • Routing: 10 send/receive pairs, 1 msg every 0.5 secs, 15 second simulation
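For reference, the setup above collected into one Python dictionary of the kind the custom simulator from Study #1 might consume; the key names are hypothetical, and the meaning of the 250 m label on the figure is not stated on the slide.

```python
# Simulation parameters from the two setup slides, gathered in one place.
# Key names are hypothetical; they are not taken from the actual simulator.
SIMULATION_SETUP = {
    "field_m": (400, 400),              # deployment area from the figure
    "figure_label_250_m": 250,          # 250 m label on the figure (region size or radio range)
    "num_devices": [25, 50, 100],
    "velocity_mps": (0, 20),
    "mobility_model": "random waypoint",
    "pause_time_s": (100, 900),
    "simulation_time_s": 1000,
    "geocast_workload": {"send_receive_pairs": 10, "msg_interval_s": 5},
    "routing_workload": {"send_receive_pairs": 10, "msg_interval_s": 0.5, "sim_time_s": 15},
}
```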
Mobility and Density [plot: percent of time non-failed (20%-100%) vs. pause time (200-800 s), for 25, 50, and 100 devices]. When density is sufficient, virtual nodes work.
Leadership Changes [plot: leadership changes per region vs. pause time (200-800 s), 100 devices]. There is continuous turnover in the leader.
Message Overhead [plot: messages per region per second vs. pause time (200-800 s), for heartbeat, join, and leader messages]. Most overhead is heartbeats. (Overhead is negligible.)