Reliable Transport and Code Distribution in Wireless Sensor Networks Thanos Stathopoulos CS 213 Winter 04
Reliability: Introduction • Not an issue on wired networks • TCP does a good job • Link error rates are usually ~10^-15 • No energy cost concerns • However, WSNs have: • Low-power radios • Error rates of up to 30% or more • Limited range • Energy constraints • Retransmissions reduce the lifetime of the network • Limited storage • Buffer size cannot be too large • Highly application-specific requirements • No ‘single’ TCP-like solution
Approaches • Loss-tolerant algorithms • Leverage spatial and temporal redundancy • Good enough for some applications • But what about code updates? • Add retransmission mechanism • At the link layer (e.g. SMAC) • At the routing/transport layer • At the application layer • Hop-by-hop or end-to-end?
Relevant papers • PSFQ: A Reliable Transport Protocol for Wireless Sensor Networks • RMST: Reliable Data Transport in Sensor Networks • ESRT: Event-to-Sink Reliable Transport in Wireless Sensor Networks
PSFQ: Overview • Key ideas • Slow data distribution (pump slowly) • Quick error recovery (fetch quickly) • NACK-based • Data caching guarantees ordered delivery • Assumption: no congestion, losses due only to poor link quality • Goals • Ensure data delivery with minimum support from transport infrastructure • Minimize signaling overhead for detection/recovery operations • Operate correctly in poor link quality environments • Provide loose delay bounds for data delivery to all intended receivers • Operations • Pump • Fetch • Report
End-to-end considered harmful? • Probability of reception degrades exponentially over multiple hops • Not an issue in the Internet • Serious problem if error rates are considerable • ACKs/NACKs are also affected
Proposed solution: Hop-by-Hop error recovery • Intermediate nodes now responsible for error detection and recovery • NACK-based loss detection probability is now constant • Not affected by network size (scalability) • Exponential decrease in end-to-end • Cost: Keeping state on each node • Potentially not as bad as it sounds! • Cluster/group based communication • Intermediates are usually receivers as well
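To make the “exponential decrease” concrete, here is a small standalone C sketch (not from the paper) that compares the end-to-end success probability over n lossy hops with the constant per-hop probability that hop-by-hop recovery relies on; loss rates and hop counts are example values.

```c
/* Illustrative only: end-to-end delivery probability decays exponentially
 * with hop count, while hop-by-hop recovery only ever faces the (constant)
 * per-hop probability. Loss rates and hop counts are example values. */
#include <stdio.h>
#include <math.h>

int main(void)
{
    double loss_rates[] = { 0.05, 0.11, 0.30 };   /* per-link packet loss   */
    int hops[] = { 1, 4, 8, 12 };

    for (int i = 0; i < 3; i++) {
        double p_hop = 1.0 - loss_rates[i];       /* per-hop success prob.  */
        for (int j = 0; j < 4; j++) {
            double p_e2e = pow(p_hop, hops[j]);   /* success over n hops    */
            printf("loss=%2.0f%%  hops=%2d  per-hop=%.2f  end-to-end=%.3f\n",
                   loss_rates[i] * 100.0, hops[j], p_hop, p_e2e);
        }
    }
    return 0;
}
```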
Pump operation • Node broadcasts a packet to its neighbors every Tmin • Data cache used for duplicate suppression • Receiver checks for gaps in sequence numbers • If all is fine, it decrements TTL and schedules a transmission • Tmin < Ttransmit < Tmax • By delaying transmission, quick fetch operations are possible • Reduce redundant transmissions (don’t transmit if 4 or more nodes have forwarded the packet already) • Tmax can provide a loose delay bound for the last hop • D(n)=Tmax * (# of fragments) * (# of hops)
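The forwarding decision can be summarized in a few lines. The sketch below is a hypothetical rendering of the pump rules above (random forwarding delay in (Tmin, Tmax), suppression after 4 overheard forwards, and the loose delay bound D(n)); the constants and function names are mine, not PSFQ's implementation.

```c
/* Hypothetical sketch of PSFQ's pump-side forwarding decision and the loose
 * delay bound D(n). Constants and names are illustrative. */
#include <stdio.h>
#include <stdlib.h>

#define T_MIN_MS   100
#define T_MAX_MS   300
#define SUPPRESS_THRESHOLD 4   /* don't forward if >= 4 neighbors already did */

/* Pick a random forwarding delay in (T_MIN, T_MAX), or -1 to hold the packet. */
static int schedule_forward(int ttl, int dup_count, int gap_detected)
{
    if (gap_detected || ttl <= 0 || dup_count >= SUPPRESS_THRESHOLD)
        return -1;  /* fetch first, TTL expired, or transmission suppressed */
    return T_MIN_MS + rand() % (T_MAX_MS - T_MIN_MS);
}

/* Loose delay bound for full delivery: D(n) = Tmax * fragments * hops. */
static double delay_bound_ms(int fragments, int hops)
{
    return (double)T_MAX_MS * fragments * hops;
}

int main(void)
{
    printf("forward in %d ms\n", schedule_forward(5, 1, 0));
    printf("delay bound, 100 fragments over 6 hops: %.1f s\n",
           delay_bound_ms(100, 6) / 1000.0);
    return 0;
}
```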
Fetch operation • Sequence number gap is detected • Node will send a NACK message upstream • ‘Window’ specifies range of sequence numbers missing • NACK receivers will randomize their transmissions to reduce redundancy • It will NOT forward any packets downstream • NACK scope is 1 hop • NACKs are generated every Tr if there are still gaps • Tr < Tmax • This is the pump/fetch ratio • NACKs can be cancelled if neighbors have sent similar NACKs
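A minimal sketch of the gap-detection step that triggers a fetch, assuming a simple "last in-order sequence number" state variable; the structure and function names are illustrative, not PSFQ's code.

```c
/* Turn a sequence-number gap into a NACK "window": the range of missing
 * fragments between the last in-order fragment and the one just received. */
#include <stdio.h>

struct nack_window {
    int left;    /* first missing sequence number */
    int right;   /* last missing sequence number  */
};

/* Returns 1 and fills 'w' if a gap is found, 0 if the packet is in order. */
static int detect_gap(int last_in_order, int received_seq, struct nack_window *w)
{
    if (received_seq <= last_in_order + 1)
        return 0;                       /* in order or duplicate: no NACK */
    w->left  = last_in_order + 1;
    w->right = received_seq - 1;
    return 1;
}

int main(void)
{
    struct nack_window w;
    if (detect_gap(7, 12, &w))          /* received 12, had everything up to 7 */
        printf("NACK window: fragments %d-%d\n", w.left, w.right);
    return 0;
}
```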
Proactive Fetch • Last segments of a file can get lost • Loss detection impossible; no ‘next’ segment exists! • Solution: timeouts (again) • Node enters ‘proactive fetch’ mode if last segment hasn’t been received and no packet has been delivered after Tpro • Timing must be right • Too early: wasted control messages • Too late: increased delivery latency for the entire file • Tpro = a * (Smax - Smin) * Tmax • A node will wait long enough until all upstream nodes have received all segments • If data cache isn’t infinite • Tpro = a * k * Tmax (Tpro is proportional to cache size)
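The two timeout formulas above are easy to sanity-check numerically. The sketch below simply evaluates them using the slide's notation (a, Smax, Smin, k, Tmax); the numeric values are examples.

```c
/* Back-of-the-envelope evaluation of the proactive-fetch timeout. */
#include <stdio.h>

#define T_MAX_S 0.3

/* Tpro = a * (Smax - Smin) * Tmax: wait long enough for upstream nodes to
 * have received the remaining segments before asking for them. */
static double tpro_full_cache(double a, int s_max, int s_min)
{
    return a * (s_max - s_min) * T_MAX_S;
}

/* With a bounded data cache of k segments, Tpro is proportional to k instead. */
static double tpro_bounded_cache(double a, int k)
{
    return a * k * T_MAX_S;
}

int main(void)
{
    printf("Tpro (full cache, 100 segments, 60 received): %.1f s\n",
           tpro_full_cache(1.0, 100, 60));
    printf("Tpro (cache of 10 segments): %.1f s\n",
           tpro_bounded_cache(1.0, 10));
    return 0;
}
```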
Report Operation • Used as a feedback/monitoring mechanism • Only the last hop will respond immediately (create a new packet) • Other nodes will piggyback their state info when they receive the report reply • If there is no space left in the message, a new one will be created
Experimental results • Tmax = 0.3s, Tr = 0.1s • 100 30-byte packets sent • Exponential increase in delay happens at 11% loss rate or higher
PSFQ: Conclusion • Slow data dissemination, fast data recovery • All transmissions are broadcast • NACK-based, hop-by-hop recovery • End-to-end behaves poorly in lossy environments • NACKs are superior to ACKs in terms of energy savings • No out-of-order delivery allowed • Uses data caching extensively • Several timers and duplicate suppression mechanisms • Implementing any of those on motes is challenging (non-preemptive FIFO scheduler)
RMST: Overview • A transport layer protocol • Uses diffusion for routing • Selective NACK-based • Provides • Guaranteed delivery of all fragments • In-order delivery not guaranteed • Fragmentation/reassembly
Placement of reliability for data transport • RMST considers 3 layers • MAC • Transport • Application • Focus is on MAC and Transport
MAC Layer Choices • No ARQ • All transmissions are broadcast • No RTS/CTS or ACK • Reliability deferred to upper layers • Benefits: no control overhead, no erroneous path selection • ARQ always • All transmissions are unicast • RTS/CTS and ACKs used • One-to-many communication done via multiple unicasts • Benefits: packets traveling on established paths have high probability of delivery • Selective ARQ • Use broadcast for one-to-many and unicast for one-to-one • Data and control packets traveling on established paths are unicast • Route discovery uses broadcast
Transport Layer Choices • End-to-End Selective Request NACK • Loss detection happens only at sinks (endpoints) • Repair requests travel on reverse (multihop) path from sinks to sources • Hop-by-Hop Selective Request NACK • Each node along the path caches data • Loss detection happens at each node along the path • Repair requests sent to immediate neighbors • If data isn’t found in the caches, NACKs are forwarded to next hop towards source
Application Layer Choices • End-to-End Positive ACK • Sink requests a large data entity • Source fragments data • Sink keeps sending interests until all fragments have been received • Used only as a baseline
RMST details • Implemented as a Diffusion Filter • Takes advantage of Diffusion mechanisms for • Routing • Path recovery and repair • Adds • Fragmentation/reassembly management • Guaranteed delivery • Receivers responsible for fragment retransmission • Receivers aren’t necessarily end points • Caching or non-caching mode determines classification of node
RMST Details (cont’d) • NACKs triggered by • Sequence number gaps • Watchdog timer inspects fragment map periodically for holes that have aged for too long • Transmission timeouts • ‘Last fragment’ problem • NACKs propagate from sinks to sources • Unicast transmission • NACK is forwarded only if segment not found in local cache • Back-channel required to deliver NACKs to upstream neighbors
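A hypothetical version of the watchdog pass described above: scan the fragment map for holes and NACK only those that have aged past a threshold. RMST is actually implemented as a diffusion filter, so this is only a structural sketch; the names and the aging threshold are assumptions.

```c
/* Watchdog pass: find holes in the fragment map and NACK aged ones. */
#include <stdio.h>
#include <time.h>

#define NUM_FRAGMENTS 50
#define HOLE_MAX_AGE  5                 /* seconds a hole may age before a NACK */

static int  received[NUM_FRAGMENTS];            /* 1 if fragment is cached       */
static long first_missing_seen[NUM_FRAGMENTS];  /* when the hole was first noticed */

static void watchdog_scan(long now)
{
    for (int i = 0; i < NUM_FRAGMENTS; i++) {
        if (received[i])
            continue;
        if (first_missing_seen[i] == 0)
            first_missing_seen[i] = now;         /* start aging the hole */
        else if (now - first_missing_seen[i] >= HOLE_MAX_AGE)
            printf("NACK fragment %d (aged %lds)\n",
                   i, now - first_missing_seen[i]);
    }
}

int main(void)
{
    long now = (long)time(NULL);
    for (int i = 0; i < NUM_FRAGMENTS; i++)
        received[i] = (i != 17 && i != 18);      /* two example holes */
    watchdog_scan(now);                 /* first pass: holes start aging     */
    watchdog_scan(now + HOLE_MAX_AGE);  /* later pass: aged holes get NACKed */
    return 0;
}
```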
Evaluation • NS-2 simulation • 802.11 MAC • 21 nodes • single sink, single source • 6 hops • MAC ARQ set to 4 retries • Image size: 5k • 50 100-byte fragments • Total cost of sending the entire file: 87,818 bytes • Includes diffusion control message overhead • All results normalized to this value
Results: Baseline (no RMST) • ARQ and S-ARQ have high overhead when error rates are low • S-ARQ is better in terms of efficiency • Also helps with route selection • No ARQ results drop considerably as error rates increase • Exponential decay of end-to-end reliability mechanisms
Results: RMST with H-b-H Recovery and Caching • Slight improvement for ARQ and S-ARQ results over baseline • No ARQ is better even in the 10% error rate case • But, many more exploratory packets were sent before the route was established
Results: RMST with E-2-E Recovery • No ARQ doesn't work for the 10% error rate case • Numerous holes that required NACKs couldn't make it from source to sink without link-layer retransmissions • ARQ and S-ARQ results are statistically indistinguishable from the H-b-H results • NACKs were very rare when any form of ARQ was used
Results: Performance under High Error Rates • No ARQ doesn’t work for the 30% error rate case • Diffusion control messages could not establish routes most of the time • In the 20% case, it took several minutes to establish routes
RMST: Conclusion • ARQ helps with unicast control and data packets • In high error-rate environments, routes cannot be established without ARQ • Route discovery packets shouldn’t use ARQ • Erroneous path selection can occur • RMST combines a NACK-based transport layer protocol with S-ARQ to achieve the best results
Congestion Control • Sensor networks are usually idle… • …Until an event occurs • High probability of channel overload • Information must reach users • Solution: congestion control
ESRT: Overview • Places interest on events, not individual pieces of data • Application-driven • Application defines what its desired event reporting rate should be • Includes a congestion-control element • Runs mainly on the sink • Main goal: Adjust reporting rate of sources to achieve optimal reliability requirements
Problem Definition • Assumption: • Detection of an event is related to number of packets received during a specific interval • Observed event reliability ri: • # of packets received in decision interval I • Desired event reliability R: • # of packets required for reliable event detection • Application-specific • Goal: configure the reporting rate of nodes • Achieve required event detection • Minimize energy consumption
Reliability vs Reporting frequency • Initially, reliability increases linearly with reporting frequency • There is an optimal reporting frequency (fmax), after which congestion occurs • fmax decreases when the # of nodes increases
Characteristic Regions • n: normalized reliability indicator (n = r / R) • (NC, LR): No congestion, Low reliability • f <= fmax, n < 1-e • (NC, HR): No congestion, High reliability • f <= fmax, n > 1+e • (C, HR): Congestion, High reliability • f > fmax, n > 1 • (C, LR): Congestion, Low reliability • f > fmax, n <= 1 • OOR: Optimal Operating Region • f <= fmax, 1-e <= n <= 1+e
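Putting the region boundaries into code makes the decision procedure explicit. The sketch below classifies an (f, n) pair using the slide's notation (fmax, e); it is an illustration of the five regions, not the paper's implementation.

```c
/* Classify the current (reporting frequency, normalized reliability) pair. */
#include <stdio.h>

enum region { NC_LR, NC_HR, C_HR, C_LR, OOR };

static enum region classify(double f, double fmax, double eta, double eps)
{
    int congested = (f > fmax);
    if (!congested && eta >= 1.0 - eps && eta <= 1.0 + eps)
        return OOR;                              /* optimal operating region */
    if (!congested)
        return (eta < 1.0 - eps) ? NC_LR : NC_HR;
    return (eta > 1.0) ? C_HR : C_LR;
}

int main(void)
{
    const char *names[] = { "(NC,LR)", "(NC,HR)", "(C,HR)", "(C,LR)", "OOR" };
    printf("%s\n", names[classify(10.0, 20.0, 0.6, 0.1)]);  /* (NC,LR) */
    printf("%s\n", names[classify(25.0, 20.0, 0.8, 0.1)]);  /* (C,LR)  */
    return 0;
}
```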
ESRT Requirements • Sink is powerful enough to reach all source nodes (i.e. single-hop) • Nodes must listen to the sink broadcast at the end of each decision interval and update their reporting rates • A congestion-detection mechanism is required
Congestion Detection and Reliability Level • Both done at the sink • Congestion: • Nodes monitor their buffer queues and inform the sink if overflow occurs • Reliability Level • Calculated by the sink at the end of each interval based on packets received
ESRT Protocol Operation • (NC, LR): increase the reporting frequency aggressively (multiplicatively) until the required reliability is reached • (NC, HR): decrease the reporting frequency cautiously to save energy while staying above the required reliability • (C, HR): decrease the reporting frequency to relieve congestion without dropping below the required reliability • (C, LR): decrease the reporting frequency aggressively (exponentially) to recover from congestion as quickly as possible
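A simplified sink-side control loop tying the regions to rate updates is sketched below. The qualitative policy (push the rate up when uncongested and unreliable, back off hard when congested and unreliable, hold in OOR) follows ESRT, but the specific update expressions here should be treated as illustrative approximations rather than the paper's exact equations.

```c
/* Simplified ESRT-style rate update at the sink; the formulas are
 * illustrative approximations of the policy, not the paper's equations. */
#include <stdio.h>
#include <math.h>

static double next_rate(double f, double fmax, double eta, double eps, int k)
{
    if (f <= fmax && eta < 1.0 - eps)        /* (NC,LR): push the rate up      */
        return f / eta;
    if (f <= fmax && eta > 1.0 + eps)        /* (NC,HR): ease off, save energy */
        return (f / 2.0) * (1.0 + 1.0 / eta);
    if (f > fmax && eta > 1.0)               /* (C,HR): relieve congestion     */
        return f / eta;
    if (f > fmax)                            /* (C,LR): back off aggressively  */
        return pow(f, eta / k);              /* k = consecutive (C,LR) rounds  */
    return f;                                /* OOR: hold the rate             */
}

int main(void)
{
    printf("new f = %.2f\n", next_rate(10.0, 20.0, 0.5, 0.1, 1)); /* 20.00 */
    printf("new f = %.2f\n", next_rate(25.0, 20.0, 0.8, 0.1, 1)); /* 13.13 */
    return 0;
}
```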
ESRT: Conclusion • Reliability notion is application-based • No delivery guarantees for individual packets • Reliability and congestion control achieved by changing the reporting rate of nodes • Pushes all complexity to the sink • Single-hop operation only
Code Distribution: Introduction • Nature of sensor networks • Expected to operate for long periods of time • Human intervention impractical or detrimental to sensing process • Nevertheless, code needs to be updated • Add new functionality • Incomplete knowledge of environment • Predicting right set of actions is not always feasible • Fix bugs • Maintenance
Approaches • Transfer the entire binary to the motes • Advantage • Maximum flexibility • Disadvantage • High energy cost due to large volume of data • Use a VM and transfer capsules • Advantage • Low energy cost • Disadvantages • Not as flexible as full binary update • VM required • Reliability is required regardless of approach
Papers • A Remote Code Update Mechanism for Wireless Sensor Networks • Trickle: A Self-Regulating Algorithm for Code Propagation and Maintenance in Wireless Sensor Networks
MOAP: Overview • Code distribution mechanism specifically targeted for Mica2 motes • Full binary updates • Multi-hop operation achieved through recursive single-hop broadcasts • Energy and memory efficient
Requirements and Properties of Code Distribution • The complete image must reach all nodes • Reliability mechanism required • If the image doesn’t fit in a single packet, it must be placed in stable storage until transfer is complete • Network lifetime shouldn’t be significantly reduced by the update operation • Memory and storage requirements should be moderate
Resource Prioritization • Energy: the most important resource • Radio operations are expensive • TX: 12 mA • RX: 4 mA • Stable storage (EEPROM) • The entire image must be stored, and write()s are expensive • Memory usage • Static RAM • Only 4K available on the current generation of motes • The code update mechanism should leave ample space for the real application • Program memory • MOAP must transfer itself • A large image size means more packets transmitted! • Latency • Updates don't respond to real-time phenomena • Update rate is infrequent • Can be traded off for reduced energy usage
Design Choices • Dissemination protocol: How is data propagated? • All at once (flooding) • Fast • Low energy efficiency • Neighborhood-by-neighborhood (ripple) • Energy efficient • Slow • Reliability mechanism • Repair scope: local vs global • ACKs vs NACKs • Segment management • Indexing segments and gap detection: Memory hierarchy vs sliding window
Ripple Dissemination • Transfer data neighborhood-by-neighborhood • Single-hop • Recursively extended to multi-hop • Very few sources at each neighborhood • Preferably, only one • Receivers attempt to become sources when they have the entire image • Publish-subscribe interface prevents nodes from becoming sources if another source is present • Leverage the broadcast medium • If data transmission is in progress, a source will always be one hop away! • Allows local repairs • Increased latency
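The handoff from receiver to source can be captured by a small predicate. The sketch below is a hypothetical rendering of the source-election rule described above (only publish once the full image is held, and stay quiet if another source is already active); the structure and field names are assumptions, not MOAP's code.

```c
/* Ripple handoff: a receiver offers to become a source only once it holds
 * the complete image, and backs off if another source is already present. */
#include <stdio.h>

struct node {
    int segments_held;
    int segments_total;
    int heard_other_source;   /* set when a neighbor's "publish" is overheard */
};

static int should_become_source(const struct node *n)
{
    if (n->segments_held < n->segments_total)
        return 0;             /* image incomplete: keep receiving             */
    if (n->heard_other_source)
        return 0;             /* suppressed: ideally one source per neighborhood */
    return 1;                 /* publish the image to the next neighborhood   */
}

int main(void)
{
    struct node a = { 128, 128, 0 };
    struct node b = { 128, 128, 1 };
    printf("node a becomes source: %d\n", should_become_source(&a));  /* 1 */
    printf("node b becomes source: %d\n", should_become_source(&b));  /* 0 */
    return 0;
}
```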
Reliability Mechanism • Loss responsibility lies on receiver • Only one node to keep track of (sender) • NACK-based • In line with IP multicast and WSN reliability schemes • Local scope • No need to route NACKs • Energy and complexity savings • All nodes will eventually have the same image
Retransmission Policies • Broadcast RREQ, no suppression • Simple • High probability of successful reception • Highly inefficient • Zero latency • Broadcast RREQ, suppression based on randomized timers • Quite efficient • Complex • Latency and successful reception based on randomization interval
Retransmission Policies (cont’d) • Broadcast RREQ, fixed reply probability • Simple • Good probability of successful reception • Latency depends on probability of reply • Average efficiency • Broadcast RREQ, adaptive reply probability • More complex than the static case • Similar latency/reception behavior • Unicast RREQ, single reply • Smallest probability of successful reception • Highest efficiency • Simple • Complexity increases if source fails • Zero latency • High latency if source fails
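As an example of the "suppression based on randomized timers" policy from the previous slide, the sketch below schedules a randomly delayed reply to an overheard retransmission request (RREQ) and cancels it if another node answers first; the delay bound and function names are illustrative assumptions.

```c
/* Randomized reply suppression for a broadcast retransmission request. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX_REPLY_DELAY_MS 200

/* Called when an RREQ for a segment is overheard. */
static int schedule_reply(int have_segment)
{
    if (!have_segment)
        return -1;                          /* nothing to offer          */
    return rand() % MAX_REPLY_DELAY_MS;     /* wait before replying      */
}

/* Called when another node's reply to the same RREQ is overheard. */
static int suppress_reply(int pending_delay_ms)
{
    (void)pending_delay_ms;
    return -1;                              /* cancel our pending reply  */
}

int main(void)
{
    srand((unsigned)time(NULL));
    int delay = schedule_reply(1);
    printf("reply scheduled in %d ms\n", delay);
    delay = suppress_reply(delay);          /* a neighbor answered first */
    printf("reply after suppression: %d\n", delay);
    return 0;
}
```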
Segment Management: Discovering if a segment is present • No indexing • Nothing kept in RAM • Need to read from EEPROM to find if segment i is missing • Full indexing • Entire segment (bit)map is kept in RAM • Look at entry i (in RAM) to find if segment is missing • Partial indexing • Map kept in RAM • Each entry represents k consecutive segments • Combination of RAM and EEPROM lookup needed to find if segment i is missing
Segment Management (cont’d) • Hierarchical full indexing • First-level map kept in RAM • Each entry points to a second-level map stored in EEPROM • Combination of RAM and EEPROM lookup needed to find if segment i is missing • Sliding window • Bitmap of up to w segments kept in RAM • Starting point: last segment received in order • RAM lookup • Limited out-of-order tolerance!
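A minimal sketch of the sliding-window bookkeeping, assuming a window of W segments anchored at the last segment received in order; it also shows the "limited out-of-order tolerance": segments beyond the window are simply rejected. Names and the window size are illustrative.

```c
/* Sliding-window segment bitmap kept entirely in RAM. */
#include <stdio.h>
#include <string.h>

#define W 8                      /* window size (bits kept in RAM) */

struct seg_window {
    int base;                    /* next expected in-order segment      */
    unsigned char bitmap[W];     /* bitmap[i] covers segment base + i   */
};

/* Returns 1 if the segment was accepted, 0 if it falls outside the window. */
static int mark_received(struct seg_window *w, int seg)
{
    if (seg < w->base || seg >= w->base + W)
        return 0;                           /* out of order beyond tolerance */
    w->bitmap[seg - w->base] = 1;
    while (w->bitmap[0]) {                  /* slide forward past in-order data */
        memmove(w->bitmap, w->bitmap + 1, W - 1);
        w->bitmap[W - 1] = 0;
        w->base++;
    }
    return 1;
}

int main(void)
{
    struct seg_window w = { 0, { 0 } };
    int a = mark_received(&w, 0);    /* in order: accepted, window slides    */
    int b = mark_received(&w, 3);    /* within window: accepted out of order */
    int c = mark_received(&w, 20);   /* beyond the window: rejected          */
    printf("seg 0: %d, seg 3: %d, seg 20: %d, base now %d\n", a, b, c, w.base);
    return 0;
}
```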