Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing. Ben Y. Zhao, John Kubiatowicz, and Anthony D. Joseph. Computer Science Division, University of California, Berkeley. Presenter: Chunyuan Liao, March 6, 2002
Outline • Challenges • System overview • Operations, key issues & solutions • Route • Locate • Publish • Insert • Delete • Move • Evaluation & Conclusion • Implementation • Summary & Comments
Project background • Driving force: ubiquitous computing • OceanStore, a data utility infrastructure • Goals: • Build on the current, untrusted infrastructure • Achieve nomadic data: anytime, anywhere • Be highly scalable, reliable and fault-tolerant • Basic issues: • Data location • Routing
Challenges • How to achieve naming, location and routing in a complex and chaotic computing environment • Dynamic nature • Mobile and replicated data & services • Complex interaction between components, even while in motion • Traditional approaches fail to address this extremely dynamic nature
Tapestry : An infrastructure for fault-tolerant wide-area location and routing • An overlay location & routing infrastructure built on top of IP • Features • Highly scalable: decentralized, point-to-point, self-organizing • Highly fault-tolerant: redundancy, adaptation • Good locality: content-based routing & location • Highly durable
Basic Model of Tapestry • Originated in the Plaxton scheme • Basic components: • Nodes: servers, routers, clients • Objects: data or services • Links: point-to-point links
Operations in Tapestry • Naming • Routing • Object location • Publishing objects • Inserting/deleting objects • Mobile objects
Tapestry - Naming • Node ID / Object ID • A fixed-length bit string (4 bits per digit), e.g., 84F8, 9098 • Global • Randomly generated • Location-independent • Evenly distributed • Not unique (shared by replicas)
Routing : Rules • Suffix matching (similar to Plaxton) • Route incrementally, digit by digit. Example: a message from B4F8 to 4598 travels B4F8 → 9098 → 7598 → 4598, matching one more trailing digit at each hop (nodes such as 6789 and B437 lie off this path) • Maximum hops: log_b(N)
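To make the suffix-matching rule concrete, here is a minimal Python sketch (shared_suffix_len is our own helper, not from the paper) that replays the example route above:

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Count how many trailing digits two IDs have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

# The example route: each hop matches one more trailing digit of 4598.
dest = "4598"
path = ["B4F8", "9098", "7598", "4598"]
for hop, node in enumerate(path):
    assert shared_suffix_len(node, dest) == hop + 1
    print(f"hop {hop}: {node} matches last {hop + 1} digit(s) of {dest}")
```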
Routing : Neighbor maps • A table with b·log_b(N) entries • Level-i neighbors share an (i-1)-digit suffix • Entry(i, j): a pointer to the closest neighbor whose ID ends in digit "j" followed by the shared (i-1)-digit suffix • Secondary neighbors per entry • Back pointers create bi-directional links (a data-structure sketch follows)
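A possible shape for the neighbor map, sketched under the slide's parameters (hex digits, 4-digit IDs); make_neighbor_map and next_hop are hypothetical names, not the paper's API:

```python
BASE, DIGITS = 16, 4   # hex digits; 4-digit IDs -> 4 routing levels

def make_neighbor_map():
    """b * log_b(N) entries: DIGITS levels, BASE entries per level.
    Each entry is a list: [primary neighbor, secondary neighbors...]."""
    return [[[] for _ in range(BASE)] for _ in range(DIGITS)]

def shared_suffix_len(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop(neighbor_map, my_id, dest):
    """Entry(i, j): neighbor whose ID ends in digit j plus our matched
    suffix.  (0-based level i here is the slide's 1-based level i+1.)"""
    i = shared_suffix_len(my_id, dest)   # trailing digits matched so far
    if i == DIGITS:
        return None                      # we are the destination
    j = int(dest[-1 - i], 16)            # next digit of dest to match
    entry = neighbor_map[i][j]
    return entry[0] if entry else None   # primary, if the entry is filled

nm = make_neighbor_map()
nm[1][9].append("9098")                  # a level-2 neighbor ending in "98"
print(next_hop(nm, "B4F8", "4598"))      # -> 9098
```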
Routing : Fault-tolerance • Detect server/link failure • TCP timeout (ping) • Periodic "heartbeat" messages along back pointers • Resist faults: fail over to secondary neighbors • Recover: probing messages, "second chance" for repaired nodes
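A hedged sketch of the failover idea: keep secondaries per entry and skip any neighbor whose heartbeat is stale (the 30-second timeout is our assumption, not from the slides):

```python
import time

def pick_live_neighbor(entry, last_heartbeat, timeout=30.0):
    """Fail over to a secondary neighbor when the primary's periodic
    'heartbeat' has not been seen within the timeout."""
    now = time.time()
    for neighbor in entry:               # primary first, then secondaries
        if now - last_heartbeat.get(neighbor, 0.0) < timeout:
            return neighbor
    return None                          # every candidate looks dead

beats = {"9098": time.time(), "B437": 0.0}
print(pick_live_neighbor(["B437", "9098"], beats))   # B437 stale -> 9098
```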
Locating : basic procedure • 4-phase locating • Map the object ID to a "virtual" node ID • Route the request to that node • Arrive at the surrogate, the "root" for the object • Get directed to the server. (Diagram: client B4F8 locates object 1234 stored on server B346; the query is surrogate-routed through intermediate nodes 8724, F734 and B234 to the root 6234, which holds the pointer <O:1234, S:B346> and redirects the query to the server.)
Locating : Surrogate Routing (1) • Given clients at different places, how do they all find the same "root"? • Plaxton: 1. Find the node with the maximal matching suffix (stop at an empty entry in the neighbor map) 2. Order the candidates using global knowledge 3. Choose the first one • Tapestry: 1. Go further than Plaxton (choose an alternate, non-empty entry) 2. Stop at a neighbor map where only one non-empty entry remains, pointing to node R 3. R is the root (a sketch of the deterministic entry choice follows)
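One way to realize the deterministic "alternate entry" choice, sketched with a hypothetical surrogate_next_hop; the actual tie-breaking rule in Tapestry may differ in detail:

```python
BASE = 16

def surrogate_next_hop(level_entries, want_digit):
    """If the exact entry is empty, deterministically try the next
    non-empty entry (scanning upward, wrapping around).  Every node
    applies the same rule, so all queries converge on one root."""
    for offset in range(BASE):
        entry = level_entries[(want_digit + offset) % BASE]
        if entry:
            return entry[0]
    return None   # whole level empty: the current node is the root

level = [[] for _ in range(BASE)]
level[0x5] = ["B234"]
print(surrogate_next_hop(level, 0x2))   # entry 2 empty -> falls to 5: B234
```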
Locating : Surrogate Routing (2) • Conclusions: 1. A root can always be found 2. The expected number of extra surrogate-routing hops is 2 • Assumptions: 1. Every node is reachable, which ensures the same routing "patterns" from any starting point 2. Evenly distributed IDs, which ensure fewer and fewer candidate nodes at each level. (Diagram: queries for object 12345 issued from different nodes, e.g., B7645 and B3945, converge through 92145, F3145, B1145 and E1145 to the same root 51145, which stores <O:12345, S:B3467> for server B3467.)
Publishing • Similar to locating • The server sends a message as if locating its own object • The surrogate node found becomes the "root" for the object • The related info, such as <O, S>, is stored there. (Diagram: server B4F8 publishes object 1234; the publish message is surrogate-routed through 8724, F734 and B234 to the root 6234, which stores <O:1234, S:B4F8>.)
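Since publish and locate follow the same route to the root, a toy sketch can share one path function (route_to_root and the in-memory dict are stand-ins, not the real protocol):

```python
pointers_at_root = {}    # object_id -> server_id, kept at each object's root

def publish(object_id, server_id, route_to_root):
    root = route_to_root(object_id)          # pretend-locate our own object
    pointers_at_root[object_id] = server_id  # root records <O, S>
    return root

def locate(object_id, route_to_root):
    route_to_root(object_id)                 # query takes the same path
    return pointers_at_root.get(object_id)   # root redirects to the server

publish("1234", "B4F8", route_to_root=lambda o: "6234")
print(locate("1234", route_to_root=lambda o: "6234"))   # -> B4F8
```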
Locating/Publishing : Fault-tolerance & Locality • Multiple "roots" (better than Plaxton): map the object ID to several roots, so publish/locate can proceed simultaneously • Caching of the 2-tuple <O, S>: clients can pick up <O, S> on the way to the root • An intermediate node may hold several <O, S> entries for the same object; the nearest server is chosen (sketch below)
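A small sketch of the "nearest <O, S> wins" rule at an intermediate node; the server IDs and hop counts below are made up:

```python
def nearest_pointer(pointers, distance_to):
    """A node holding several <O, S> pointers for the same object
    answers with the closest replica."""
    return min(pointers, key=distance_to)

replica_servers = ["B346", "77AB", "9098"]
hops = {"B346": 12, "77AB": 3, "9098": 40}   # hypothetical distances
print(nearest_pointer(replica_servers, hops.get))   # -> 77AB
```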
Insert a new node: basic procedure • Get a node ID • Begin with a "gateway" node G • Pretend to route to the new node's own ID • Establish a nearly optimal neighbor map during this "pseudo-routing" by copying entries and choosing the nearest ones (see the sketch below) • Go back and notify neighbors. (Diagram: new node 1234 joins via gateway B4F8, surrogate-routing through 8724, F734, B234 and 6234.)
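A hedged outline of the insertion walk; route_step and copy_level_entries are hypothetical helpers standing in for the real pseudo-routing machinery:

```python
def insert_node(new_id, gateway, route_step, copy_level_entries):
    """Pseudo-route from the gateway toward new_id's own ID; at the i-th
    hop, copy that node's level-i entries as this node's starting map."""
    neighbor_map, node, level = [], gateway, 0
    while node is not None:
        neighbor_map.append(copy_level_entries(node, level))
        node = route_step(node, new_id, level)   # match one more digit
        level += 1
    # Then: refine each entry to the nearest candidate, notify neighbors.
    return neighbor_map
```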
Delete a node • The simplest operation • Either explicitly notify the neighbors via back pointers • Or rely on soft state: simply stop sending "heartbeat" and republish messages
Maintain System Consistency • Components of a Tapestry node: • Neighbor map • Back pointers • Object-location pointers <object, node> • Hotspot monitor <object, node, freq> • Object store • Maintaining correct state: • Soft state • Proactive explicit updates (a sketch of the per-node state follows)
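The slide's component list maps naturally onto a per-node record; this dataclass uses our own field names, not the paper's:

```python
from dataclasses import dataclass, field

@dataclass
class TapestryNode:
    """State kept by one Tapestry node, per the component list above."""
    node_id: str
    neighbor_map: list = field(default_factory=list)     # routing table
    back_pointers: set = field(default_factory=set)      # who links to us
    object_pointers: dict = field(default_factory=dict)  # object -> node
    hotspot_monitor: dict = field(default_factory=dict)  # (object, node) -> freq
    object_store: dict = field(default_factory=dict)     # locally held data
```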
Soft state • Advantages • Easy to implement • Suited to slowly changing systems • Disadvantages • Trade-off between bandwidth overhead and level of consistency • Not suited to fast-changing systems • Example: the republish traffic for a single server can reach 1400 MB (!) in one interval (see the back-of-envelope below)
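The 1400 MB figure implies arithmetic like the following; both parameter values are guesses, since the slide does not state them:

```python
# Back-of-envelope: republish traffic per soft-state interval.
objects_published = 10_000_000      # assumed objects on one server
bytes_per_republish = 140           # assumed size of one republish message
total_mb = objects_published * bytes_per_republish / 1e6
print(f"republish traffic per interval: {total_mb:.0f} MB")   # 1400 MB
```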
Proactive explicit updates (PEU) • Epoch number: a sequence number for the update rounds • Expanded 3-tuple: <object ID, server ID, LastHopID> • Soft state is kept as a backup of last resort (sketch below)
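A sketch of how epoch numbers let nodes discard stale republishes after a move; the function name and storage are hypothetical:

```python
latest_epoch = {}   # object_id -> newest epoch seen so far

def on_republish(object_id, server_id, last_hop_id, epoch):
    """Accept the expanded 3-tuple only if it is from the newest round."""
    if epoch >= latest_epoch.get(object_id, -1):
        latest_epoch[object_id] = epoch
        return (object_id, server_id, last_hop_id)   # store / forward
    return None                                      # stale round: drop
```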
PEU : Node Mobility. (Diagram: object 123 moves from server A to server B; B republishes (123, B) toward the root, while the stale pointer (123, A) is deleted along the old path, found via the LastHopID field, through nodes C, D, E and F.)
PEU : Recover location pointers. (Diagram: a server's exit notification triggers deletion of the old data along the path and reconstruction of the pointer as (O, S, B) toward the root, across nodes A through F.)
Introspective Optimization : Adapting to a changing environment • Load balancing: a refresher thread periodically pings neighbors and updates neighbor pointers • Hotspots: find the source of heavy traffic, the "hotspot", and publish the desired data near it
Evaluation • Gains • Good locality • Low location latency • High stability • High fault-tolerance • Costs • Bandwidth overhead linear in the number of replicas
Implementation • A packet-level simulator has been implemented in C • Used to support other applications, such as OceanStore and Bayeux, an application-level multicast protocol • Future work • Security issues • Mobile-IP-like functionality
Summary • Urgent need for a new location/routing scheme • Features of Tapestry • Location-independent naming • Integration of location and routing • Content-based routing • Support for dynamic environments: inserting/deleting/moving nodes and objects
Comments and Questions • Paradox or discrepancy? If the underlying IP scales poorly, how can Tapestry achieve high scalability? (Raised here just for discussion!) • What is the relation between IP and Tapestry? Tapestry does not intend to replace IP; it builds a higher-level locating & routing infrastructure on top of it to support content-based operations. • How could we achieve the same goal without IP?