Tapestry : An Infrastructure for Fault-tolerant Wide-area Location and Routing. Ben Y. Zhao, John Kubiatowicz, and Anthony D. Joseph. Computer Science Division, University of California, Berkeley. Presenter: Chunyuan Liao, March 6, 2002
Outline • Challenges • System overview • Operations, key issues & solutions • Route • Locate • Publish • Insert • Delete • Move • Evaluation & Conclusion • Implementation • Summary & Comments
Project background • Driving force: ubiquitous computing • OceanStore, a data utility infrastructure • Goals: • Build on the current, untrusted infrastructure • Achieve nomadic data: anytime, anywhere • Be highly scalable, reliable and fault-tolerant • Basic issues: • Data location • Routing
Challenges • How to achieve naming, location and routing in a complex and chaotic computing environment • Dynamic nature • Mobile and replicated data & services • Complex interaction between components, even while in motion • Traditional approaches fail to address this extremely dynamic nature
Tapestry : An infrastructure for fault-tolerant wide-area location and routing • An overlay location & routing infrastructure built on top of IP • Features • Highly scalable: decentralized, point-to-point, self-organizing • Highly fault-tolerant: redundancy, adaptation • Good locality: content-based routing & location • Highly durable
Basic Model of Tapestry • Originated in the Plaxton scheme • Basic components: • Nodes: servers, routers, clients • Objects: data or services • Links: point-to-point links
Operations in Tapestry • Naming • Routing • Object location • Publishing objects • Inserting/deleting objects • Mobile objects
Tapestry - Naming • Node ID / Object ID • A fixed-length bit string (4 bits per digit), e.g., 84F8, 9098 • Global • Randomly generated • Location-independent • Evenly distributed • Not unique (shared by replicas)
Routing : Rules • Suffix matching (similar to Plaxton) • Route incrementally, digit by digit. Example: a message from B4F8 to 4598 travels B4F8 → 9098 → 7598 → 4598, matching one more trailing digit at each hop (nodes such as 6789 and B437 lie off this path) • Maximum hops: log_b(N)
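To make the suffix-matching rule concrete, here is a minimal Python sketch (shared_suffix_len is our own helper, not from the paper) that replays the example route above:

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Count how many trailing digits two IDs have in common."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

# The example route: each hop matches one more trailing digit of 4598.
dest = "4598"
path = ["B4F8", "9098", "7598", "4598"]
for hop, node in enumerate(path):
    assert shared_suffix_len(node, dest) == hop + 1
    print(f"hop {hop}: {node} matches last {hop + 1} digit(s) of {dest}")
```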
Routing : Neighbor maps • A table with b·log_b(N) entries • Level-i neighbors share an (i-1)-digit suffix • Entry(i, j): a pointer to the closest neighbor whose ID ends in digit "j" followed by the shared (i-1)-digit suffix • Secondary neighbors per entry • Back pointers create bi-directional links (a data-structure sketch follows)
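A possible shape for the neighbor map, sketched under the slide's parameters (hex digits, 4-digit IDs); make_neighbor_map and next_hop are hypothetical names, not the paper's API:

```python
BASE, DIGITS = 16, 4   # hex digits; 4-digit IDs -> 4 routing levels

def make_neighbor_map():
    """b * log_b(N) entries: DIGITS levels, BASE entries per level.
    Each entry is a list: [primary neighbor, secondary neighbors...]."""
    return [[[] for _ in range(BASE)] for _ in range(DIGITS)]

def shared_suffix_len(a, b):
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop(neighbor_map, my_id, dest):
    """Entry(i, j): neighbor whose ID ends in digit j plus our matched
    suffix.  (0-based level i here is the slide's 1-based level i+1.)"""
    i = shared_suffix_len(my_id, dest)   # trailing digits matched so far
    if i == DIGITS:
        return None                      # we are the destination
    j = int(dest[-1 - i], 16)            # next digit of dest to match
    entry = neighbor_map[i][j]
    return entry[0] if entry else None   # primary, if the entry is filled

nm = make_neighbor_map()
nm[1][9].append("9098")                  # a level-2 neighbor ending in "98"
print(next_hop(nm, "B4F8", "4598"))      # -> 9098
```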
Routing : Fault-tolerance • Detect server/link failure • TCP timeout (ping) • Periodic "heartbeat" messages along back pointers • Resist faults: fail over to secondary neighbors • Recover: probing messages, "second chance" for repaired nodes
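A hedged sketch of the failover idea: keep secondaries per entry and skip any neighbor whose heartbeat is stale (the 30-second timeout is our assumption, not from the slides):

```python
import time

def pick_live_neighbor(entry, last_heartbeat, timeout=30.0):
    """Fail over to a secondary neighbor when the primary's periodic
    'heartbeat' has not been seen within the timeout."""
    now = time.time()
    for neighbor in entry:               # primary first, then secondaries
        if now - last_heartbeat.get(neighbor, 0.0) < timeout:
            return neighbor
    return None                          # every candidate looks dead

beats = {"9098": time.time(), "B437": 0.0}
print(pick_live_neighbor(["B437", "9098"], beats))   # B437 stale -> 9098
```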
Locating : basic procedure • 4-phase locating • Map the object ID to a "virtual" node ID • Route the request to that node • Arrive at the surrogate, the "root" for the object • Get directed to the server. (Diagram: client B4F8 locates object 1234 stored on server B346; the query is surrogate-routed through intermediate nodes 8724, F734 and B234 to the root 6234, which holds the pointer <O:1234, S:B346> and redirects the query to the server.)
Locating : Surrogate Routing (1) • Given clients at different places, how do they all find the same "root"? • Plaxton: 1. Find the node with the maximal matching suffix (stop at an empty entry in the neighbor map) 2. Order the candidates using global knowledge 3. Choose the first one • Tapestry: 1. Go further than Plaxton (choose an alternate, non-empty entry) 2. Stop at a neighbor map where only one non-empty entry remains, pointing to node R 3. R is the root (a sketch of the deterministic entry choice follows)
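One way to realize the deterministic "alternate entry" choice, sketched with a hypothetical surrogate_next_hop; the actual tie-breaking rule in Tapestry may differ in detail:

```python
BASE = 16

def surrogate_next_hop(level_entries, want_digit):
    """If the exact entry is empty, deterministically try the next
    non-empty entry (scanning upward, wrapping around).  Every node
    applies the same rule, so all queries converge on one root."""
    for offset in range(BASE):
        entry = level_entries[(want_digit + offset) % BASE]
        if entry:
            return entry[0]
    return None   # whole level empty: the current node is the root

level = [[] for _ in range(BASE)]
level[0x5] = ["B234"]
print(surrogate_next_hop(level, 0x2))   # entry 2 empty -> falls to 5: B234
```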
Locating : Surrogate Routing (2) • Conclusions: 1. A root can always be found 2. The expected number of extra surrogate-routing hops is 2 • Assumptions: 1. Every node is reachable, which ensures the same routing "patterns" from any starting point 2. Evenly distributed IDs, which ensure fewer and fewer candidate nodes at each level. (Diagram: queries for object 12345 issued from different nodes, e.g., B7645 and B3945, converge through 92145, F3145, B1145 and E1145 to the same root 51145, which stores <O:12345, S:B3467> for server B3467.)
Publishing • Similar to locating • The server sends a message as if locating its own object • The surrogate node found becomes the "root" for the object • The related info, such as <O, S>, is stored there. (Diagram: server B4F8 publishes object 1234; the publish message is surrogate-routed through 8724, F734 and B234 to the root 6234, which stores <O:1234, S:B4F8>.)
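Since publish and locate follow the same route to the root, a toy sketch can share one path function (route_to_root and the in-memory dict are stand-ins, not the real protocol):

```python
pointers_at_root = {}    # object_id -> server_id, kept at each object's root

def publish(object_id, server_id, route_to_root):
    root = route_to_root(object_id)          # pretend-locate our own object
    pointers_at_root[object_id] = server_id  # root records <O, S>
    return root

def locate(object_id, route_to_root):
    route_to_root(object_id)                 # query takes the same path
    return pointers_at_root.get(object_id)   # root redirects to the server

publish("1234", "B4F8", route_to_root=lambda o: "6234")
print(locate("1234", route_to_root=lambda o: "6234"))   # -> B4F8
```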
Locating/Publishing : Fault-tolerance & Locality • Multiple "roots" (better than Plaxton): map the object ID to several roots, so publish/locate can proceed simultaneously • Caching of the 2-tuple <O, S>: clients can pick up <O, S> on the way to the root • An intermediate node may hold several <O, S> entries for the same object; the nearest server is chosen (sketch below)
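A small sketch of the "nearest <O, S> wins" rule at an intermediate node; the server IDs and hop counts below are made up:

```python
def nearest_pointer(pointers, distance_to):
    """A node holding several <O, S> pointers for the same object
    answers with the closest replica."""
    return min(pointers, key=distance_to)

replica_servers = ["B346", "77AB", "9098"]
hops = {"B346": 12, "77AB": 3, "9098": 40}   # hypothetical distances
print(nearest_pointer(replica_servers, hops.get))   # -> 77AB
```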
Insert a new node: basic procedure • Get a node ID • Begin with a "gateway" node G • Pretend to route to the new node's own ID • Establish a nearly optimal neighbor map during this "pseudo-routing" by copying entries and choosing the nearest ones (see the sketch below) • Go back and notify neighbors. (Diagram: new node 1234 joins via gateway B4F8, surrogate-routing through 8724, F734, B234 and 6234.)
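A hedged outline of the insertion walk; route_step and copy_level_entries are hypothetical helpers standing in for the real pseudo-routing machinery:

```python
def insert_node(new_id, gateway, route_step, copy_level_entries):
    """Pseudo-route from the gateway toward new_id's own ID; at the i-th
    hop, copy that node's level-i entries as this node's starting map."""
    neighbor_map, node, level = [], gateway, 0
    while node is not None:
        neighbor_map.append(copy_level_entries(node, level))
        node = route_step(node, new_id, level)   # match one more digit
        level += 1
    # Then: refine each entry to the nearest candidate, notify neighbors.
    return neighbor_map
```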
Delete a node • The simplest operation • Either explicitly notify the neighbors via back pointers • Or rely on soft state: simply stop sending "heartbeat" and republish messages
Maintain System Consistency • Components of a Tapestry node: • Neighbor map • Back pointers • Object-location pointers <object, node> • Hotspot monitor <object, node, freq> • Object store • Maintaining correct state: • Soft state • Proactive explicit updates (a sketch of the per-node state follows)
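The slide's component list maps naturally onto a per-node record; this dataclass uses our own field names, not the paper's:

```python
from dataclasses import dataclass, field

@dataclass
class TapestryNode:
    """State kept by one Tapestry node, per the component list above."""
    node_id: str
    neighbor_map: list = field(default_factory=list)     # routing table
    back_pointers: set = field(default_factory=set)      # who links to us
    object_pointers: dict = field(default_factory=dict)  # object -> node
    hotspot_monitor: dict = field(default_factory=dict)  # (object, node) -> freq
    object_store: dict = field(default_factory=dict)     # locally held data
```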
Soft state • Advantages • Easy to implement • Suited to slowly changing systems • Disadvantages • Trade-off between bandwidth overhead and level of consistency • Not suited to fast-changing systems • Example: the republish traffic for a single server can reach 1400 MB (!) in one interval (see the back-of-envelope below)
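The 1400 MB figure implies arithmetic like the following; both parameter values are guesses, since the slide does not state them:

```python
# Back-of-envelope: republish traffic per soft-state interval.
objects_published = 10_000_000      # assumed objects on one server
bytes_per_republish = 140           # assumed size of one republish message
total_mb = objects_published * bytes_per_republish / 1e6
print(f"republish traffic per interval: {total_mb:.0f} MB")   # 1400 MB
```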
Proactive explicit updates (PEU) • Epoch number: a sequence number for the update rounds • Expanded 3-tuple: <object ID, server ID, LastHopID> • Soft state is kept as a backup of last resort (sketch below)
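A sketch of how epoch numbers let nodes discard stale republishes after a move; the function name and storage are hypothetical:

```python
latest_epoch = {}   # object_id -> newest epoch seen so far

def on_republish(object_id, server_id, last_hop_id, epoch):
    """Accept the expanded 3-tuple only if it is from the newest round."""
    if epoch >= latest_epoch.get(object_id, -1):
        latest_epoch[object_id] = epoch
        return (object_id, server_id, last_hop_id)   # store / forward
    return None                                      # stale round: drop
```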
PEU : Node Mobility. (Diagram: object 123 moves from server A to server B; B republishes (123, B) toward the root, while the stale pointer (123, A) is deleted along the old path, found via the LastHopID field, through nodes C, D, E and F.)
PEU : Recover location pointers. (Diagram: a server's exit notification triggers deletion of the old data along the path and reconstruction of the pointer as (O, S, B) toward the root, across nodes A through F.)
Introspective Optimization : Adapting to a changing environment • Load balancing: a refresher thread periodically pings neighbors and updates neighbor pointers • Hotspots: find the source of heavy traffic, the "hotspot", and publish the desired data near it
Evaluation • Gains • Good locality • Low location latency • High stability • High fault-tolerance • Costs • Bandwidth overhead linear in the number of replicas
Implementation • A packet-level simulator has been implemented in C • Used to support other applications, such as OceanStore and Bayeux, an application-level multicast protocol • Future work • Security issues • Mobile-IP-like functionality
Summary • Urgent need for a new location/routing scheme • Features of Tapestry • Location-independent naming • Integration of location and routing • Content-based routing • Support for dynamic environments: inserting/deleting/moving nodes and objects
Comments and Questions • Paradox or discrepancy? If the underlying IP scales poorly, how can Tapestry achieve high scalability? (Raised here just for discussion!) • What is the relation between IP and Tapestry? Tapestry does not intend to replace IP; it builds a higher-level locating & routing infrastructure on top of it to support content-based operations. • How could we achieve the same goal without IP?