
Querying the Internet with PIER (PIER = Peer-to-peer Information Exchange and Retrieval)


Presentation Transcript


  1. Querying the Internet with PIER (PIER = Peer-to-peer Information Exchange and Retrieval)

  2. What is PIER? • Peer-to-Peer Information Exchange and Retrieval • A query engine that runs on top of a P2P network • A step toward distributed query processing at a larger scale • A way to achieve massive distribution: querying heterogeneous data • The architecture combines traditional database query processing with recent peer-to-peer technologies

  3. Key goal is a scalable indexing system for large-scale, decentralized storage applications on the Internet • Examples: P2P large-scale storage management systems (OceanStore, Publius), wide-area name resolution services

  4. What is Very Large? Depends on who you are • Database community: single site, clusters, distributed systems of 10’s – 100’s of nodes • Network community: Internet-scale systems of 1000’s – millions of nodes • The challenge: how to run DB-style queries at Internet scale!

  5. What are the Key Properties? • Lots of data that is: • Naturally distributed (where it’s generated) • Centralized collection undesirable • Homogeneous in schema • Data is more useful when viewed as a whole

  6. Who Needs Internet Scale? Example 1: Filenames • Simple ubiquitous schemas: filenames, sizes, ID3 tags • Born from early P2P systems such as Napster, Gnutella, etc. • Content is shared by “normal” non-expert users… home users • Systems were built by a few individuals ‘in their garages’ → low barrier to entry

  7. Example 2: Network Traces • Schemas are mostly standardized: • IP, SMTP, HTTP, SNMP log formats • Network administrators are looking for patterns within their site AND with other sites: • DoS attacks cross administrative boundaries • Tracking virus/worm infections • Timeliness is very helpful • Might surprise you how useful it is: • Network bandwidth on PlanetLab (world-wide distributed research test bed) is mostly filled with people monitoring the network status

  8. Our Challenge • Our focus is on the challenge of scale: • Applications are homogeneous and distributed • Already have significant interest • Provide a flexible framework for a wide variety of applications

  9. Four Design Principles (I) • Relaxed consistency • ACID transactions severely limit the scalability and availability of distributed databases • We provide best-effort results • Organic scaling • Applications may start small, without a priori knowledge of their eventual size

  10. Four Design Principles (II) • Natural habitat • No CREATE TABLE/INSERT • No “publish to web server” • Wrappers or gateways allow the information to be accessed where it is created • Standard schemas via grassroots software • Data is produced by widespread software, providing a de facto schema to utilize

  11. Architecture layers: Declarative Queries → Query Plan → Overlay Network → Physical Network (overlay based on CAN)

  12. Applications • P2P databases: highly distributed and available data • Network monitoring: intrusion detection, fingerprint queries

  13. DHTs • Implemented with CAN (Content Addressable Network) • Each node is identified by a hyper-rectangle (zone) in d-dimensional space • A key is hashed to a point and stored at the node whose zone contains that point • Each node maintains a routing table of its neighbors, of size O(d)

  14. [Diagram: a 16×16 coordinate space with corners (0,0), (16,0), (0,16), (16,16); a data item with key (15,14) is routed to the node responsible for that point.] Given a message with an ID, route the message to the computer currently responsible for that ID.

  15. DHT Design • Routing Layer: maps keys to nodes (dynamic as nodes leave and join) • Storage Manager: stores DHT-based data • Provider: storage access interface for higher levels

  16. DHT – Routing The routing layer maps a key to the IP address of the node currently responsible for that key. It provides exact lookups and calls back to higher levels when the set of keys has changed. Routing layer API: lookup(key) → ipaddr (asynchronous function), join(landmarkNode), leave(), locationMapChange()
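The routing API above can be sketched as a small class. This is a hypothetical toy, not PIER's code: the method names follow the slide, but the node-selection rule here is a simple "nearest id" stand-in for real CAN routing, and the deterministic hash is an assumption.

```python
# Toy sketch of the slide's routing-layer API (lookup/join/leave).
class RoutingLayer:
    def __init__(self):
        self.nodes = {}                      # node id -> IP address

    def join(self, node_id, ipaddr):         # join(landmarkNode) in the slide
        self.nodes[node_id] = ipaddr

    def leave(self, node_id):
        self.nodes.pop(node_id, None)

    def lookup(self, key):
        """Map a key to the IP of the node currently responsible for it."""
        kid = sum(key.encode()) % 100        # deterministic toy hash
        owner = min(self.nodes, key=lambda n: abs(n - kid))
        return self.nodes[owner]

r = RoutingLayer()
r.join(10, "10.0.0.1")
r.join(50, "10.0.0.2")
print(r.lookup("filename.mp3"))  # key hashes near id 50 -> 10.0.0.2
```

As nodes join and leave, the same `lookup` call can return a different owner, which is why the real API notifies higher layers via `locationMapChange()`.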

  17. DHT – Storage The Storage Manager stores and retrieves records, which consist of key/value pairs. Keys are used to locate items and can be any supported data type or structure. Storage Manager API: store(key, item), retrieve(key) → item, remove(key)
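A minimal sketch of that Storage Manager API, assuming an in-memory map (the real component manages the node's share of DHT-partitioned data):

```python
# Minimal in-memory sketch of the slide's store/retrieve/remove API.
class StorageManager:
    def __init__(self):
        self.items = {}                      # key -> item

    def store(self, key, item):
        self.items[key] = item

    def retrieve(self, key):
        return self.items.get(key)           # None if absent

    def remove(self, key):
        self.items.pop(key, None)

sm = StorageManager()
sm.store(("traces", "10.0.0.7"), {"bytes": 4096})
print(sm.retrieve(("traces", "10.0.0.7")))   # {'bytes': 4096}
sm.remove(("traces", "10.0.0.7"))
print(sm.retrieve(("traces", "10.0.0.7")))   # None
```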

  18. DHT – Provider (1) The Provider ties together the routing and storage manager layers and provides an interface • Each object in the DHT has a namespace, resourceID and instanceID • DHT key = hash(namespace, resourceID) • namespace – application or group of objects, e.g. a table or relation • resourceID – primary key or any attribute (object) • instanceID – integer, to separate items with the same namespace and resourceID • Lifetime – item storage duration • CAN’s mapping of resourceID/object is equivalent to an index

  19. DHT – Provider (2) Provider API: get(namespace, resourceID) → item, put(namespace, resourceID, item, lifetime), renew(namespace, resourceID, instanceID, lifetime) → bool, multicast(namespace, resourceID, item), lscan(namespace) → items, newData(namespace, item)
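The key scheme from the previous slide (DHT key = hash(namespace, resourceID), with instanceID separating items that share both) can be sketched like this. The 32-bit id space, the plain dict standing in for the DHT, and the helper names are assumptions, not PIER's actual code:

```python
import hashlib

# Sketch: DHT key = hash(namespace, resourceID); instanceID disambiguates.
def dht_key(namespace, resource_id):
    digest = hashlib.sha1(f"{namespace}:{resource_id}".encode()).hexdigest()
    return int(digest, 16) % 2**32           # fold into an assumed 32-bit space

table = {}                                   # stands in for the DHT

def put(namespace, resource_id, instance_id, item, lifetime):
    table[(dht_key(namespace, resource_id), instance_id)] = (item, lifetime)

def get(namespace, resource_id):
    k = dht_key(namespace, resource_id)
    return [item for (key, _inst), (item, _life) in table.items() if key == k]

put("filenames", "song.mp3", 0, {"size": 3041}, lifetime=60)
put("filenames", "song.mp3", 1, {"size": 3041}, lifetime=60)
print(get("filenames", "song.mp3"))          # two instances under one key
```

Because the key ignores instanceID, a `get` naturally returns every instance published under the same namespace and resourceID.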

  20. Query Processor • How does it work? • Performs selection, projection, joins, grouping, aggregation → operators • Operators push and pull data • Multiple operators pipelined together execute simultaneously • Results are produced and queued as quickly as possible • How does it modify data? • Inserts, updates and deletes items via the DHT interface • How does it select data to process? • Dilated-reachable snapshot: data published by reachable nodes at the query arrival time
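The pipelined-operator idea above can be sketched with generators: each operator pulls tuples from its child and yields results as soon as they are produced, so no operator waits for its input to finish. The operator and field names are illustrative:

```python
# Pipelined operators as generators: results stream out tuple by tuple.
def scan(rows):
    for r in rows:
        yield r

def select(child, pred):
    for r in child:
        if pred(r):
            yield r

def project(child, cols):
    for r in child:
        yield {c: r[c] for c in cols}

rows = [{"ip": "10.0.0.1", "port": 80}, {"ip": "10.0.0.2", "port": 22}]
plan = project(select(scan(rows), lambda r: r["port"] == 80), ["ip"])
print(list(plan))  # [{'ip': '10.0.0.1'}]
```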

  21. Join Algorithms • Limited bandwidth • Symmetric Hash Join: rehashes both tables • Semi-joins: transfer only matching tuples • At 40% selectivity, the bottleneck switches from the computation nodes to the query sites
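A symmetric hash join keeps a hash table per input; each arriving tuple is inserted into its own side's table and probes the other side's, so matches stream out without waiting for either input to finish. A minimal sketch, assuming a tagged stream of arrivals (the input format and names are illustrative):

```python
from collections import defaultdict

# Symmetric hash join over a stream of ('L', tuple) / ('R', tuple) arrivals.
def symmetric_hash_join(stream, key):
    left, right = defaultdict(list), defaultdict(list)
    for side, t in stream:
        k = t[key]
        if side == "L":
            left[k].append(t)                # build own side
            for match in right[k]:           # probe the other side
                yield (t, match)
        else:
            right[k].append(t)
            for match in left[k]:
                yield (match, t)

arrivals = [("L", {"ip": "10.0.0.1", "pkts": 5}),
            ("R", {"ip": "10.0.0.1", "alert": "scan"}),
            ("L", {"ip": "10.0.0.1", "pkts": 9})]
for pair in symmetric_hash_join(arrivals, "ip"):
    print(pair)
```

Note the bandwidth cost the slide alludes to: both relations are rehashed in full, which is what semi-join variants try to avoid.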

  22. Future Research • Routing, Storage and Layering • Catalogs and Query Optimization • Hierarchical Aggregations • Range Predicates • Continuous Queries over Streams • Sharing between Queries • Semi-structured Data

  23. Distributed Hash Tables (DHTs) • What is a DHT? • Take an abstract ID space, and partition among a changing set of computers (nodes) • Given a message with an ID, route the message to the computer currently responsible for that ID • Can store messages at the nodes • This is like a “distributed hash table” • Provides a put()/get() API • Cheap maintenance when nodes come and go
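The "partition an abstract ID space among nodes" idea with a put()/get() API can be sketched as follows. This toy uses consistent hashing on a ring as a stand-in for a real DHT (PIER uses CAN's coordinate space instead), and the class and node names are hypothetical:

```python
import hashlib
import bisect

# Toy DHT: keys and nodes hash into one 16-bit id space; a key belongs to
# the first node id at or after it on the ring (wrapping around).
class TinyDHT:
    def __init__(self, node_names):
        self.ring = sorted((self._id(n), n) for n in node_names)
        self.store = {n: {} for n in node_names}

    @staticmethod
    def _id(s):
        return int(hashlib.sha1(s.encode()).hexdigest(), 16) % 2**16

    def _owner(self, key):
        ids = [i for i, _ in self.ring]
        idx = bisect.bisect_left(ids, self._id(key)) % len(self.ring)
        return self.ring[idx][1]

    def put(self, key, value):
        self.store[self._owner(key)][key] = value

    def get(self, key):
        return self.store[self._owner(key)].get(key)

d = TinyDHT(["nodeA", "nodeB", "nodeC"])
d.put("trace:42", "SYN flood")
print(d.get("trace:42"))  # SYN flood
```

The "cheap maintenance" property comes from this ownership rule: when a node joins or leaves, only the keys in its slice of the id space move.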

  24. Distributed Hash Tables (DHTs) • Lots of effort is put into making DHTs better: • Scalable (thousands  millions of nodes) • Resilient to failure • Secure (anonymity, encryption, etc.) • Efficient (fast access with minimal state) • Load balanced • etc.

  25. PIER’s Three Uses for DHTs • Single elegant mechanism with many uses: • Search: Index • Like a hash index • Partitioning: Value (key)-based routing • Like Gamma/Volcano • Routing: Network routing for QP messages • Query dissemination • Bloom filters • Hierarchical QP operators (aggregation, join, etc) • Not clear there’s another substrate that supports all these uses

  26. Metrics • We are primarily interested in 3 metrics: • Answer quality (recall and precision) • Bandwidth utilization • Latency • Different DHTs provide different properties: • Resilience to failures (recovery time)  answer quality • Path length  bandwidth & latency • Path convergence  bandwidth & latency • Different QP Join Strategies: • Symmetric Hash Join, Fetch Matches, Symmetric Semi-Join, Bloom Filters, etc. • Big Picture: Tradeoff bandwidth (extra rehashing) and latency

  27. Symmetric Hash Join (SHJ)

  28. Fetch Matches (FM)

  29. Symmetric Semi Join (SSJ) • Both R and S are projected to save bandwidth • The complete R and S tuples are fetched in parallel to improve latency
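The projection idea above can be sketched in a centralized toy: join only (key, tupleID) projections of R and S, then fetch the complete tuples for the matches. In PIER the projections are rehashed through the DHT and the fetches go over the network; here plain lookups stand in for both, and all names are illustrative:

```python
# Toy symmetric semi-join: join projected (key, id) pairs, then "fetch"
# the full tuples only for matching ids.
def semi_join(r_rows, s_rows, key):
    r_proj = {(row[key], i) for i, row in enumerate(r_rows)}
    s_proj = {(row[key], j) for j, row in enumerate(s_rows)}
    matches = [(i, j) for k1, i in r_proj for k2, j in s_proj if k1 == k2]
    # fetch step: in PIER these lookups happen in parallel over the DHT
    return [(r_rows[i], s_rows[j]) for i, j in matches]

R = [{"ip": "10.0.0.1", "pkts": 5}, {"ip": "10.0.0.9", "pkts": 2}]
S = [{"ip": "10.0.0.1", "alert": "scan"}]
print(semi_join(R, S, "ip"))
```

The bandwidth saving is in the first phase: only narrow projections travel, and full tuples move only when they actually join.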

  30. Overview • CAN is a distributed system that maps keys onto values • Keys hashed into d dimensional space • Interface: • insert(key, value) • retrieve(key)

  31. Overview [Diagram: state of the system at time t — a 2-dimensional (x, y) coordinate space partitioned into zones, each owned by a peer; resources are points in the space.] In this 2-dimensional space, a key is mapped to a point (x, y).

  32. Design • d-dimensional Cartesian coordinate space (d-torus) • Every node owns a distinct zone • Key k1 is mapped onto a point p1 using a uniform hash function • (k1, v1) is stored at the node Nx that owns the zone containing p1
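The key-to-point mapping and zone ownership can be sketched like this, reusing the 16×16 space from the earlier diagram. Deriving coordinates from SHA-1 bytes and the rectangle layout are assumptions for the sketch, not CAN's exact scheme:

```python
import hashlib

# Map a key to a point in a d-dimensional space of side 16 via a uniform hash.
def key_to_point(key, d=2, side=16):
    digest = hashlib.sha1(key.encode()).digest()
    return tuple(digest[i] % side for i in range(d))

def owner(zones, p):
    """zones: node -> ((x0, y0), (x1, y1)) half-open rectangles."""
    for node, ((x0, y0), (x1, y1)) in zones.items():
        if x0 <= p[0] < x1 and y0 <= p[1] < y1:
            return node

# Two nodes splitting the space down the middle (illustrative layout).
zones = {"A": ((0, 0), (8, 16)), "B": ((8, 0), (16, 16))}
p = key_to_point("k1")
print(p, owner(zones, p))
```

Because the hash is uniform, keys spread evenly over the space, so zone volume roughly determines each node's storage load.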

  33. Each node maintains a routing table of its neighbors, e.g. node A holds {B, C, E, D} • Messages follow the straight-line path through the Cartesian space

  34. Routing [Diagram: a query Q(x, y) routed across zones to the peer owning the resource] • d-dimensional space with n zones • Two zones are neighbors if they overlap along d−1 dimensions • Routing path length: O(d · n^(1/d)) hops • Algorithm: choose the neighbor nearest to the destination (x, y)
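The greedy rule, "at each hop forward to the neighbor nearest the destination", can be sketched on a toy 2×2 grid of zones. The zone centers, adjacency, and stopping rule are assumptions for illustration:

```python
# Greedy CAN-style routing over zone centers.
def dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def greedy_route(centers, neighbors, start, dest_point):
    """centers: node -> zone center; neighbors: node -> adjacent nodes."""
    path, cur = [start], start
    while True:
        best = min(neighbors[cur], key=lambda n: dist(centers[n], dest_point))
        if dist(centers[best], dest_point) >= dist(centers[cur], dest_point):
            return path              # no neighbor is closer: cur owns the point
        path.append(best)
        cur = best

# Four zones in a 16x16 space, centers at the middle of each quadrant.
centers = {"A": (4, 4), "B": (12, 4), "C": (4, 12), "D": (12, 12)}
neighbors = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
print(greedy_route(centers, neighbors, "A", (15, 14)))  # ['A', 'B', 'D']
```

Each hop strictly decreases the distance to the destination, which is why the expected path length grows only as O(d · n^(1/d)).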

  35. CAN: construction • A new node joins via a bootstrap node

  36. CAN: construction • 1) Discover some node “I” already in the CAN

  37. CAN: construction • 2) Pick a random point (x,y) in the space

  38. CAN: construction • 3) I routes to (x,y), discovers node J

  39. CAN: construction • 4) Split J’s zone in half… the new node owns one half

  40. Maintenance • Zone takeover handles node failure or departure • Each node sends its neighbor table to its neighbors at discrete time intervals t to signal it is alive • If a neighbor does not report within time t, take over its zone • Zone reassignment may then be needed
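The failure-detection and takeover rule above can be sketched as follows. The timestamp bookkeeping and helper names are illustrative, and real CAN picks the takeover candidate by zone size rather than taking the first survivor:

```python
# Heartbeat-based failure detection and zone takeover (toy version).
def detect_failed(last_heartbeat, now, t):
    """Nodes silent for longer than interval t are presumed failed."""
    return [n for n, ts in last_heartbeat.items() if now - ts > t]

def takeover(zones, failed, survivor):
    """Survivor absorbs each failed node's zone(s), pending reassignment."""
    for n in failed:
        zones[survivor].extend(zones.pop(n))

zones = {"A": [((0, 0), (8, 16))], "B": [((8, 0), (16, 16))]}
beats = {"A": 100, "B": 85}                  # B last heard from at time 85
dead = detect_failed(beats, now=100, t=10)   # B has been silent for 15 > 10
takeover(zones, dead, "A")
print(dead, zones["A"])
```

After a takeover the surviving node temporarily owns two zones, which is exactly the situation the zone-reassignment step on the following slides cleans up.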

  41. Node Departure • Someone has to take over the zone • Explicit hand-over of the zone to one of its neighbors • Merge into a valid zone if possible • If not possible, the two zones are temporarily handled by the smallest neighbor

  42. Zone reassignment [Diagram: the partition tree and the corresponding zoning for zones 1–4.]

  43. Zone reassignment [Diagram: after reassignment, the partition tree and zoning with zones 1, 3 and 4.]

  44. Design Improvements • Multi-Dimension • Multi-Coordinate Spaces • Overloading the Zones • Multiple Hash Functions • Topologically Sensitive Construction • Uniform Partitioning • Caching
