
Peer-to-Peer Databases



Presentation Transcript


  1. Peer-to-Peer Databases. David Andersen, Advanced Databases

  2. What is Peer-to-Peer? • Shared Resources: Each peer shares its resources with others, acting as both a client and a server. • Decentralization and Self-organization: Peers coordinate their activities with other peers rather than with a centralized server. • Autonomy: Peers are free to come and go at will.

  3. Napster • Hybrid P2P • Data was stored on the peers, but a central server maintained an index of file locations. • File sharing, not a DBMS.
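
The hybrid arrangement on this slide can be sketched as follows. This is a minimal illustration, not Napster's actual protocol: the `CentralIndex` class, its method names, and the peer and file names are all hypothetical. The point it shows is that only the lookup goes through the server, while the data itself stays on the peers.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index: maps a file name to the peers that hold it."""

    def __init__(self):
        self._locations = defaultdict(set)

    def register(self, peer, filename):
        # A peer announces that it stores a file; the file itself is not uploaded.
        self._locations[filename].add(peer)

    def unregister_peer(self, peer):
        # Peers may leave at will, so the index must forget them.
        for holders in self._locations.values():
            holders.discard(peer)

    def lookup(self, filename):
        # The caller would then fetch the file directly from one of these peers.
        return sorted(self._locations.get(filename, set()))


index = CentralIndex()
index.register("peer-a", "song.mp3")
index.register("peer-b", "song.mp3")
index.register("peer-b", "talk.mp3")
holders = index.lookup("song.mp3")
index.unregister_peer("peer-a")
```

The single index is also the single point of failure, which is what separates this design from the pure P2P approach on the next slides.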

  4. Gnutella • True P2P: a peer need only know one other peer to join. [Figure: The Gnutella Protocol]

  5. Gnutella • Uses Flooding: Queries hop from peer to peer. A TTL (time-to-live) sent with the query prevents eternal searching. • Very High Bandwidth Usage. • File Sharing – Not a DBMS
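
Flooding with a TTL can be sketched as a breadth-first walk over the peer graph. This is a simplified model under stated assumptions, not the Gnutella wire protocol: `flood_query`, the `graph` and `files` dictionaries, and the message counting are all illustrative. Each hop forwards the query to every neighbor, peers that have already seen the query drop it, and the walk stops once the TTL is used up.

```python
def flood_query(graph, files, start, wanted, ttl):
    """Flood a query from `start`; return (peers holding `wanted`, messages sent).

    graph: peer -> list of neighboring peers (assumed symmetric)
    files: peer -> set of file names stored at that peer
    """
    seen = {start}
    frontier = [start]
    hits = []
    messages = 0
    while frontier and ttl > 0:
        next_frontier = []
        for peer in frontier:
            for neighbor in graph[peer]:
                messages += 1  # every forwarded copy costs bandwidth
                if neighbor in seen:
                    continue  # duplicate copies are dropped
                seen.add(neighbor)
                if wanted in files.get(neighbor, ()):
                    hits.append(neighbor)
                next_frontier.append(neighbor)
        frontier = next_frontier
        ttl -= 1  # one hop consumed
    return hits, messages


# A chain of four peers: a - b - c - d
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
files = {"c": {"x"}, "d": {"x"}}
short_search = flood_query(graph, files, "a", "x", ttl=2)
long_search = flood_query(graph, files, "a", "x", ttl=3)
```

With `ttl=2` the query never reaches peer `d`, so a copy of the file is missed; raising the TTL finds it at the cost of more messages, which is the bandwidth trade-off the slide points to.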

  6. P2P and Databases • Advantages • No Bottlenecks • Vast Resources Available • Improved Scalability • Improved Robustness • Less Management • Access to a tremendous amount of data

  7. P2P and Databases • Challenges • Coordinating Semantics • Query Processing Efficiency • Topology/Bandwidth Considerations • Indexing • Replication • Performing Updates and Avoiding Stale Data • Security - Access Control and Peer Reputation

  8. Case Study – Hyperion Project • Each peer has its own local DBMS. • A PeerDBMS layer augments the local DBMS to support peer-to-peer functionality. • Peers can form acquaintances. • Metadata is exchanged, and the semantics of the peer acquaintance are mapped onto the local system. • Uses pair-wise mappings to resolve queries.

  9. The Hyperion PDBMS • Query Service • Handles Local Queries • Uses Mapping Tables to Rewrite or Translate Queries destined for Remote Databases • Peer Coordination Service • Manages and Executes Updates • Uses Event-Condition-Action Rules
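
An Event-Condition-Action rule can be sketched in a few lines. This is a generic illustration of the ECA pattern and makes no claim about Hyperion's actual rule language: `EcaRule`, `Coordinator`, the event string `insert:flights`, and the row fields are all hypothetical. When an event arrives, each rule registered for it checks its condition and, if it holds, runs its action.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EcaRule:
    event: str                          # which event the rule listens for
    condition: Callable[[dict], bool]   # test applied to the affected row
    action: Callable[[dict], None]      # what to do when the test passes


class Coordinator:
    """Hypothetical stand-in for a peer coordination service."""

    def __init__(self):
        self.rules = []

    def add_rule(self, rule):
        self.rules.append(rule)

    def notify(self, event, row):
        # Run every matching rule and report how many fired.
        fired = 0
        for rule in self.rules:
            if rule.event == event and rule.condition(row):
                rule.action(row)
                fired += 1
        return fired


remote_flights = []  # stand-in for a table at an acquainted peer

coordinator = Coordinator()
coordinator.add_rule(EcaRule(
    event="insert:flights",
    condition=lambda row: row["dest"] == "YYZ",
    action=lambda row: remote_flights.append(dict(row)),
))

fired_yyz = coordinator.notify("insert:flights", {"fno": "A100", "dest": "YYZ"})
fired_lhr = coordinator.notify("insert:flights", {"fno": "A200", "dest": "LHR"})
```

Only the first insert satisfies the condition, so only that row is propagated to the stand-in remote table.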

  10. The Hyperion PDBMS • P2P User Interface • Local and peer queries are posed through the interface • The user is unaware of differing semantics at the peer • Peer Manager: Messaging system used to communicate with peers • Acquaintance Manager: Manages the exchange of schemas, mapping tables, and rules for updating data

  11. Hyperion Mapping Tables [Figure: Table from Airline ‘A’, Table from Airline ‘B’, Mapping Tables]
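
Since the figure itself did not survive, here is a rough sketch of how a mapping table could be used to translate a selection from one airline's vocabulary into the other's. Everything in it is assumed for illustration: the flight codes, the `flights` table, the `flight_code` column, and both function names are hypothetical and are not taken from Hyperion.

```python
# Hypothetical mapping table: each row pairs a flight code used at Airline A
# with a corresponding code used at Airline B. One code may map to several.
MAPPING_A_TO_B = [
    ("A100", "B750"),
    ("A100", "B751"),
    ("A200", "B900"),
]


def translate_values(mapping, local_values):
    """Return the remote values that the mapping table associates with local ones."""
    wanted = set(local_values)
    return sorted({remote for local, remote in mapping if local in wanted})


def rewrite_selection(mapping, remote_column, local_values):
    """Rewrite a selection on local values into a parameterised query for the peer.

    Returns None when the mapping table has no counterpart for any value.
    """
    remote_values = translate_values(mapping, local_values)
    if not remote_values:
        return None
    placeholders = ", ".join("?" for _ in remote_values)
    sql = f"SELECT * FROM flights WHERE {remote_column} IN ({placeholders})"
    return sql, remote_values


rewritten = rewrite_selection(MAPPING_A_TO_B, "flight_code", ["A100"])
unmapped = rewrite_selection(MAPPING_A_TO_B, "flight_code", ["A999"])
```

Values with no entry in the mapping table produce no remote query at all, which is one plausible way to treat data the two peers have not agreed on.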

  12. Case Study – The Piazza Project Project Goals • Focus on developing query reformulation algorithms • Assist in defining mappings • Indexing • Enforcing access control

  13. Piazza Schema Mappings • Two types of mappings • Peer Description: Relates two or more peer schemas. Example: DBProjects:Member(pName, member) = UW:Member(mid, pid, member), UW:Project(pid, pName) • Storage Description: Relates the data stored at a peer to that peer’s view of the world. Example: UPenn:student(sid, name, advisor) UPenn:Student(sid, name), UPenn:Advisor(sid, fid), UPenn:Faculty(fid, advisor)
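
The peer description example reads as a conjunctive query: the right-hand side joins UW:Member and UW:Project on the shared variable pid and keeps only pName and member. The sketch below evaluates that join over small in-memory relations. The function name and the sample rows are made up for illustration, and this is not Piazza's reformulation algorithm, only the meaning of this one mapping.

```python
from collections import defaultdict


def dbprojects_member(uw_member, uw_project):
    """Compute DBProjects:Member(pName, member) from the two UW relations.

    uw_member:  iterable of (mid, pid, member) tuples
    uw_project: iterable of (pid, pName) tuples
    """
    names_by_pid = defaultdict(list)
    for pid, p_name in uw_project:
        names_by_pid[pid].append(p_name)
    # Join on pid, then project onto (pName, member); a set removes duplicates.
    joined = {
        (p_name, member)
        for _mid, pid, member in uw_member
        for p_name in names_by_pid.get(pid, [])
    }
    return sorted(joined)


# Made-up sample data; pid 99 has no matching project and drops out of the join.
uw_member = [(1, 10, "Ann"), (2, 10, "Bob"), (3, 20, "Cy"), (4, 99, "Dee")]
uw_project = [(10, "ProjectX"), (20, "ProjectY")]
members = dbprojects_member(uw_member, uw_project)
```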

  14. Piazza Query Reformulation Example

  15. Piazza Indexing • Challenge: How to send a query to the peers most likely to have the answer and avoid flooding the entire network. • Piazza attempts to index schema and value mappings. • The current implementation is centralized. • Peers upload summaries, at differing granularities, of the data they possess. • Peers periodically refresh their data summaries at the index.

  16. Piazza Indexing • Peers upload attribute-value pairs. • The index maintains a table of these pairs together with the object id of their origin. • Users query the index, which returns the objects that contain at least a partial match. • An example of an object that is indexed: s2 = [name = "Por%", age IN [50, 70], disease = "tuberculosis", type = "%"]
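
A table of attribute-value pairs keyed back to object ids is essentially an inverted index. The sketch below is a simplified assumption of how such an index might behave, not Piazza's implementation: it handles exact values only (no wildcards or ranges like those in the slide's example), and the class name, object ids, and attribute values are hypothetical.

```python
from collections import defaultdict


class AttributeValueIndex:
    """Hypothetical centralized index of (attribute, value) pairs."""

    def __init__(self):
        self._postings = defaultdict(set)  # (attribute, value) -> object ids

    def upload(self, object_id, pairs):
        # A peer uploads a summary of one object as attribute-value pairs.
        for attribute, value in pairs.items():
            self._postings[(attribute, value)].add(object_id)

    def query(self, pairs):
        """Return (object_id, matched_pairs) for every object matching at least
        one pair, best match first."""
        counts = defaultdict(int)
        for attribute, value in pairs.items():
            for object_id in self._postings.get((attribute, value), ()):
                counts[object_id] += 1
        return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


index = AttributeValueIndex()
index.upload("o1", {"disease": "tuberculosis", "type": "inpatient"})
index.upload("o2", {"disease": "tuberculosis", "type": "outpatient"})
index.upload("o3", {"disease": "asthma", "type": "inpatient"})
matches = index.query({"disease": "tuberculosis", "type": "inpatient"})
```

Ranking by the number of matched pairs is a design choice made here for illustration; the slide only says that partial matches are returned.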

  17. Update Management • Data is often replicated in traditional distributed databases. • The problem is to avoid reading stale data. • Technique: Use Read Consensus and Write Consensus. • Example: Write to a majority of replicas before committing an update, and/or read from a majority and accept the newest version.
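
The majority example can be sketched with version-stamped replicas. This is a minimal, single-process illustration with hypothetical names (`Replica`, `quorum_write`, `quorum_read`); it ignores failures, concurrency, and networking. The caller chooses which replicas to contact, which makes it easy to see that any two majorities share at least one replica.

```python
class Replica:
    """One copy of a data item, stamped with a version number."""

    def __init__(self):
        self.version = 0
        self.value = None


def quorum_write(replicas, contacted, value):
    """Write `value` to the contacted replicas with a version newer than any
    version seen among them."""
    targets = [replicas[i] for i in contacted]
    new_version = max(r.version for r in targets) + 1
    for r in targets:
        r.version, r.value = new_version, value
    return new_version


def quorum_read(replicas, contacted):
    """Read from the contacted replicas and accept the newest version."""
    newest = max((replicas[i] for i in contacted), key=lambda r: r.version)
    return newest.value, newest.version


replicas = [Replica() for _ in range(5)]   # N = 5, so a majority is 3
quorum_write(replicas, [0, 1, 2], "v1")
quorum_write(replicas, [2, 3, 4], "v2")    # overlaps the first write at replica 2

majority_read = quorum_read(replicas, [0, 1, 2])   # a majority: sees replica 2
minority_read = quorum_read(replicas, [0, 1])      # too few: misses the update
```

The majority read returns the newest value because it overlaps the second write at replica 2, while the two-replica read returns stale data.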

  18. Update Management • Quorum Consensus can work with P2P too, but not with a 100% guarantee: the actual number of replicas is not known, so setting a quorum is very difficult. • Allow users to set quorum thresholds and accept the consequences of their decisions.
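
Why an unknown replica count removes the guarantee can be shown with the standard overlap condition: a read quorum of size R and a write quorum of size W are certain to share a replica only when R + W > N. The numbers below are invented for illustration, and the function name is hypothetical.

```python
def quorums_must_intersect(read_quorum, write_quorum, replica_count):
    """True when every read quorum is guaranteed to overlap every write quorum."""
    return read_quorum + write_quorum > replica_count


# Thresholds chosen under the belief that there are 10 replicas.
read_quorum, write_quorum = 6, 6
safe_if_estimate_holds = quorums_must_intersect(read_quorum, write_quorum, 10)

# If peers have silently created more copies, the same thresholds no longer
# guarantee overlap, and a read may miss the latest write.
safe_with_15_replicas = quorums_must_intersect(read_quorum, write_quorum, 15)
```

With 10 replicas, 6 + 6 = 12 > 10, so the quorums must overlap. With 15 replicas, 12 is not greater than 15, so a read quorum and a write quorum can be disjoint.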

  19. Update Management • Trade-offs

  20. Questions?

  21. References • Flexible Update Management in Peer-to-Peer Database Systems, David Del Vecchio and Sang H. Son, Department of Computer Science, University of Virginia • An Overview on Peer-to-Peer Information Systems, Karl Aberer and Manfred Hauswirth, Swiss Federal Institute of Technology (EPFL), Switzerland • Data Sharing in the Hyperion Peer Database System, Patricia Rodríguez-Gianolli et al., Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005 • The Hyperion Project: From Data Integration to Data Coordination, Marcelo Arenas et al., SIGMOD Record, Vol. 32, No. 3, September 2003 • The Piazza Peer Data Management Project, Igor Tatarinov et al., SIGMOD Record, Vol. 32, No. 3, September 2003 • Distributed Query Processing in P2P Systems with Incomplete Schema Information, Marcel Karnstedt, Katja Hose, and Kai-Uwe Sattler, Department of Computer Science and Automation, TU Ilmenau, P.O. Box 100565, D-98684 Ilmenau, Germany
