Fast Searching in Peer-to-Peer Networks Self-Organizing Parallel Search Clusters Rocky Dunlap
Agenda • Peer-to-Peer Networks • Search Links/Index Links Model • Parallel Search Clusters • Self-Organizing Parallel Search Clusters • Further Research
Peer-to-Peer Networks • Peer = Client + Server • Anyone can send/process messages • Highly Distributed • Highly Parallel • Data-centric routing
P2P Networks – Two Types • Unstructured • “Loose” network structure • Requires less control of peers (casual searching) • Fault tolerance, churn • Keyword searching • Structured • Specific network structure • Distributed Hash Tables • Smart routing • Guarantees: bounded hops, bounded state, ability to search entire network
The Problems • Query saturation – every node processes every query • Query processing redundancy • Slow response time from distant nodes • In reality, cannot search entire network (TTL) • Need a model for studying P2P networks
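The saturation and redundancy problems above can be illustrated with a minimal sketch of TTL-limited flooding. The function name `flood`, the six-node `graph`, and the rule that a node forwards to all of its neighbors (including the one it heard from) are assumptions made for illustration, not details from the slides.

```python
from collections import deque

def flood(graph, origin, ttl):
    """Flood a query from `origin`; return (nodes reached, messages sent)."""
    reached = {origin}
    messages = 0
    frontier = deque([(origin, ttl)])
    while frontier:
        node, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL exhausted: the query dies here
        for neighbor in graph[node]:
            messages += 1  # every forward costs a message, even a redundant one
            if neighbor not in reached:
                reached.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return reached, messages

# Hypothetical topology: a 6-node ring with one extra chord (0-3).
graph = {
    0: [1, 5, 3],
    1: [0, 2],
    2: [1, 3],
    3: [2, 4, 0],
    4: [3, 5],
    5: [4, 0],
}

print(flood(graph, 0, 1))  # TTL 1 reaches only the direct neighbors
print(flood(graph, 0, 2))  # TTL 2 reaches all 6 nodes but sends 10 messages
```

With TTL 2 the query reaches five other nodes at a cost of ten messages, so half of the traffic is redundant; with TTL 1 part of the network is never searched at all.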
SIL Model • Search Links (forwarding) • Index Links (non-forwarding)
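One way to picture the two link types is as two separate neighbor lists per peer. The sketch below is only a reading of the slide: it assumes a search link means "forward queries to this peer" and an index link means "send a copy of my index to this peer, who stores it but does not pass it on". The class `Peer` and its attribute and method names are hypothetical.

```python
class Peer:
    """Hypothetical peer holding both kinds of links."""

    def __init__(self, name):
        self.name = name
        self.documents = set()   # keywords this peer holds itself
        self.search_links = []   # peers this peer forwards queries to
        self.index_links = []    # peers that receive a copy of this peer's index
        self.remote_index = {}   # keyword -> names of peers indexed here

    def publish(self, keyword):
        """Add local content and push the index entry over index links."""
        self.documents.add(keyword)
        for holder in self.index_links:
            # Non-forwarding: the holder stores the entry and stops there.
            holder.remote_index.setdefault(keyword, set()).add(self.name)

    def answer(self, keyword):
        """Answer for this peer and for every peer it holds an index for."""
        hits = set(self.remote_index.get(keyword, set()))
        if keyword in self.documents:
            hits.add(self.name)
        return hits

a, b = Peer("a"), Peer("b")
a.index_links.append(b)   # a sends its index to b
a.publish("jazz")
print(b.answer("jazz"))   # b can answer on behalf of a
```

Under these assumptions, a query never has to reach `a` for `a`'s content to be found, which is what lets index links reduce the number of nodes that process a query.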
Parallel Search Clusters • Searches remain inside cluster • Index links provide full coverage
Parallel Search Clusters • Assumptions • Keep network essentially unstructured (keyword searching, fault tolerance) • Search rate is high • Update rate is low • Limit the number of nodes that process each query • Provide full (or high) coverage of network • Index links allow some nodes to proxy searches for others
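A minimal sketch of how a search confined to one cluster can still cover the whole network. It assumes each peer sends its index to exactly one peer in every other cluster; the round-robin cluster assignment, the function names, and the sample data are all hypothetical.

```python
def build_clusters(peers, k):
    """Assign peers round-robin to k clusters (hypothetical assignment rule)."""
    return [peers[i::k] for i in range(k)]

def build_indexes(clusters):
    """Map each peer to the set of peers whose content it can answer for."""
    index = {p: {p} for cluster in clusters for p in cluster}
    for ci, cluster in enumerate(clusters):
        for pos, peer in enumerate(cluster):
            for cj, other in enumerate(clusters):
                if ci != cj:
                    # One index link into every other cluster.
                    holder = other[pos % len(other)]
                    index[holder].add(peer)
    return index

def search(cluster, index, content, keyword):
    """Only members of `cluster` process the query."""
    hits = set()
    for node in cluster:
        for owner in index[node]:
            if keyword in content[owner]:
                hits.add(owner)
    return hits

peers = list("abcdefgh")
clusters = build_clusters(peers, 2)   # [a, c, e, g] and [b, d, f, h]
index = build_indexes(clusters)
content = {p: set() for p in peers}
content["a"].add("jazz")
content["h"].add("jazz")

# Four nodes process the query, yet matches in both clusters are found.
print(search(clusters[0], index, content, "jazz"))
```

The cost is on the update side: every change to a peer's content must be pushed over one index link per other cluster, which is why the approach leans on the assumption that the search rate is high and the update rate is low.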
The Challenge • Self-Organizing Parallel Search Clusters • Decentralized • Nodes only know a few neighbors • Dealing with “churn” • Minimal interruption of normal operations
Proposed Solution • Existing clusters split into two new clusters • Advantages • Solves origin problem (start with one cluster) • Clusters split autonomously • Automatic load balancing • Three-phase approach • Color • Replicate Links • Split
Splitting Cluster • Phase 1: Coloring • [Figure: cluster graph with one node marked “!”; Color message, radius = 2]
Splitting Cluster • Phase 2: Replicate Links • [Figure: nodes labeled red or green]
Splitting Cluster • Phase 3: Split • [Figure: links marked X]
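The slides name the three phases but do not spell out their rules, so the following is only one possible reading, sketched under stated assumptions: coloring marks every node within `radius` hops of the initiator red and the rest green; replication copies each node's index (plus the node itself) to a partner of the other color; splitting removes search links between differently colored nodes. All function names, the partner-selection rule, and the sample cluster are hypothetical.

```python
from collections import deque

def color(search_links, initiator, radius):
    """Phase 1 (assumed rule): red within `radius` hops, green elsewhere."""
    dist = {initiator: 0}
    queue = deque([initiator])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # the Color message is not forwarded past the radius
        for nb in search_links[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return {n: ("red" if n in dist else "green") for n in search_links}

def replicate_links(index_held, colors):
    """Phase 2 (assumed rule): each color receives a copy of what the other holds."""
    reds = sorted(n for n in colors if colors[n] == "red")
    greens = sorted(n for n in colors if colors[n] == "green")
    new_index = {n: set(held) for n, held in index_held.items()}
    if not reds or not greens:
        return new_index  # one color is empty, so there is nothing to split
    for side, other in ((reds, greens), (greens, reds)):
        for i, node in enumerate(side):
            partner = other[i % len(other)]  # hypothetical round-robin pairing
            new_index[partner] |= index_held[node] | {node}
    return new_index

def split(search_links, colors):
    """Phase 3: drop every search link that crosses the color boundary."""
    return {n: [nb for nb in nbs if colors[nb] == colors[n]]
            for n, nbs in search_links.items()}

# Hypothetical cluster: a path 0-1-2-3-4-5; "x" and "y" are outside peers.
links = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
held = {0: {"x"}, 1: set(), 2: set(), 3: {"y"}, 4: set(), 5: set()}

colors = color(links, initiator=0, radius=2)   # 0, 1, 2 red; 3, 4, 5 green
held = replicate_links(held, colors)
links = split(links, colors)                   # the 2-3 link is removed
```

In this sketch the order matters: replication runs before the split so that each new cluster already holds an index for everything the old cluster covered, plus the members of its sibling, by the time the search links between them disappear.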
Further Research • Initiating the split • Choosing the radius for coloring phase • Want two clusters of same size • Overloading index links • Dealing with “churn” • Nice nodes • Not-so-nice nodes • Merge operation? • Simulation
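As an offline illustration of the radius question, the sketch below picks the radius that makes the two resulting clusters closest in size. It assumes global knowledge of the cluster's search links, which a real decentralized node would not have, so it is useful only for simulation or analysis; `hop_distances`, `best_radius`, and the sample path are hypothetical.

```python
from collections import deque

def hop_distances(links, start):
    """Breadth-first hop count from `start` to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nb in links[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def best_radius(links, initiator):
    """Radius whose colored set is closest to half of the cluster."""
    dist = hop_distances(links, initiator)
    total = len(links)

    def imbalance(r):
        red = sum(1 for d in dist.values() if d <= r)
        return abs(red - (total - red))

    return min(range(max(dist.values()) + 1), key=imbalance)

# Hypothetical cluster: a path 0-1-2-3-4-5.
links = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
print(best_radius(links, 0))  # initiator at the end of the path
print(best_radius(links, 2))  # initiator near the middle
```

The best radius depends on where the initiator sits in the cluster, which suggests that a fixed radius may split unevenly.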
Bibliography • B. F. Cooper and H. Garcia-Molina. SIL: Modeling and Measuring Scalable Peer-to-peer Search Networks. http://www-db.stanford.edu/~cooperb/pubs/searchnets.pdf, 2003. • B. Yang and H. Garcia-Molina. Improving Search in Peer-to-Peer Networks. http://dbpubs.stanford.edu:8090/pub/2002-28, 2002.