Implicit group messaging in peer-to-peer networks Daniel Cutting, 30th November 2005
Outline. • Explicit and implicit groups • P2P networks • Implicit group messaging • My approach • Addressing • Grouping • Routing • Evaluation
Explicit groups. • Group “Griffin”: Lois, Peter, Stewie, Brian • Group “A-Team”: Homer, Peter
Implicit groups. • Soccer • Soccer AND Argentina • Australia OR Argentina • (Soccer OR Football) AND Argentina
Explicit and implicit groups. • Explicit groups: members are named • Membership managed as a central list or a distributed structure (e.g. multicast group trees) • Members explicitly join (individually or by a coordinator) • Implicit groups: members are described • E.g. “Everyone interested in soccer” • Members don’t join, they just match the description
P2P networks. • Overlay networks of hosts (“peers”) on the Internet • Structured (CAN, Pastry): peers reside in a logical space and are connected in some ordered and consistent way • Unstructured (Gnutella, KaZaA): more ad hoc • P2P commonly used to swap files, but also good for: • Distributed data storage, academic collaboration tools, large multiplayer games, message forums • Want support for messaging groups within these networks (for searches, requests, events, etc.) • Implicit groups are good for these!
P2P networks. [Figure: example message “Hand of God” addressed to the implicit group “(Soccer OR Football) AND Argentina”]
Implicit group messaging. • AIM: Deliver messages from any peer to any implicit group at any time in a P2P network • Assumptions: • Each peer describes itself with attributes (strings indicating capabilities, interests, services, …), e.g. “Soccer”, “Argentina” • Implicit groups are specified as logical expressions of attributes, e.g. “(Soccer OR Football) AND Argentina” • The system delivers messages from a source to all peers matching the expression
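The matching rule above can be sketched in a few lines of Python. This is only an illustration of the semantics, not the distributed delivery mechanism; the nested-tuple expression format and the function name `satisfies` are assumptions made for this example, and the peers are the ones used as examples elsewhere in this talk.

```python
def satisfies(expr, attributes):
    """Return True if a peer's attribute set satisfies a group expression.

    An expression is either an attribute string or a tuple
    ("AND" | "OR", sub-expression, sub-expression, ...).
    This representation is an assumption made for the sketch.
    """
    if isinstance(expr, str):
        return expr in attributes
    op, *operands = expr
    if op == "AND":
        return all(satisfies(e, attributes) for e in operands)
    if op == "OR":
        return any(satisfies(e, attributes) for e in operands)
    raise ValueError(f"unknown operator: {op}")


# Example peers from the talk.
PEERS = {
    "Peter": {"Australia"},
    "Lois": {"Soccer", "Australia"},
    "Brian": {"Soccer"},
    "Homer": {"Football", "Argentina", "Donuts"},
    "Stewie": {"Soccer", "Argentina"},
}

GROUP = ("AND", ("OR", "Soccer", "Football"), "Argentina")

members = sorted(name for name, attrs in PEERS.items() if satisfies(GROUP, attrs))
print(members)  # ['Homer', 'Stewie']
```

Nobody joins the group: membership falls out of each peer's own description at the moment the message is sent.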
My approach. • A fully distributed, structured overlay network • Peers build and maintain a logical Cartesian surface • Each peer resides at a logical address on surface • Each peer owns part of the surface and knows its neighbouring peers • Key features • Addressing: a peer’s address encodes its attributes • Grouping: a group’s description encodes all possible addresses of matching peers • Routing: source uses group description to reactively construct a multicast tree to all possible addresses
But what does it mean?! • Peter {Australia} • Lois {Soccer, Australia} • Brian {Soccer} • Homer {Football, Argentina, Donuts} • Stewie {Soccer, Argentina}
JOINing the network. • New peer calculates its address • Routes a JOIN request to that address from a bootstrap • Peer that currently owns the address partitions its part of the surface • All neighbours are informed • To leave the network, give your parts of the surface to your neighbours
Getting around on the surface. • Each peer knows its neighbours • When given a message with a destination address, pass it on to geometrically nearest neighbour
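The forwarding rule can be sketched as a single greedy step. The sketch assumes each neighbour is summarised by one point on the surface (for instance the centre of the region it owns); that summary, and the names `next_hop` and `neighbours`, are hypothetical and not taken from the talk.

```python
import math


def next_hop(neighbours, destination):
    """Pick the neighbour geometrically nearest to the destination.

    neighbours maps a neighbour's name to an (x, y) point that stands in
    for its part of the surface (an assumption of this sketch).
    """
    return min(neighbours, key=lambda name: math.dist(neighbours[name], destination))


neighbours = {"north": (0.5, 0.75), "east": (0.75, 0.5), "south": (0.5, 0.25)}
print(next_hop(neighbours, (0.9, 0.4)))  # east
```

Each peer repeats the same step, so a message needs only local knowledge of neighbours to move towards its destination address.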
Addressing. • How does a peer determine its location on the surface? • Each peer has a set of attributes (its interests, say) • Encode these into the address of the peer using a Bloom Filter • E.g. {Soccer, Argentina} → 01100 | 01001 = 01101 • Map the address to a part of the surface • Location on surface encodes attributes
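A minimal sketch of the encoding step. A real Bloom filter derives the bits for each attribute by hashing it; the talk does not name the hash functions, so this sketch uses a fixed lookup table holding the 5-bit patterns that appear in the slides' examples. The table and the function name `address` are assumptions for illustration.

```python
WIDTH = 5

# Bit patterns copied from the examples in the slides. A real system would
# compute them with hash functions, which this sketch does not attempt.
ATTRIBUTE_BITS = {
    "Soccer": 0b01100,
    "Argentina": 0b01001,
    "Football": 0b11010,
}


def address(attributes):
    """OR together the bit patterns of all attributes of a peer."""
    bits = 0
    for attribute in attributes:
        bits |= ATTRIBUTE_BITS[attribute]
    return format(bits, f"0{WIDTH}b")


print(address({"Soccer", "Argentina"}))    # 01101
print(address({"Football", "Argentina"}))  # 11011
```

Because the patterns are combined with OR, every attribute a peer has leaves its bits set in the peer's address, which is what the grouping step relies on.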
Addressing. • Map from an address to the surface using a quadtree decomposition • Quadrants called extents • E.g. {Soccer, Argentina} → 01101 • Pad with 0s at the end to complete the last pair of bits • 01 10 1(0) → extents 1, 2, 2 (written 122)
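The quadtree mapping reads the address two bits at a time, each pair choosing one of four extents at the next level down. A small sketch follows; the function name `extent_path` is made up for this example, and how extent numbers map to geometric quadrants is not specified in the slides, so it is left out.

```python
def extent_path(address):
    """Split an address into bit pairs and return the extent chosen at each level."""
    if len(address) % 2:
        address += "0"  # pad with 0 so the last level has a full pair
    return [int(address[i:i + 2], 2) for i in range(0, len(address), 2)]


print(extent_path("01101"))  # [1, 2, 2]
```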
Grouping. • How do we find a group of peers given a description? • E.g. all peers with attributes “Soccer AND Argentina” • Convert to a Bloom Filter, but wildcards (*) replace 0s • {Soccer, Argentina} → 01100 | 01001 = 01101 → *11*1 • So any peer with both attributes must have (at least) the 2nd, 3rd and 5th bits set in their address • The wildcards may match 1s or 0s depending on what other attributes the peer has • *11*1 matches addresses 01101, 11101, 01111, 11111
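The wildcard step can be sketched as follows, assuming addresses and patterns are handled as strings of equal length. The function names are hypothetical.

```python
from itertools import product


def group_pattern(bloom_bits):
    """Turn a Bloom filter bit string into a pattern: 0s become wildcards."""
    return bloom_bits.replace("0", "*")


def pattern_matches(pattern, address):
    """True if every fixed bit in the pattern agrees with the address."""
    return all(p in ("*", a) for p, a in zip(pattern, address))


def expand(pattern):
    """List every concrete address the pattern can match."""
    choices = ["01" if c == "*" else c for c in pattern]
    return ["".join(bits) for bits in product(*choices)]


pattern = group_pattern("01101")
print(pattern)                              # *11*1
print(expand(pattern))                      # ['01101', '01111', '11101', '11111']
print(pattern_matches(pattern, "01100"))    # False: 5th bit is not set
```

Enumerating addresses with `expand` grows exponentially in the number of wildcards, so it is shown here only to confirm the four addresses listed in the slide.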
Grouping. • Need to find extents where the 2nd, 3rd and 5th bits are set • {Soccer, Argentina} → *1 1* 1(*) • ** matches 00, 01, 10, 11 (extents 0, 1, 2, 3) • *1 matches 01 and 11 (extents 1, 3) • 1* matches 10 and 11 (extents 2, 3) • 11 matches just 11 (extent 3)
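Applying the table above level by level gives the candidate extents for a whole pattern. A sketch, assuming the pattern is padded with a wildcard in the same way addresses are padded with a 0; the function name `matching_extents` is made up for this example.

```python
def matching_extents(pattern):
    """For each quadtree level, list the extents (0-3) the pattern allows."""
    if len(pattern) % 2:
        pattern += "*"  # the padded bit is unconstrained
    levels = []
    for i in range(0, len(pattern), 2):
        pair = pattern[i:i + 2]
        levels.append([
            extent for extent in range(4)
            if all(p in ("*", b) for p, b in zip(pair, format(extent, "02b")))
        ])
    return levels


print(matching_extents("*11*1"))  # [[1, 3], [2, 3], [2, 3]]
```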
Grouping. • ORs can be treated as a set of ANDs • E.g. “(Soccer OR Football) AND Argentina” • Equivalent to “(Soccer AND Argentina) OR (Football AND Argentina)” • {Soccer, Argentina} → 01100 | 01001 → *11*1 • {Football, Argentina} → 11010 | 01001 → 11*11 • All peers whose address has either the 2nd, 3rd and 5th bits set or the 1st, 2nd, 4th and 5th bits set are part of this group
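Treating an OR as a set of ANDs means a group becomes a list of patterns, and an address belongs to the group if it matches any one of them. A sketch, assuming the expression has already been rewritten by hand as a list of conjunctions, and reusing the example bit patterns from the slides in place of real hash functions.

```python
WIDTH = 5

# Example bit patterns from the slides, standing in for hash functions.
ATTRIBUTE_BITS = {"Soccer": 0b01100, "Argentina": 0b01001, "Football": 0b11010}


def pattern_for(conjunction):
    """Pattern for one AND term: OR the attribute bits, then wildcard the 0s."""
    bits = 0
    for attribute in conjunction:
        bits |= ATTRIBUTE_BITS[attribute]
    return format(bits, f"0{WIDTH}b").replace("0", "*")


def in_group(address, patterns):
    """An address is in the group if it matches at least one AND term."""
    return any(
        all(p in ("*", a) for p, a in zip(pattern, address))
        for pattern in patterns
    )


# "(Soccer AND Argentina) OR (Football AND Argentina)"
terms = [("Soccer", "Argentina"), ("Football", "Argentina")]
patterns = [pattern_for(term) for term in terms]
print(patterns)                     # ['*11*1', '11*11']
print(in_group("01101", patterns))  # True  ({Soccer, Argentina})
print(in_group("01100", patterns))  # False ({Soccer} only)
```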
Grouping. • {Soccer, Argentina} → *1 1* 1(*) • {Football, Argentina} → 11 *1 1(*) • ** matches 00, 01, 10, 11 (extents 0, 1, 2, 3) • *1 matches 01 and 11 (extents 1, 3) • 1* matches 10 and 11 (extents 2, 3) • 11 matches just 11 (extent 3)
Routing. • A peer wants to send a message to an implicit group • Creates a message: “Got any Hand of God photos?” • Specifies an appropriate implicit group: “Soccer AND Argentina” • Chooses the best neighbour(s) to forward the message • Knows extents yet to be visited (everything initially) • Intersects these with extents matching group description • Clusters what’s left and sends a message towards each cluster
Routing. • If many targets in same direction, only route one copy: i.e. cluster based on their direction • Message splits as it gets closer to clusters since relative angles increase • Clustering threshold angle can be variable • Guarantees delivery
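One way to picture direction-based clustering is a greedy sweep over the bearings of the targets as seen from the current peer. The slides do not specify the clustering algorithm, so the sweep below, the function name `cluster_by_direction`, and the 15 degree threshold are all assumptions. For brevity the sketch also ignores wrap-around at ±180 degrees.

```python
import math


def cluster_by_direction(origin, targets, threshold):
    """Group targets whose bearing from origin is within `threshold` radians
    of the first member of the current cluster (a simple greedy sweep)."""
    bearings = sorted(
        (math.atan2(y - origin[1], x - origin[0]), (x, y)) for x, y in targets
    )
    clusters = []
    start = None
    for angle, point in bearings:
        if clusters and angle - start <= threshold:
            clusters[-1].append(point)
        else:
            clusters.append([point])
            start = angle
    return clusters


targets = [(10, 1), (10, 2), (1, 10)]
threshold = math.radians(15)

# Far from the targets, two of them lie in nearly the same direction.
print(cluster_by_direction((0, 0), targets, threshold))
# [[(10, 1), (10, 2)], [(1, 10)]]

# Closer in, the relative angles grow and the message splits further.
print(cluster_by_direction((9, 0), targets, threshold))
# [[(10, 1)], [(10, 2)], [(1, 10)]]
```

One copy of the message would be sent towards each cluster, so the first call corresponds to two copies and the second to three.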
Evaluation. • Simulation • OMNeT++ implementation simulating campus- and world-scale physical networks • Thousands of peers • In progress • Compare to alternative models • IP multicast flooding (optimised physical routing, but all peers receive all messages) • Centralised server (unfair to some peers/links but only member peers receive messages)
Evaluation. • Metrics • Normalised overall network traffic for messaging • Peer fairness (variance of computation and storage) • Network link fairness (variance of link stress) • Expected results • Flood should have good peer and link fairness, poor total traffic for small implicit groups • Centralised should have poor peer and link fairness, good total traffic for all groups • My approach should have good peer and link fairness, and good total traffic for small groups, poor for large
Evaluation. • Evaluation of basic features • Peer storage fairness • Average cost of unicast routing between two random addresses [Figures: peer fairness; unicast ROUTE cost]
Related work. • Explicit group systems • IP multicast, Usenet (consumers explicitly join channels) • Email (publisher lists recipients by name) • SCRIBE (on Pastry), CAN Multicast, Bayeux (on Tapestry) • Implicit group systems • Khambatti et al.: interest-based communities (but don’t support arbitrary cross-cutting groups) • Interest management (virtual environment updates) • Content-based publish/subscribe (but different semantics)
Content-based publish/subscribe. • Conceptually similar: messages are delivered to implicit groups based on a match at time of publication • Pub/sub: consumers select the type of message they receive. Implicit group messaging: publishers select type of consumer of message • Converse semantics lead to differing expressiveness • Pub/sub good for consumers who need to be notified of specific types of events from any publisher: e.g. GUI components • Implicit group messaging good for publishers who need to reach specific types of consumer: e.g. distributed search engines
Future work. • Very important to get physical network simulations running • Testing with various attribute distributions, higher dimensional surfaces • Random shortcuts through the surface to reduce routing cost (can be inserted when peers JOIN) • Prefixing addresses with bits that place peers on the surface according to their approximate position in the underlying network may improve physical network usage
Attribute distribution. [Figure panels: uniform attribute distribution; Zipf attribute distribution] • Peers with popular attributes: {Soccer, Argentina, Sport}, {Soccer, Argentina, Beer}, {Rugby, Australia, Sport} • Peers with unpopular attributes: {Football, Argentina}
Research plan. • Technical Report awaiting Smart Internet Technology CRC approval • Short Letter expressing basic concepts in next week • Journal paper with network results by end of year • Conference paper with future work early next year • Complete around July
P2P networks. • Messaging in P2P networks is often many-to-many • E.g. any peer can initiate a multicast query to search for files or services • Typically handled by flooding (Gnutella), superpeer registries (KaZaA), plus many other shortcuts. • Some structured networks have multicast capabilities • Peers can subscribe to multicast channels and receive all messages sent to that channel • Need messaging between peers for: • Storing/retrieving data or files • Searching for particular data • Searching for particular kinds of peers
P2P networks. • P2P needs multicast (for searches, requests, events) • Allows a peer to send a message to a group of recipients • The sender often knows the names of the recipients, e.g. when some peers have explicitly requested notification of an event • However, sometimes it doesn’t, e.g. when searching for peers matching some criteria • Such messages are often just flooded through the network, but could be more targeted • The difference is the way multicast groups are defined: explicitly or implicitly
Implicit group messaging. • In an ideal system: • All implicit group members should receive messages • Non-members shouldn’t receive them • Dynamic membership should be supported • Minimal total network load • Fairness across peers/network links