FeedTree: Sharing Web Micronews with Peer-to-Peer Event Notification

This presentation explores FeedTree, an alternative RSS distribution architecture that uses peer-to-peer technology to reduce network load. The architecture enables timely distribution and updates of micronews, while reducing the burden on content providers.

Presentation Transcript


  1. FeedTree: Sharing Web Micronews with Peer-to-Peer Event Notification D. Sandler, A. Mislove, A. Post, P. Druschel Presented by: Andrew Sutton

  2. Contributions • Propose alternative to RSS distribution architecture • Use peer-to-peer technology to reduce network load

  3. RSS Distribution • RSS (Really Simple Syndication) - XML format for publishing micronews • Feed - a source of RSS items • Content Provider - responsible for publishing RSS feeds • Reader/Aggregator - user agent responsible for RSS acquisition and display

  4. RSS Distribution Network • Readers poll content providers • Request RSS files every ~30 minutes • Readers can be online, requesting 24/7

  5. Problems with Distribution • Polling - Requests occur on schedule • Superfluity - Full response per request • Stickiness - RSS traffic persists even if web traffic subsides • 24 Hour Traffic - requests occur all day long

  6. Network Load Example • Updates occur every 30 minutes • Slashdot • Subscribers: > 17,000 • RSS file size: ~15KB • ~11.6GB/Day of RSS data • Difficult to measure accurately • No reliable statistics
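
The daily figure on this slide can be checked with a quick calculation. This is a rough sketch that assumes every subscriber polls exactly once per 30 minutes, receives the full file each time, and that "GB" is counted in binary units; the constant names are illustrative.

```python
# Back-of-the-envelope check of the Slashdot figures quoted on the slide.
SUBSCRIBERS = 17_000        # slide: "> 17,000"
FEED_SIZE_KB = 15           # slide: "~15KB"
POLL_INTERVAL_MIN = 30      # slide: one request every 30 minutes

polls_per_day = 24 * 60 // POLL_INTERVAL_MIN            # requests per reader per day
daily_kb = SUBSCRIBERS * FEED_SIZE_KB * polls_per_day   # total KB served per day
daily_gb = daily_kb / (1024 * 1024)                     # assumption: binary units

print(f"{polls_per_day} polls per reader per day")
print(f"about {daily_gb:.2f} GB of RSS traffic per day")
```

Under these assumptions the result lands close to the ~11.6GB/day quoted above; as the slide notes, real traffic is hard to measure, so this is an estimate only.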

  7. Related Work • Improved Polling • Outsourced Aggregation

  8. Improved Polling • Restrict reader polling via RSS • Use HTTP caching to reduce superfluous responses • Use compression to reduce response size • Delta Encoding • Only transmit what’s changed [RFC 3229] • Seemingly ideal for RSS
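
To make the HTTP caching bullet concrete, here is a minimal simulation of a conditional request: the reader sends back the validator it last saw, and the provider answers 304 with an empty body when nothing changed. Nothing here touches the network; `serve_feed` and `PollingReader` are hypothetical names, and the hash-based ETag is a stand-in for whatever validator a real server would issue.

```python
import hashlib
from typing import Optional, Tuple


def serve_feed(feed_body: bytes, if_none_match: Optional[str]) -> Tuple[int, str, bytes]:
    """Hypothetical provider: reply 304 with no body if the reader's ETag still matches."""
    etag = hashlib.sha256(feed_body).hexdigest()[:16]  # stand-in validator
    if if_none_match == etag:
        return 304, etag, b""
    return 200, etag, feed_body


class PollingReader:
    """Hypothetical reader that remembers the last ETag it was given."""

    def __init__(self) -> None:
        self.etag: Optional[str] = None
        self.cached = b""
        self.bytes_received = 0

    def poll(self, current_feed: bytes) -> int:
        status, etag, body = serve_feed(current_feed, self.etag)
        self.bytes_received += len(body)
        if status == 200:
            self.etag, self.cached = etag, body
        return status


reader = PollingReader()
feed = b"<rss>" + b"x" * 15_000 + b"</rss>"
for _ in range(48):  # one day of 30-minute polls against an unchanged feed
    reader.poll(feed)
print(reader.bytes_received, "body bytes transferred in 48 polls")
```

Only the first poll carries the feed body. The requests themselves still arrive on schedule, so this addresses superfluity but not the polling load itself.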

  9. Outsourced Aggregation • Content Providers supply RPC interface to aggregator • Users' readers query central server instead of providers

  10. Outsourcing Problems • Central aggregator introduces • Single point of failure for readers • Censorship of original content • Modification of original content (e.g., ads) • May not be reliable or trustworthy

  11. FeedTree • Eliminate network/provider load • Use peer-to-peer subscription • Use hybrid push/pull mechanism for timely distribution/update of micronews • Use signed documents to enable trust

  12. FeedTree Architecture

  13. Pastry • Enables Peer-to-Peer networking applications • Self-organizing - nodes added, removed dynamically • Network overlay - efficiently routes messages among participating nodes • Applications: Scribe, SplitStream

  14. Overlay Network • Logical network built on top of actual network • Can define virtual routes between nodes • Common approach for P2P networks

  15. Pastry Network • Based on a circular namespace of node IDs (not tree-oriented) • Routing • Shortest-path based on routing • Non-receivers forward message to next-closest (proximity) node • Routes messages in O(log n) time
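
The O(log n) bound can be made concrete with a small estimate. In the Pastry design, node IDs are read as digits in base 2^b and each routing step resolves roughly one more digit, so the hop count is bounded by about the base-2^b logarithm of the network size. The value b = 4 below is an assumption taken from typical Pastry configurations, not from this slide, and `expected_hops` is an illustrative helper name.

```python
def expected_hops(n_nodes: int, b: int = 4) -> int:
    """Smallest h with (2**b)**h >= n_nodes, i.e. ceil(log base 2**b of n_nodes).

    Integer arithmetic avoids floating-point rounding at exact powers.
    b = 4 (hexadecimal digits) is an assumed configuration.
    """
    hops, reachable = 0, 1
    while reachable < n_nodes:
        reachable *= 2 ** b  # each hop fixes one more base-2**b digit of the target ID
        hops += 1
    return hops


for n in (1_000, 50_000, 1_000_000):
    print(f"{n:>9} nodes -> at most about {expected_hops(n)} hops")
```

The point of the sketch is the growth rate: multiplying the network size by 16 adds roughly one hop under this assumption.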

  16. Scribe • Group Communication and Event Notification • Highly dynamic groups (based on topics) • Uses publish/subscribe model • Allows application-level multicast and anycast • Applications: FeedTree, ???

  17. Scribe Multicast • Subscribing to a topic • Subscriber knows publisher’s node ID • Sends “subscribe” message • Forwarding nodes become parents in the multicast tree (each keeps track of its children) • Notification of event • Events are multicast to all children of the publisher and of forwarders • One multicast tree per topic
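
The tree construction described on this slide can be sketched in a few lines. This is a simplified model, not Scribe's actual API: the overlay route from subscriber to publisher is passed in as a plain list (in a real deployment Pastry would determine it), and `Node`, `subscribe`, and `multicast` are hypothetical names.

```python
from collections import defaultdict


class Node:
    """Overlay node that tracks its children separately for each topic."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.children = defaultdict(set)  # topic -> set of child nodes
        self.received = []                # events seen by this node


def subscribe(route, topic):
    """Walk the route from subscriber toward the publisher, adding parent links.

    The walk stops at the first node that already forwards the topic,
    since that node is already attached to the tree.
    """
    for child, parent in zip(route, route[1:]):
        already_forwarding = bool(parent.children[topic])
        parent.children[topic].add(child)
        if already_forwarding:
            break


def multicast(node, topic, event):
    """Deliver an event to a node, then push it down to that node's children."""
    node.received.append(event)
    for child in node.children[topic]:
        multicast(child, topic, event)


publisher, relay, alice, bob = (Node(n) for n in ("publisher", "relay", "alice", "bob"))
subscribe([alice, relay, publisher], "slashdot")
subscribe([bob, relay, publisher], "slashdot")   # stops at relay, already in the tree
multicast(publisher, "slashdot", "item-1")
print(sorted(child.name for child in relay.children["slashdot"]))
```

In this model the publisher sends one copy to the relay, and the relay fans it out, which is where the reduction in provider load comes from.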

  18. FeedTree Distribution • Subscription • Readers subscribe to a feed (i.e., a Scribe topic) • Publication • Each item is given a timestamp and sequence ID • Document is signed with the publisher’s private key

  19. FeedTree Delivery • Bootstrap Delivery • Signed RSS document is multicast to overlay network • Essentially, a combined subscribe/request operation • Incremental Delivery • Only new items are multicast • If no changes, multicast a “heartbeat”

  20. Missed Deliveries • If reader is missing sequence numbers • Query parent for missing items • Nodes must buffer last n items to make re-delivery more efficient • If items still missing, query publisher
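
The recovery steps on this slide can be sketched as follows. This is an illustrative model under stated assumptions: sequence numbers start at 0 and increase by one per item, each node keeps a bounded buffer of recent items, and `FeedNode` and `fetch_from_publisher` are hypothetical names rather than part of FeedTree's published interface.

```python
from collections import deque


class FeedNode:
    """Reader that detects sequence gaps and repairs them from its parent or the publisher."""

    def __init__(self, parent=None, buffer_size=8):
        self.parent = parent
        self.buffer = deque(maxlen=buffer_size)  # last n (seq, item) pairs
        self.next_seq = 0                        # assumption: numbering starts at 0

    def lookup(self, seq):
        """Return the buffered item with this sequence number, or None."""
        for buffered_seq, item in self.buffer:
            if buffered_seq == seq:
                return item
        return None

    def deliver(self, seq, item, fetch_from_publisher):
        """Accept an item; return the sequence numbers that had to be recovered first."""
        if seq < self.next_seq:
            return []  # duplicate or stale item, nothing to do
        recovered = []
        for missing in range(self.next_seq, seq):
            found = self.parent.lookup(missing) if self.parent else None
            if found is None:
                found = fetch_from_publisher(missing)  # last resort
            self.buffer.append((missing, found))
            recovered.append(missing)
        self.buffer.append((seq, item))
        self.next_seq = seq + 1
        return recovered


archive = {0: "a", 1: "b", 2: "c"}   # hypothetical publisher-side store
parent = FeedNode()
for seq, item in archive.items():
    parent.deliver(seq, item, archive.get)

child = FeedNode(parent=parent)
child.deliver(0, "a", archive.get)
print(child.deliver(2, "c", archive.get))  # item 1 was missed and is recovered first
```

Asking the parent first keeps repair traffic inside the tree; the publisher is contacted only when the parent's buffer no longer holds the item.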

  21. Publisher Delivery Tree

  22. Network Overhead • Assume an RSS feed generating 4KB/hour • Interior node in tree with 16 children forwards < 20B/sec • However… • Unknown how this scales for large providers or large numbers of readers
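
The "< 20B/sec" figure follows directly from the two numbers on the slide. The sketch below assumes 1KB = 1024 bytes and ignores signatures, heartbeats, and overlay protocol overhead, none of which are quantified here.

```python
FEED_BYTES_PER_HOUR = 4 * 1024   # slide: feed generates 4KB/hour
CHILDREN = 16                    # slide: interior node with 16 children

per_child_rate = FEED_BYTES_PER_HOUR / 3600    # bytes/sec sent to one child
forwarding_rate = per_child_rate * CHILDREN    # bytes/sec sent to all children

print(f"{per_child_rate:.2f} B/sec per child")
print(f"{forwarding_rate:.1f} B/sec forwarded in total")
```

The total comes out a little over 18 B/sec for a single feed, consistent with the bound on the slide.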

  23. Implementation • Implemented both publisher/reader software (proxies) • Created testbed website for real distribution of RSS feeds • No substantial experimentation • http://www.feedtree.net

  24. Advantages/Disadvantages • Benefits - lower cost of delivering micronews • (Significantly) reduced provider load • No fear of RSS feeds being “slashdotted” • Differentiated services - different feeds for headlines/full news

  25. Disadvantages • Requires specialized software for publishers/subscribers • P2P denial of service attacks • Malicious nodes may not forward events

  26. Conclusions • End users receive better service than currently possible • Foresee new services based on RSS • Storing every single RSS item published on the internet • Anonymous feeds using anonymizing p2p routing algorithms • Cooperative multicast to distribute realtime media

  27. Evaluation • Good • Appears to be a well-reasoned idea • Developed software to test hypothesis • Good workshop paper • What’s needed for research • More detailed description of protocol • Substantiate claims about performance (i.e., through experiments)

  28. Questions • List four problems with the current RSS feed distribution model. • Which two of these four problems have the largest impact on network load?

  29. Questions • How long does it take Pastry to route a message if there are n nodes in the network? • Suppose Slashdot has 50,000 RSS subscribers through FeedTree. What is the approximate depth of the multicast tree for the Slashdot topic?

  30. Questions • Assume that there are 100,000 FeedTree topics on a Pastry network that all update at 4KB/Hour. An interior node with 16 children will send 20B/sec. Suppose an interior node participates in all feeds. What is the expected output (in B/sec) of this node?