
Publiy+: A Peer-Assisted Publish/Subscribe Service for Timely Dissemination of Bulk Content

Reza Sherafat, Hans-Arno Jacobsen. University of Toronto. ICDCS 2012 – Macau. http://msrg.org/project/publiy


Presentation Transcript


  1. Publiy+: A Peer-Assisted Publish/Subscribe Service for Timely Dissemination of Bulk Content
     Reza Sherafat, Hans-Arno Jacobsen
     University of Toronto
     ICDCS 2012 – Macau
     http://msrg.org/project/publiy

  2. The Publish/Subscribe Model
     • Asynchronous event-driven messaging is widely used in building distributed systems
       • Sensor networks, e.g., traffic monitoring
       • Notification systems, e.g., distribution of news, social networks
       • Other applications, e.g., financial systems, online games
     • Events (publications) are small messages
       • A “change in state” of world objects, but not the entire object “state” itself
       • Event message sizes range from a few bytes to tens or hundreds of KB
     • Pub/Sub allows event consumers to specify their interests using subscriptions and to receive related events asynchronously as they are produced
       • Fast, near real-time delivery
       • Selective delivery: subscription matching semantics
     • Scalability aspects investigated: number of subscribers/subscriptions and publications

  3. Another Dimension of Scalability for Pub/Sub
     • Content size of hundreds of MBs
     • Many application scenarios involving large content can take advantage of the reactive pub/sub model
     • Traditionally, content dissemination is a receiver-initiated process
     • Content Dissemination Networks (CDN): costly, require provisioning
     • P2P file-sharing applications: slow, potentially inefficient
     [Figure labels: Video files; Pictures; Software; File synch.; Social networks; P2P file sharing; Distribution of software updates]

  4. Publiy+ in a Nutshell
     • Publiy is a Java-based pub/sub system developed at the University of Toronto: it supports conventional reliable and multi-path event forwarding
     • Publiy+ brings the benefits of event distribution to the world of content dissemination
     • Design goals: selective delivery, timely delivery, and system scalability w.r.t. publication size
     • Based on a peer-assisted architecture to improve scalability and lower maintenance costs
     • Elements of the system are deployed as part of the infrastructure, but the majority of the effort is contributed by the subscribers themselves
     • Other peer-assisted systems are already deployed for music and video streaming, e.g., Spotify, Skype, etc.

  5. Software Patch Distribution: A Sample Scenario
     • End-user: “I want to get software updates for my browser”
     • Polling is one option
       • Periodically query for updates
       • If updates are available, start to download; otherwise, try later
       • Prone to flash crowd scenarios
     • Pub/Sub is an alternative
       • Clients register subscriptions: name=Firefox; version=3.6; OS=MacOSX
       • When an update is released, all interested clients download it: reactive delivery
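The subscription on this slide (name=Firefox; version=3.6; OS=MacOSX) can be read as a set of attribute/value constraints that a broker checks against each published update. The following is only a minimal sketch of that idea, assuming exact-equality matching; the function names, client names, and the `update` dictionary are hypothetical and are not taken from Publiy+.

```python
from typing import Dict, List

Subscription = Dict[str, str]
Publication = Dict[str, str]


def matches(subscription: Subscription, publication: Publication) -> bool:
    """True if every attribute named in the subscription has the same
    value in the publication (assumed equality-only semantics)."""
    return all(publication.get(key) == value
               for key, value in subscription.items())


def interested_clients(subscriptions: Dict[str, Subscription],
                       publication: Publication) -> List[str]:
    """Return the (sorted) ids of clients whose subscription matches."""
    return sorted(client for client, sub in subscriptions.items()
                  if matches(sub, publication))


# Hypothetical clients; only the first subscription comes from the slide.
subscriptions = {
    "alice": {"name": "Firefox", "version": "3.6", "OS": "MacOSX"},
    "bob": {"name": "Firefox", "version": "3.6", "OS": "Linux"},
    "carol": {"name": "Firefox", "OS": "MacOSX"},
}

# Hypothetical descriptor of a newly released update.
update = {"name": "Firefox", "version": "3.6", "OS": "MacOSX"}

notified = interested_clients(subscriptions, update)
```

In this sketch only the clients returned in `notified` start downloading when the update is released, which is the reactive delivery the slide contrasts with periodic polling.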

  6. Hybrid Architecture
     [Figure labels: Metadata information; Pub/Sub Broker; Control layer; Data layer; Subscribe; Region; Data messages; Clients (publisher/subscriber)]

  7. Control Layer
     [Figure labels: Home broker; Publish; Descriptor; nodes S, A, B, X, Y, Z; sets {X}, {Z,Y}, {A,B}, {A,B,X}]
     Subscriber can also contribute

  8. Data Layer
     [Figure labels: Publisher; Content; Segmentation; Segment i; Block 1, Block 2, …, Block k; Linear coding; Coded blocks; Descriptor; {A,B,…}; Network Transfer; Decode]
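The segmentation step named on this slide can be sketched as splitting the content into segments and each segment into k fixed-size blocks. This is an illustrative sketch only: the defaults of 100 blocks per segment and 10 KB per block come from the evaluation slide, while the function name, the zero padding of the last block, and treating 1 KB as 1024 bytes are assumptions.

```python
from typing import List

BLOCK_SIZE = 10 * 1024       # 10 KB per block (assuming 1 KB = 1024 bytes)
BLOCKS_PER_SEGMENT = 100     # 100 blocks per segment


def segment_content(content: bytes,
                    block_size: int = BLOCK_SIZE,
                    blocks_per_segment: int = BLOCKS_PER_SEGMENT
                    ) -> List[List[bytes]]:
    """Split content into segments, each a list of equal-size blocks.

    The final block is zero-padded (an assumption) so that all blocks
    have the same length, which linear coding over blocks requires.
    """
    segment_size = block_size * blocks_per_segment
    segments = []
    for start in range(0, len(content), segment_size):
        chunk = content[start:start + segment_size]
        blocks = [chunk[b:b + block_size].ljust(block_size, b"\0")
                  for b in range(0, len(chunk), block_size)]
        segments.append(blocks)
    return segments


# Tiny parameters so the structure is easy to inspect.
demo = segment_content(bytes(range(25)), block_size=4, blocks_per_segment=3)
blocks_in_each_segment = [len(seg) for seg in demo]
```

A receiver would need the original content length (for instance, carried in the descriptor) to strip the padding again; that detail is not shown on the slide and is assumed here.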

  9. Advantages of Network Coding
     Streamlines dissemination of blocks
     • Block sizes are small (10 KB in Publiy+)
     • Clients can start to contribute as soon as they have received 1 coded block
     • Management and scheduling of blocks are simplified: without network coding, the overhead is substantial
     • Clients receiving a segment can receive blocks from “any” other node that has “some” of the blocks
     • Coded blocks are equally useful
     [Figure labels: Segment 1; Network transfer; ???]
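Why coded blocks are "equally useful" can be illustrated with random linear coding inside one segment. The slides do not state which finite field Publiy+ uses, so the sketch below assumes GF(2), where a coded block is the XOR of a random subset of the segment's blocks; all function names are hypothetical.

```python
import random
from typing import List, Optional, Tuple

Coded = Tuple[List[int], bytes]  # (coefficient vector, payload)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def combine(coeffs: List[int], blocks: List[bytes]) -> bytes:
    """XOR together the blocks whose coefficient is 1."""
    payload = bytes(len(blocks[0]))
    for c, blk in zip(coeffs, blocks):
        if c:
            payload = xor_bytes(payload, blk)
    return payload


def encode(blocks: List[bytes], rng: random.Random) -> Coded:
    """Source side: a random GF(2) combination of the original blocks."""
    coeffs = [rng.randint(0, 1) for _ in blocks]
    return coeffs, combine(coeffs, blocks)


def recode(held: List[Coded], rng: random.Random) -> Coded:
    """Peer side: mix coded blocks already held, without decoding."""
    picks = [rng.randint(0, 1) for _ in held]
    coeffs = [0] * len(held[0][0])
    for pick, (c, _) in zip(picks, held):
        if pick:
            coeffs = [x ^ y for x, y in zip(coeffs, c)]
    return coeffs, combine(picks, [p for _, p in held])


def decode(coded: List[Coded], k: int) -> Optional[List[bytes]]:
    """Gaussian elimination over GF(2); None until k independent
    coded blocks have been received."""
    rows = [(list(c), p) for c, p in coded]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows))
                      if rows[r][0][col]), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pc, pp = rows[col]
        for r in range(len(rows)):
            if r != col and rows[r][0][col]:
                c, p = rows[r]
                rows[r] = ([x ^ y for x, y in zip(c, pc)], xor_bytes(p, pp))
    return [rows[i][1] for i in range(k)]


rng = random.Random(7)
segment = [bytes([i]) * 8 for i in range(1, 5)]  # toy segment, k = 4
received: List[Coded] = []
decoded = None
while decoded is None:  # keep receiving coded blocks until decodable
    received.append(encode(segment, rng))
    decoded = decode(received, len(segment))
```

The receiver never asks for a specific block: any coded block that is linearly independent of those already held moves it closer to decoding, and `recode` shows how a peer holding only a few coded blocks can already produce new ones for others.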

  10. Dissemination Strategy to Combat Flash Crowds
      • Flash crowds can prolong the dissemination time
      • Traditional client/server designs are easily overwhelmed: more and more servers are needed to handle a traffic surge, which is costly
      • Studies show that even P2P BitTorrent file sharing faces problems [Bharambe2006]: some blocks of a file become rare and delay download completion
      • Reactive delivery using pub/sub is an ultimate flash crowd scenario: all subscribers are already present in the system
      • Coordination done by brokers helps deal with flash crowd scenarios

  11. Segments Dissemination Strategy
      Effective utilization of the source’s bandwidth via delegation
      • First, upload segments to a small number of peers (from all regions) in the PushList
      • Peers also receive similar PushLists and concurrently code/send blocks they receive to each other
      • Once a segment is served by the source, all peers have the entire segment
      • Peers are then responsible for transferring segments to other nodes: this frees up bandwidth at the source
      • Peers continue to send coded blocks within their region
      [Figure labels: Source; Cluster of initial receivers]
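The first step above, choosing a small number of peers from all regions for the PushList, could look roughly like the sketch below. The slides do not say how Publiy+ picks these peers, so uniform random selection per region is an assumption, and the function name, `per_region` parameter, and subscriber ids are hypothetical.

```python
import random
from typing import Dict, List


def build_push_list(subscribers_by_region: Dict[str, List[str]],
                    per_region: int,
                    rng: random.Random) -> List[str]:
    """Pick up to `per_region` initial receivers from every region.

    Random choice is an assumption made for illustration; the point is
    only that every region gets at least one delegate of the source.
    """
    push_list: List[str] = []
    for region in sorted(subscribers_by_region):
        members = subscribers_by_region[region]
        count = min(per_region, len(members))
        push_list.extend(rng.sample(members, count))
    return push_list


# Hypothetical deployment with three regions of different sizes.
regions = {
    "region-1": ["s1", "s2", "s3"],
    "region-2": ["s4", "s5"],
    "region-3": ["s6"],
}
push_list = build_push_list(regions, per_region=2, rng=random.Random(0))
```

Because the source uploads each segment only to this short list, its uplink is spent on a few transfers while the listed peers fan the coded blocks out within their own regions.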

  12. Evaluations
      • Platform: SciNet HPC computing cluster at the University of Toronto, http://www.scinet.utoronto.ca/
        • Each node (broker, source, or subscriber) is deployed on a separate CPU core
        • 2.66 GHz CPUs and Gigabit Ethernet
        • Uplink bandwidth is throttled (100-200 KB/s)
      • In all experiments, system parameters are as follows: the number of blocks per segment is 100 and the block size is 10 KB, giving a segment size of 1 MB
      • Experimental setup
        • 1-5 regions
        • 120, 300, or 1000 subscribers uniformly distributed among regions
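The parameters above imply some simple sizing arithmetic, sketched below. It assumes decimal units (1 MB = 1000 KB), which is what makes 100 blocks of 10 KB equal 1 MB; the function names are hypothetical. The upload time is only the time one uplink needs to send a single full copy, not a prediction of the measured dissemination times.

```python
KB_PER_MB = 1000  # assumption: decimal units, so 100 x 10 KB = 1 MB


def sizing(content_mb: int, block_kb: int = 10,
           blocks_per_segment: int = 100) -> dict:
    """Segment size and segment/block counts for a given content size."""
    segment_kb = block_kb * blocks_per_segment
    content_kb = content_mb * KB_PER_MB
    segments = -(-content_kb // segment_kb)  # ceiling division
    return {
        "segment_kb": segment_kb,
        "segments": segments,
        "blocks": segments * blocks_per_segment,
    }


def single_copy_upload_seconds(content_mb: int,
                               uplink_kb_per_s: int) -> float:
    """Time for one node to upload one full copy at a capped uplink."""
    return content_mb * KB_PER_MB / uplink_kb_per_s


one_source = sizing(100)                    # 100 MB of content
ten_sources_blocks = 10 * one_source["blocks"]
t_fast = single_copy_upload_seconds(100, 200)
t_slow = single_copy_upload_seconds(100, 100)
```

Under these assumptions a 100 MB publication is 100 segments or 10,000 blocks, ten such publications give the 100,000 blocks quoted later in the deck, and a single 100 MB copy takes 500 s at 200 KB/s or 1000 s at 100 KB/s.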

  13. Scalability w.r.t. Number of Subscribers
      Network setup:
      • 300 and 1000 subscribers
      • 1 source publishing 100 MB of content

  14. Contribution of Peers
      [Figure panels: Contribution of the source; Contribution of subscribers]
      • Avg blocks transferred per segment: 136 blocks
      • Avg uploaded blocks per subscriber: 102,000 coded blocks
      Network setup:
      • 1000 subscribers
      • 10 sources each publish 100 MB of content (1 GB in aggregate): 100,000 blocks are published in total

  15. Comparison With BitTorrent
      [Figure annotations: Upon release, all clients start downloading; downloads end within 1300 s]
      Experiment setup:
      • 120 subscribers (uplink bandwidth capped at 200 KB/s)
      • 1 source publishes 100 MB of content

  16. Comparison With BitTorrent
      [Figure annotations: [BT] polling interval of 10 minutes; [BT] downloads end within 1700 s]
      Experiment setup:
      • 120 clients (uplink bandwidth capped at 200 KB/s)
      • 1 source publishes 100 MB of content

  17. Comparison With BitTorrent
      [Figure annotations: polling interval of 2 seconds; [BT] downloads end within 1600 s]
      Experiment setup:
      • 120 clients (uplink bandwidth capped at 200 KB/s)
      • 1 source publishes 100 MB of content

  18. Conclusions
      • Selective and reactive dissemination using the pub/sub-style model is applicable to many application scenarios involving bulk content
      • Publiy+ enables scalable and timely dissemination of large published content using a hybrid coordinated peer-assisted architecture
      • Avoids the high cost and performance bottlenecks of dedicated server farms, e.g., CDN
      • Overcomes the deficiencies of pure P2P systems, e.g., BitTorrent
      • Experimental evaluation results confirm the scalability of the approach and the advantages of using network coding techniques

  19. Thank you!

  20. Traffic Sharing Among Competing Content with Different Popularity
      [Figure labels: 1 TB of data; Most popular; Medium popularity; Least popular]
      Experiment setup:
      • 5 regions and 1000 clients (uplink bandwidth capped at 200 KB/s)
      • 15 sources (3 in each region) publish 100 MB
      • Content has 1x, 2x, and 3x popularity

  21. Traffic Sharing Among Competing Content with Uniform Popularity
      Experiment setup:
      • 5 regions and 1000 clients (uplink bandwidth capped at 200 KB/s)
      • 15 sources (3 in each region) publish 100 MB with uniform popularity

  22. Content Serving Policy
      Network setup:
      • 300 clients
      • 1 source publishes 100 MB of content

  23. Content Serving Policy
      Network setup:
      • 300 clients
      • 1 source publishes 100 MB of content

  24. Impact of Packet Loss
      Network setup:
      • 300 clients
      • 1 source publishes 100 MB of content

  25. Impact of Source Fanout on Dissemination Time
      Network setup:
      • 300 clients
      • 1 source publishes 100 MB of content

  26. Effectiveness of Traffic Shaping
      [Figure labels: Cross-regional traffic; Regional traffic]
      Experiment setup:
      • 5 regions and 1000 clients (uplink bandwidth capped at 200 KB/s)
      • 1 source publishes 100 MB
