
Curing Discontent in Online Content Acquisition

Curing Discontent in Online Content Acquisition. Nishanth Sastry, King’s College London. http://www.watfordobserver.co.uk/nostalgia/memories/10099510.Coronation_treat_as_community_gathers_around_the_only_TV/. Early use of mass media.



Presentation Transcript


  1. Curing Discontent in Online Content Acquisition Nishanth Sastry King’s College London

  2. http://www.watfordobserver.co.uk/nostalgia/memories/10099510.Coronation_treat_as_community_gathers_around_the_only_TV/ Early use of mass media. Picture from the TV broadcast of the Coronation of Elizabeth II in 1953, Watford

  3. Today’s “TV” viewing With Digital Media Convergence, TV is just another video app, accessed on-demand on the Web

  4. What changed: Push → Pull. Generalizes to other mass media as well • Superficially: audience to TV set ratio has decreased • At a fundamental level: • Audience per “broadcast” is lower • “Broadcast” time is chosen by the consumer • Traditional mass media pushed content to the consumer • Current dominant model has changed to pull

  5. Implications of the pull model • Traditionally, “editors” decided what content got pushed when • Linear TV schedulers use complex analytics to decide “primetime” • Users get more choice with the pull model • When to consume • What to consume (from large catalogue) • Unpopular/niche interest content also gets a distribution channel, not just what editors decide to showcase/bless as “publishable” • Cheaper to stream over the Web to a single user than to broadcast (e.g. to operate/maintain equipment like high power TV transmitters) • BUT: Cost of broadcast can be amortized across millions of consumers • Could be cheaper per user to broadcast than to stream
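The amortization argument on slide 5 can be illustrated with a small break-even sketch. The cost figures are made-up placeholders in arbitrary units, not numbers from the talk, and the helper names `broadcast_cost_per_user` and `break_even_audience` are hypothetical.

```python
import math


def broadcast_cost_per_user(fixed_broadcast_cost: float, audience: int) -> float:
    """Fixed cost of one broadcast, shared equally among all viewers."""
    if audience <= 0:
        raise ValueError("audience must be positive")
    return fixed_broadcast_cost / audience


def break_even_audience(fixed_broadcast_cost: float, stream_cost_per_user: float) -> int:
    """Smallest audience at which broadcast costs no more per user than streaming."""
    if stream_cost_per_user <= 0:
        raise ValueError("stream_cost_per_user must be positive")
    return math.ceil(fixed_broadcast_cost / stream_cost_per_user)


# Placeholder values (arbitrary units), chosen only for illustration.
FIXED_BROADCAST_COST = 1000.0
STREAM_COST_PER_USER = 0.5

threshold = break_even_audience(FIXED_BROADCAST_COST, STREAM_COST_PER_USER)
print(f"Broadcast is cheaper per user once the audience reaches {threshold}")
```

Below the threshold, streaming to each viewer individually is cheaper; above it, the shared broadcast wins, which is the trade-off the slide points at.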

  6. Research questions WWW’13 ICWSM’12 ICWSM’13 ASE/IEEE Social Informatics’12 • How does pull model impact delivery infrastructure? • Can additional load of on-demand pulls be reduced by reusing scheduled pushes? • How do users make use of flexibility afforded to them? • Were/are editors good at predicting popularity? • Is niche interest/unpopular content important to users? • How do users find unpopular content they like? • Users help each other! • Understanding how and why users share their loves • Designing infrastructure to help users find most influential users for their topics of interest

  7. Data to answer the questions* (*Certain data can be made available upon request) WWW’13 ICWSM’12 ICWSM’13 ASE/IEEE Social Informatics’12 • Nearly 6 million users of BBC iPlayer across the UK • 32.6 million streams, >37K distinct content items • 25% sample of BBC iPlayer access over 2 months • Five years of Vimeo data (Feb’05 – Mar’10) • Goes back to within 3 months of founding date • 443K videos, 2.5 million likes, 200K users, 700K links • All content curation activity: Jan’13 Pinterest (8.5 million users), Dec’12 last.fm (nearly 300K users) • All tweets leading up to London Olympics (1.2 million), Closing Ceremony (~0.5 million), London Fashion Week (168K tweets)

  8. WWW’13 Understanding and decreasing the network footprint of Catch-up TV How does pull model impact delivery infrastructure? Can additional load of on-demand pulls be reduced by reusing scheduled pushes? How do users make use of flexibility afforded to them? Were/are editors good at predicting popularity?

  9. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 • BBC proposes, consumer disposes! • Serials: ~50% of content corpus; 80% of watched content! What users prefer to watch-I

  10. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 What users prefer to watch-II

  11. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 High preference for 30 and 60 min shows Abandoned What users prefer to watch-III

  12. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 On-demand spreads load over time Linear TV schedulers seem to do a good job of predicting popularity! Impact of pull on infrastructure

  13. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 • BUT: iPlayer traffic is close to 6% of UK peak traffic • Second only to YouTube in traffic footprint • Compare to adult video, a traditional heavy hitter. Most popular adult video streaming sites have <0.2% traffic share • BUT: amortized per-user, broadcast is greener than streaming* (using Baliga et al.’s energy model for the Internet) *All channels except BBC Parliament, which has few viewers. On-demand is more suited to web/pull than linear TV. Still, can we decrease its footprint, please?

  14. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 • DVRs have >50% penetration in US, UK • Many (e.g. YouView) don’t need cable • Could also use TV tuner and record on laptop Yes, we can! But people don’t always remember to record

  15. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 Speculative Content Offloading and Recording Engine Can we help users record what they want to watch?

  16. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 • Predict using user affinity for • Episodes of same programme • Favourite genres • We can optimise for decreasing traffic or carbon footprint • Decreasing carbon decreases traffic, but not vice versa • Turns out we only take 5-15% hit by focusing on carbon SCORE = predictor + optimiser
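The predictor half of slide 16 can be sketched roughly as below. This is an illustration under assumptions, not the SCORE implementation: the `Show` type, the linear weighting (`w_programme`, `w_genre`) and the fixed recording `budget` are all hypothetical, and the traffic/carbon optimiser is reduced here to simply keeping the top-scoring broadcasts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Show:
    programme: str  # series name shared by all of its episodes
    genre: str


def affinity_score(history: list[Show], candidate: Show,
                   w_programme: float = 0.7, w_genre: float = 0.3) -> float:
    """Weighted share of past views that match the candidate's programme and genre.

    The weights are illustrative assumptions, not values from the talk.
    """
    if not history:
        return 0.0
    programme_share = sum(s.programme == candidate.programme for s in history) / len(history)
    genre_share = sum(s.genre == candidate.genre for s in history) / len(history)
    return w_programme * programme_share + w_genre * genre_share


def pick_recordings(history: list[Show], schedule: list[Show], budget: int) -> list[Show]:
    """Return up to `budget` scheduled broadcasts with the highest non-zero affinity."""
    scored = [(affinity_score(history, show), show) for show in schedule]
    ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: p[0], reverse=True)
    return [show for _, show in ranked[:budget]]


# Toy viewing history and broadcast schedule (invented for illustration).
history = [Show("Soap A", "drama")] * 3 + [Show("Motor D", "factual")]
schedule = [Show("Soap A", "drama"), Show("Quiz B", "entertainment"), Show("Drama C", "drama")]
to_record = pick_recordings(history, schedule, budget=2)
```

With this toy history, the next episode of the followed serial ranks first and a same-genre show second, while the unrelated quiz show is not recorded at all.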

  17. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 • SCORE achieves ~40-60% of the savings achieved by the oracle • Green optimisation saves 40% more energy at the expense of 5% more traffic • Oracle saves: • Up to 97% of traffic • Up to 74% of energy Performance evaluation: compare SCORE against an oracle that knows future requests

  18. Understanding and decreasing the Network Footprint of Catch-up TV-WWW’13 • Indiscriminately recording top n shows can lead to negative energy savings! • Personalised approach necessary, despite popularity of “prime time” content Not all of these savings come from predicting popular content
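A toy energy balance shows how blanket recording can come out negative, as slide 18 warns. The accounting model is an assumption made for illustration (each recording costs `record_cost`, each recorded show that is later watched avoids one stream costing `stream_cost`), and the numbers are placeholders rather than results from the talk.

```python
def net_energy_saving(recorded: set[str], watched: set[str],
                      record_cost: float, stream_cost: float) -> float:
    """Streaming energy avoided by local hits, minus energy spent on all recordings."""
    hits = recorded & watched
    return len(hits) * stream_cost - len(recorded) * record_cost


# Placeholder energy costs in arbitrary units.
RECORD_COST = 1.0
STREAM_COST = 3.0

watched = {"show_a", "niche_x"}                       # what this user actually watches
top_n = {"show_a", "show_b", "show_c", "show_d", "show_e"}  # globally popular shows
personalised = {"show_a", "niche_x"}                  # an ideal per-user prediction

blanket_saving = net_energy_saving(top_n, watched, RECORD_COST, STREAM_COST)
personal_saving = net_energy_saving(personalised, watched, RECORD_COST, STREAM_COST)
```

In this toy case the top-n policy records five shows for a single hit and ends up spending more energy than it saves, while the personalised set saves energy on every recording.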

  19. AAAI ICWSM’12 How To Tell Head From Tail in User-generated Content Corpora WWW’13 ICWSM’12 • Is niche interest/unpopular content important to users? • How do users find unpopular content they like? • Users help each other!

  20. How to tell head from tail in User-generated Content Corpora- AAAI ICWSM’12 The tail is heavy in users, not accesses

  21. How to tell head from tail in User-generated Content Corpora- AAAI ICWSM’12 Like sets of many users are dense in tail items

  22. How to tell head from tail in User-generated Content Corpora- AAAI ICWSM’12 Likers of tail content are geographically more diverse Niche interest content rather than merely unpopular?

  23. How to tell head from tail in User-generated Content Corpora- AAAI ICWSM’12 How do users find tail items? Non-viral access predominates in popular items

  24. AAAI ICWSM’13 Sharing the Loves: Understanding the how and why of online content curation WWW’13 • Is niche interest/unpopular content important to users? • How do users find unpopular content they like? • Users help each other! ICWSM’12 ICWSM’13 Understanding how and why users share their loves Designing infrastructure to help users find most influential users for their topics of interest

  25. AAAI ICWSM’13 Sharing the Loves: Understanding the how and why of online content curation • Data reminder: • All (38 million) Repins, (~20 million) Likes on Pinterest Jan 13 • All (90 million) Loves, (~60 million) Tags on last.fm Dec 12 • Survey respondents: 30 for Pinterest, 270 for last.fm

  26. Sharing the loves: Understanding the how and why of online content curation- AAAI ICWSM’13 Pinterest Last.fm Why people curate content Curation comes up when search stops working – Clay Shirky

  27. Sharing the loves: Understanding the how and why of online content curation- AAAI ICWSM’13 • Pinterest: (30 respondents, multiple answers allowed) • 85% use it as a personal collection or scrapbook • 48% use the site to display their content to others • Last.fm: (279 respondents, multiple answers allowed) • 39% tag tracks for personal classification • 39% tag to create a global classification (genres). • The majority of respondents shared this view (last.fm): • “I find the social aspect more useful and interesting with people I know, rather than developing new interactions based on music taste.” • BUT: one couple met on last.fm, started going to gigs together and are now happily married!! Curation: of personal or social value? Users mostly see it as personal effort, with exceptions

  28. Sharing the loves: Understanding the how and why of online content curation- AAAI ICWSM’13 Despite unsynchronised personal effort, community synchronises on some topics! Strong popularity skew, as in previous highlighting methods

  29. Sharing the loves: Understanding the how and why of online content curation- AAAI ICWSM’13 • Unstructured curation: Actions that simply highlight an item • e.g., love, like, ban, comment, shout • Structured Curation: Actions that also organise item onto user-specific lists • e.g., pinning an item onto a user’s board, • attaching a user’s tag to a track • Characteristics of effective curators: consistency, diversity… Understanding how effective content curation happens

  30. Sharing the loves: Understanding the how and why of online content curation- AAAI ICWSM’13 Structured curation preferred for popularly curated items

  31. Sharing the loves: Understanding the how and why of online content curation- AAAI ICWSM’13 The most important part of a curator’s job is to continually identify new content for their audience -- Rohit Bhargava How to curate: Consistent and regular updates attract followers

  32. Sharing the loves: Understanding the how and why of online content curation- AAAI ICWSM’13 How to curate: Diversity of interests attracts followers
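The transcript does not say how diversity of interests was measured. One common way to quantify it, used here purely as an assumption, is the Shannon entropy of the categories a curator's items fall into; `interest_diversity` is a hypothetical helper name.

```python
import math
from collections import Counter


def interest_diversity(categories: list[str]) -> float:
    """Shannon entropy (in bits) of a curator's category distribution.

    0.0 means every item is in one category; higher values mean
    the curator's activity is spread over more categories.
    """
    if not categories:
        return 0.0
    counts = Counter(categories)
    total = len(categories)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Toy curators (category labels invented for illustration).
focused = ["food", "food", "food", "food"]
varied = ["food", "food", "travel", "travel"]
```

A curator who posts only in one category scores zero, while one who splits evenly across two categories scores one bit.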

  33. IARank: Ranking Users on Twitter in Near Real-time, Based on their Information Amplification Potential ICWSM’13 ASE/IEEE Social Informatics’12 • Effective content curation is a highly demanding task • Consumers still need to find the best “editors” they want • Naturally self-limiting when it comes to high-volume events • Olympics closing ceremony: 400K tweets in just over 3 hours • We can rank the most influential users, e.g., with PageRank • PageRank takes time to converge → ranks can change before it does! • IARank: ranks users by Information Amplification potential • “Buzz” factor: how likely to be retweeted • “Structural advantage”: how good is your immediate neighbourhood • Understanding how and why users share their loves • Designing infrastructure to help users find most influential users for their topics of interest
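The two ingredients named on slide 33 can be combined in a sketch like the one below. This is not the published IARank formula: the definitions of `buzz` and `structural_advantage`, the mixing weight `alpha`, and the `UserStats` fields are all assumptions chosen for illustration. The point it shows is that a score built from per-user counters can be recomputed as tweets arrive, without the iterative convergence PageRank needs.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    name: str
    tweets: int      # tweets posted about the event so far
    retweeted: int   # how many of those tweets were retweeted at least once
    followers: int
    followees: int


def buzz(u: UserStats) -> float:
    """Assumed proxy for 'how likely to be retweeted': share of tweets that got retweeted."""
    return u.retweeted / u.tweets if u.tweets else 0.0


def structural_advantage(u: UserStats) -> float:
    """Assumed proxy for neighbourhood quality: followers relative to all connections."""
    total = u.followers + u.followees
    return u.followers / total if total else 0.0


def amplification_score(u: UserStats, alpha: float = 0.5) -> float:
    """Linear mix of the two factors; `alpha` is an illustrative weight."""
    return alpha * buzz(u) + (1 - alpha) * structural_advantage(u)


def rank_users(users: list[UserStats], alpha: float = 0.5) -> list[UserStats]:
    """Sort users from highest to lowest amplification score."""
    return sorted(users, key=lambda u: amplification_score(u, alpha), reverse=True)


# Toy users (all counts invented for illustration).
users = [
    UserStats("a", tweets=10, retweeted=5, followers=100, followees=100),
    UserStats("b", tweets=10, retweeted=1, followers=10, followees=90),
    UserStats("c", tweets=10, retweeted=9, followers=900, followees=100),
]
ranking = [u.name for u in rank_users(users)]
```

Because each score depends only on that user's own counters, updating the ranking after a new tweet or retweet touches one user's entry rather than the whole graph.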

  34. Summary WWW’13 ICWSM’12 ICWSM’13 ASE/IEEE Social Informatics’12 • Characterising on-demand content consumption via 6 million users of BBC iPlayer • If broadcast is efficient, we should find ways to use it! • SCORE: personalised content offloading engine • Is niche interest/unpopular content important to users? • How do users find unpopular content they like? • Users help each other! • Social curation complements search; effective curators are consistent and have diverse interests • Near-instantaneous reranking scheme for high volume content sharing systems like Twitter

  35. Curing Discontent in Online Content Acquisition Nishanth Sastry King’s College London http://www.inf.kcl.ac.uk/staff/nrs
