Ch 5 + Anatomy of the Long Tail (Goel et al., WSDM 2010)

Padmini Srinivasan, Computer Science Department, Department of Management Sciences. http://cs.uiowa.edu/~psriniva, padmini-srinivasan@uiowa.edu. Topics: Compression (Ch 5); Heaps; Zipf’s law; Broder et al., Graph Structure of the Web; Anatomy of the Long Tail (Goel et al., WSDM 2010).

Presentation Transcript


  1. Padmini Srinivasan Computer Science Department Department of Management Sciences http://cs.uiowa.edu/~psriniva padmini-srinivasan@uiowa.edu Ch 5 + Anatomy of the Long Tail (Goel et al., WSDM 2010)

  2. Compression (Ch 5)

  3. Heaps

  4. Zipf’s law

  5. Broder et al., Graph Structure of the Web
     • Note that the exponent is different.
     • Note also the deviation at the low end of the out-degree distribution.
     • Probability that a page has in-degree k is proportional to 1/k^2.
     • The actual exponent is slightly larger than 2.
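As a worked illustration of the in-degree bullet above, the proportionality can be turned into a proper distribution by normalising. This is only a sketch: it assumes the power law holds for every integer k ≥ 1 (the slide does not state the support), and γ is a symbol introduced here for the exponent.

```latex
\Pr[\text{in-degree} = k] \;=\; \frac{k^{-\gamma}}{\zeta(\gamma)},
\qquad \zeta(\gamma) = \sum_{j=1}^{\infty} j^{-\gamma}, \quad k = 1, 2, 3, \dots

% Special case \gamma = 2, using \zeta(2) = \pi^2 / 6:
\Pr[\text{in-degree} = k] \;=\; \frac{6}{\pi^{2}\, k^{2}} \;\approx\; \frac{0.608}{k^{2}}
```

With an exponent slightly larger than 2, the normalising constant ζ(γ) is slightly smaller, so a little more probability mass sits on small in-degrees.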

  6. Infinite-inventory retailers
     • Amazon, Netflix, iTunes music store
     • Long-tail markets
     • Items not available in brick-and-mortar stores account for:
       • 30% of Amazon.com sales
       • 25% for Netflix
     • Success attributed to long-tail markets.
     • Two different hypotheses:
       • The majority prefer popular items and a minority prefer niche items
       • Everyone likes some popular and some niche items
     • Different impact on inventory control. Keeping only mainstream items would:
       • Satisfy most people nearly all the time (first hypothesis)
       • Irritate most people at least some of the time (second hypothesis)
     • Knowing which model works/fits/explains behaviour better is important

  7. Infinite-inventory retailers
     • Two different hypotheses:
       • The majority prefer popular items and a minority prefer niche items
       • Everyone likes some popular and some niche items
     • Different impact on inventory control. Keeping only mainstream items would:
       • Satisfy most people nearly all the time (first hypothesis)
       • Irritate most people at least some of the time (second hypothesis)
     • Knowing which model works better is important
     • Their work supports the second hypothesis.
     • Availability of tail items may also boost sales of ‘head’ items (one-stop shopping convenience)
     • Not just a direct impact on revenue; there are second-order gains in customer satisfaction.

  8. Datasets examined
     • Web queries: stemming applied
     • URLs: restricted to domains (search click data)
     • Browsing: Nielsen data (domains)
     • Data trimming done

  9. Long Tail
     • What is it?
     • A relatively small number of items accounts for a large share of consumption (the old 80-20 rule).
     • Definition: the popularity of an item is the fraction of total consumption fulfilled by that item, e.g. the fraction of checkouts associated with a particular book.
     • Popularity of a movie: number of times it was rated / total number of ratings
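The popularity definition above can be sketched in a few lines of Python. The function names and the toy event list are hypothetical, introduced only for illustration; the slides do not prescribe an implementation.

```python
from collections import Counter

def popularity(consumptions):
    """Map each item to its share of all consumption events."""
    counts = Counter(consumptions)
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()}

def cumulative_share(consumptions, k):
    """Fraction of total consumption covered by the k most popular items."""
    counts = Counter(consumptions)
    total = sum(counts.values())
    return sum(n for _, n in counts.most_common(k)) / total

# Toy data (assumed): one entry per consumption event, e.g. one rating.
events = ["a", "a", "a", "b", "b", "c", "d", "e", "a", "b"]

print(popularity(events)["a"])      # share of events that involve item "a"
print(cumulative_share(events, 2))  # share covered by the two most popular items
```

Plotting `cumulative_share` against k gives the kind of cumulative popularity curve used to show a long tail.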

  10. Two Long-Tail Graphs: Netflix & Yahoo! Music
      • Typical inventory: 3,000 items (Netflix), 50,000 items (Yahoo! Music)
      • Web search: the top 10 web sites account for over 15% of page views
      • The top 10,000 web sites still leave 20% of page views unaccounted for.

  11. More Long Tail Graphs

  12. Eccentric Tastes?
      • An inventory: the k top-ranked (most popular) items
      • Definition: a user is p-percent satisfied if at least p percent of that user's consumption falls within the k top-ranked items.
      • Analysis: what percentage of users are p-percent satisfied?
      • Netflix (k = 3,000): only 11% of users are 100% satisfied; 63% are 90% satisfied
      • Yahoo! Music (k = 50,000): only 5% of users are 100% satisfied; 32% are 90% satisfied
      • With brick-and-mortar inventories, almost no users are completely satisfied.
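A minimal sketch of the p-percent satisfaction measure defined above, assuming each user's history is a list of consumed items and the inventory is the k most frequently consumed items overall. The function names and the toy data are hypothetical.

```python
from collections import Counter

def top_k_items(histories, k):
    """Inventory: the k items with the most consumption events overall."""
    counts = Counter(item for items in histories.values() for item in items)
    return {item for item, _ in counts.most_common(k)}

def fraction_satisfied(histories, k, p):
    """Fraction of users with at least p percent of their consumption in the top-k inventory."""
    inventory = top_k_items(histories, k)
    satisfied = 0
    for items in histories.values():
        covered = sum(1 for item in items if item in inventory)
        if covered * 100 >= p * len(items):
            satisfied += 1
    return satisfied / len(histories)

# Toy data (assumed): user -> list of consumed items.
histories = {
    "u1": ["a", "b"],
    "u2": ["a", "c"],
    "u3": ["a", "b", "d"],
    "u4": ["b", "e", "f", "a"],
}

print(fraction_satisfied(histories, k=2, p=100))  # users fully covered by the top 2 items
print(fraction_satisfied(histories, k=2, p=60))   # users at least 60% covered
```

Sweeping k for a fixed p (90 or 100) produces the satisfaction curves that the slides compare against cumulative popularity.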

  13. Eccentric Tastes? Netflix & Yahoo! Music
      • Upper: 90% satisfaction; lower: 100% satisfaction

  14. Ratings versus Popularity
      • The more obscure an item, the less it is appreciated.
      • So the more aware users are of an item, the more it is appreciated?
      • Studied with movies and music.
      • Relationship between popularity (rank) and rating
      • The value of the tail may be overemphasized if there is disproportionate dissatisfaction or satisfaction.
      • Is there less dissatisfaction/satisfaction at the tail end?

  15. Ratings versus Popularity
      • The pattern is present in the Netflix dataset but not in the music dataset (more obscure songs get even higher ratings).

  16. Ratings versus Popularity
      • Is there less dissatisfaction/satisfaction at the tail end? (Are users disproportionately dissatisfied with the tail end?)
      • 32% of Netflix users and 56% of Yahoo! Music users had at least 10% of their highly rated items in the tail
      • 85% of Netflix users and 91% of Yahoo! Music users rated an item not available in physical stores (original: 89% and 95%, respectively)
      • So the long tail cannot be dismissed
      • Even typical users have a need for tail-end items

  17. Null Hypothesis model
      • Random model
      • Each user decides how many items to consume, consistent with the empirical data: the number of users, the number of items, and the number of items selected/viewed/clicked/rated by each user are all fixed.
      • Each user's items are also selected at random, with probability according to popularity and without replacement.
      • What are the limitations of this null model?
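One way to simulate the null model described above is sketched below. It is an assumption-laden illustration, not the paper's code: popularity is taken as the number of distinct users who consumed an item, each user keeps their observed number of distinct items, and items are drawn one at a time with probability proportional to popularity among the items not yet drawn. The function name and toy data are hypothetical.

```python
import random
from collections import Counter

def null_model_histories(histories, seed=0):
    """Re-draw each user's items by popularity, without replacement."""
    rng = random.Random(seed)
    # Popularity weight: number of users who consumed each item (an assumption).
    counts = Counter(item for items in histories.values() for item in set(items))
    simulated = {}
    for user, items in histories.items():
        n = len(set(items))            # keep the user's observed volume
        pool = list(counts)            # items still available to this user
        weights = [counts[i] for i in pool]
        chosen = []
        for _ in range(n):
            idx = rng.choices(range(len(pool)), weights=weights, k=1)[0]
            chosen.append(pool.pop(idx))   # remove so it cannot be drawn again
            weights.pop(idx)
        simulated[user] = chosen
    return simulated

# Toy data (assumed): user -> list of consumed items.
histories = {
    "u1": ["a", "b"],
    "u2": ["a", "c"],
    "u3": ["a", "b", "d"],
    "u4": ["b", "e", "f", "a"],
}

print(null_model_histories(histories, seed=1))
```

Computing the same satisfaction statistics on the simulated histories and on the real ones shows how much of the observed behaviour popularity alone can explain.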

  18. Null Model: Netflix & Yahoo! Music
      • Upper: 90% satisfaction; lower: 100% satisfaction
      • In the null model, users are much harder to satisfy. E.g., only 14% of users in the null model are 90% satisfied, compared to 64% (movies) with k = 3000.

  19. Implications?
      • Though most users consume tail content part of the time, a sizeable fraction of users prefer head over tail content to a degree that goes beyond the draw of popularity.
      • To compensate, other users draw disproportionately from the tail.

  20. Consumption patterns: Users vs Popularity

  21. Some patterns
      • Moving from k = 3000 to k = 3500 movies increases cumulative popularity by 2 percentage points (87% to 89%), while 90% satisfaction increases by more: 7 points (63% to 70%).
      • Movies that by popularity alone account for only 2% of demand could potentially grow the overall customer base by 7% by attracting newly satisfied users.
      • Searching: moving from 95% to 96% along the tail increases 90% user satisfaction from 80% to 86%

  22. Individual eccentricity: the median popularity rank of the items a user consumed.
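The eccentricity measure above can be sketched as follows, assuming rank 1 is the most popular item and that ties in popularity are broken by item name (a choice made here only to keep the example deterministic). The function names and toy data are hypothetical.

```python
from collections import Counter
from statistics import median

def item_ranks(histories):
    """Rank items by total consumption; rank 1 is the most popular."""
    counts = Counter(item for items in histories.values() for item in items)
    ordered = sorted(counts, key=lambda item: (-counts[item], item))
    return {item: rank for rank, item in enumerate(ordered, start=1)}

def eccentricity(histories):
    """Median popularity rank of each user's consumed items."""
    ranks = item_ranks(histories)
    return {
        user: median(ranks[item] for item in items)
        for user, items in histories.items()
        if items  # skip users with no consumption
    }

# Toy data (assumed): user -> list of consumed items.
histories = {
    "u1": ["a", "b"],
    "u2": ["a", "c"],
    "u3": ["a", "b", "d"],
    "u4": ["b", "e", "f", "a"],
}

print(eccentricity(histories))
```

A larger median rank means the user draws more from the tail; here `u4` would score as the most eccentric user.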

  23. More on eccentricity
      • Are those who are more ‘engaged’ (i.e., consume more) more eccentric?
      • No: the correlation between the two is low at the individual level
      • But there are some observations at the group level

  24. More on eccentricity: web pages (unique URLs)

  25. Theoretical Analysis
      • Independent model
      • Sticky model
      • Winner take all
      • Shared inventory approach

  26. Summary
      • Nice analysis of the long tail
      • Different perspectives combined:
        • Popularity (cumulative and individual)
        • 90% and 100% satisfaction
        • Engagement versus ratings
      • Use of a null model to make predictions and compare
      • Nice graphs
      • The long tail helps in capturing user satisfaction and retention