
A Measurement-driven Analysis of Information Propagation in the Flickr Social Network

A Measurement-driven Analysis of Information Propagation in the Flickr Social Network. WWW09. Presenter: 徐波 (Xu Bo). Contribution: we collect and analyze large-scale traces of information dissemination in the Flickr social network, and analyze the data to answer three key questions about how widely, how quickly, and through what mechanisms information propagates.


Presentation Transcript


  1. A Measurement-driven Analysis of Information Propagation in the Flickr Social Network. WWW09. Presenter: 徐波 (Xu Bo)

  2. Flickr

  3. Contribution • We collect and analyze large-scale traces of information dissemination in the Flickr social network • We analyze the data to answer three key questions: • How widely does information propagate in the social network? • How quickly does information propagate? • What is the role of word-of-mouth exchanges between friends in the overall propagation of information in the network?

  4. Contrary to viral marketing "intuition" • Even popular photos do not spread widely throughout the network • Even popular photos spread slowly through the network • Information exchanged between friends is likely to account for over 50% of all favorite markings, but with a significant delay at each hop.

  5. Dataset • Method • We started with a randomly selected Flickr user and followed all of the friend links in the forward direction in a breadth-first search fashion • Dynamics • We crawled the Flickr social network graph once per day for the period of 104 consecutive days from November 2–December 3, 2006 and February 3–May 18, 2007 • Size • We observed 2.5 million Flickr users and 33 million links, an estimated 25% of the entire Flickr network.
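The crawl method on slide 5 can be illustrated with a minimal sketch. It assumes a hypothetical `get_friends(user)` callback that returns a user's outgoing friend links; the slides do not describe the actual crawler, so the function names and the toy graph below are illustrative only.

```python
from collections import deque

def bfs_crawl(seed, get_friends):
    """Visit every user reachable from `seed` by following outgoing friend links."""
    visited = {seed}
    links = []                      # directed (user, friend) pairs seen so far
    queue = deque([seed])
    while queue:
        user = queue.popleft()
        for friend in get_friends(user):
            links.append((user, friend))
            if friend not in visited:
                visited.add(friend)
                queue.append(friend)
    return visited, links

# Toy directed friend graph (hypothetical data, not from the paper).
toy_graph = {"a": ["b", "c"], "b": ["a"], "c": ["d"], "d": [], "e": ["a"]}
users, links = bfs_crawl("a", lambda u: toy_graph.get(u, []))
```

In the toy graph, user "e" links to "a" but nobody links to "e", so a forward-only crawl never discovers it. This is one reason such a crawl may cover only part of the network.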

  6. Social network topology • Outdegree • Indegree • Relationship • In Flickr, most links are reciprocal; 68% of the links are bidirectional. • Pearson correlation: 0.76
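As a rough illustration of the reciprocity figure on slide 6, the sketch below computes the fraction of directed links whose reverse link also exists. The representation of links as `(source, target)` pairs and the toy data are assumptions for illustration, not the authors' code.

```python
def reciprocity(links):
    """Fraction of directed links (u, v) for which the reverse link (v, u) also exists."""
    link_set = set(links)
    if not link_set:
        return 0.0
    mutual = sum(1 for (u, v) in link_set if (v, u) in link_set)
    return mutual / len(link_set)

# Toy example (hypothetical data): a<->b is mutual, a->c and c->d are one-way.
toy_links = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d")]
toy_reciprocity = reciprocity(toy_links)
```

Here two of the four directed links belong to a mutual pair, so the function returns 0.5.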

  7. Social network topology(2) • Path length • The maximum path length between any two nodes in the network (i.e., diameter) is 27, while the average path length is 5.67. • Clustering coefficient

  8. Picture popularity • The number of views is not strongly correlated with the number of comments (0.13) or the number of fans (0.23). On the other hand, the number of comments pictures receive is highly correlated with the number of fans (0.60).

  9. Question 1 • Question • How widely does information spread in the Flickr social network? Do popular pictures gather fans from different parts of the network or is their popularity limited to a certain region? • Measure • Global popularity or local popularity • Distribution of fans

  10. Local versus global picture popularity • Assumption • We assume that if pictures spread widely throughout the network, then we will see a good match between the local and global hotlists of pictures. • Method • For the test, we randomly picked 250 users (or seed nodes) from the set of 2.5 million users who have favorite-marked at least one photo, and identified the top 100 pictures from the neighborhood of each seed node. We visited the 4-hop neighborhood around each of these seed nodes, based on the final snapshot of the network.
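The local-versus-global comparison on slide 10 could be sketched as follows. This is a minimal sketch under stated assumptions: the overlap measure (share of the global top-n that also appears in the local top-n), the inclusion of the seed in its own neighborhood, and all function names are hypothetical, since the slides do not give the exact metric.

```python
from collections import Counter, deque

def k_hop_neighborhood(graph, seed, k):
    """Users within k hops of `seed` along outgoing links (seed included)."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue                      # do not expand beyond k hops
        for v in graph.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def hotlist(favorites, users, n):
    """Top-n photos by number of favorite markings among `users`."""
    counts = Counter(p for u in users for p in favorites.get(u, ()))
    return [photo for photo, _ in counts.most_common(n)]

def hotlist_overlap(graph, favorites, seed, k, n):
    """Share of the global top-n that also appears in the seed's local top-n."""
    local = set(hotlist(favorites, k_hop_neighborhood(graph, seed, k), n))
    global_top = set(hotlist(favorites, favorites.keys(), n))
    return len(local & global_top) / len(global_top) if global_top else 0.0

# Toy data (hypothetical): photo p3 is globally most popular but far from "a".
graph = {"a": ["b"], "b": ["c"], "c": [], "x": ["y"], "y": []}
favorites = {"a": ["p1"], "b": ["p1", "p2"], "c": ["p2"],
             "x": ["p3"], "y": ["p3"], "z": ["p3"]}
overlap_top1 = hotlist_overlap(graph, favorites, "a", 1, 1)
overlap_top3 = hotlist_overlap(graph, favorites, "a", 2, 3)
```

A low overlap for most seeds would correspond to the "high content locality" the slides describe; the slide's setup corresponds to k = 4 and n = 100.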

  11. Local versus global picture popularity(2)

  12. Distance from fans to picture uploaders • Motivation • High content locality • Results

  13. Distance from fans to picture uploaders(2)

  14. Question 2 • Question • How quickly does information spread through the social network? • Measure • Patterns of growth • Dominant patterns

  15. Patterns of popularity growth

  16. Long-term trends in popularity growth • Question • How does photo popularity evolve over a long period of time? • Which growth pattern is dominant in a time period of a year or longer?

  17. Question 3 • Question • What is the role of word-of-mouth exchanges between friends in the overall propagation of information in the network? • How long after the upload of a photo do fans mark it as a favorite? • Measure • The role of the social network

  18. Dissemination mechanisms • In Flickr, people can find pictures through various mechanisms • Featuring • Search results • Links between content • External links • Social network • We focus on the dissemination of content via social network links in Flickr

  19. Identifying social cascades • Social cascade • Information can travel widely through a social network one hop at a time via word-of-mouth exchanges between friends in the network • Method • We say that user A found a photo P through the social network if and only if there exists a user B who is a friend of A such that: • B also marked P as a favorite, • B included photo P on his favorite list before A included photo P on his favorite list, and • B was a friend of A before A made photo P his favorite.
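The three conditions on slide 19 translate fairly directly into a predicate. The sketch below assumes two hypothetical lookup tables that are not described in the slides: `fav_time[(user, photo)]` gives the time a user favorited a photo, and `friend_since[(a, b)]` gives the time b became a friend of a.

```python
def found_via_social_network(a, photo, fav_time, friend_since):
    """True if some friend B of A favorited `photo` before A did,
    and B was already A's friend when A favorited it."""
    t_a = fav_time.get((a, photo))
    if t_a is None:
        return False                          # A never favorited the photo
    for (user, friend), t_link in friend_since.items():
        if user != a:
            continue
        t_b = fav_time.get((friend, photo))   # condition 1: B favorited P
        if t_b is not None and t_b < t_a and t_link < t_a:
            return True                       # conditions 2 and 3 hold
    return False

# Toy timeline (hypothetical data; times are arbitrary integers).
fav_time = {("bob", "p"): 1, ("dave", "p"): 2, ("carol", "p"): 3, ("alice", "p"): 5}
friend_since = {("alice", "bob"): 2, ("carol", "bob"): 4}

alice_via_social = found_via_social_network("alice", "p", fav_time, friend_since)
carol_via_social = found_via_social_network("carol", "p", fav_time, friend_since)
dave_via_social = found_via_social_network("dave", "p", fav_time, friend_since)
```

In the toy data, alice qualifies because bob was her friend and had favorited the photo before she did. carol does not, because she befriended bob only after favoriting the photo, and dave has no friends at all.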

  20. The role of social cascades

  21. The role of social cascades(2) • Individual pictures vary from this pattern

  22. Peer pressure in photo favorite marking

  23. Time taken for social cascade hops • 35% a week • 50% 60 days • Average delay • 140 days

  24. DISCUSSION • Key observations • Most fans of a given picture are within a few hops of the picture uploader • Pictures spread slowly throughout the social network • Explanation • Two possible explanations of high content locality • One potential explanation of delay in the social cascade

  25. Models of viral marketing • Introduction • It starts with "seeds" of individuals who spread information by infecting their friends, in a similar fashion to the spread of an infectious disease. The expected number of new infections generated by each infected person is called the reproduction rate or R. If R > 1, each person is infecting more than one additional person and the number of infected people will grow exponentially, i.e., viral marketing is a success. When R < 1, initial seeds will quickly burn themselves out after several steps of information spreading.
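The effect of R on slide 25 can be made concrete with a small worked example. It assumes a simple branching-process view, in which each infected person infects R others on average, so the expected number of new infections in generation k is seeds × R^k. The slides do not specify a particular model, so this is only an illustration.

```python
def expected_infections(seeds, r, generations):
    """Expected new infections per generation: seeds * r**k for k = 0..generations."""
    return [seeds * r ** k for k in range(generations + 1)]

growing = expected_infections(10, 2, 3)    # R > 1: each generation doubles
dying = expected_infections(8, 0.5, 3)     # R < 1: each generation halves
```

With R = 2 the expected generation sizes are 10, 20, 40, 80, while with R = 0.5 they shrink to 8, 4, 2, 1, which matches the "burn out" behavior described for R < 1.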

  26. Homophily in social networks • Introduction • People who like each other's pictures tend to become friends and people who are friends tend to like each other's pictures, thereby ensuring that popularity of pictures is localized, even for top popular pictures • Experiment • From a random selection of 150,000 new links, we found that 27,546 or 18% of the links were formed after favorite-marking the other's pictures; in 83% of the cases, A was previously only 2 hops away from B.

  27. One potential explanation of delay in the social cascade • In Flickr, users get a small number of updates about their friends' newly uploaded pictures when they log in. So the rate of information propagation may be limited by the frequency of user logins
