
Decoding the Social Media Genome

Sorin Adam Matei (COM), David Braun (ENVISION), Seungyoon Lee (COM), Lorraine Kisselburgh (COM), Brian Britt (COM). Collaborators: Marc Smith (Connected Action / Microsoft Research / NodeXL); Horia Petrache (Physics, IUPUI); WikiTrust (UC Santa Cruz).



Presentation Transcript


  1. Decoding the Social Media Genome
     • Sorin Adam Matei (COM)
     • David Braun (ENVISION)
     • Seungyoon Lee (COM)
     • Lorraine Kisselburgh (COM)
     • Brian Britt (COM)
     • Collaborators
       • Marc Smith (Connected Action / Microsoft Research / NodeXL)
       • Horia Petrache (Physics, IUPUI)
       • WikiTrust (UC Santa Cruz)

  2. A research opportunity for computational social science projects
     • Wikipedia dataset
     • 2001–2008 editorial histories
     • 17 million articles
     • Over 280 million edits
     • Over 20 million unique editors

  3. Social network representation of the data
     • Nodes: editors (contributors)
     • Edges (links): co-editorial contributions
     • If two editors contributed to the same article, they are linked
     • Gravitational model: the more words contributed, the stronger the link; the longer the time between interventions, the weaker the link
     • Expected size
       • Billions of edges
       • >1 TB of data
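The co-editing network on this slide can be illustrated with a minimal Python sketch. The slides do not give the actual weighting formula, so the gravity-like weight used here (product of the two editors' word counts divided by one plus the gap in days between their interventions) is an assumption, as are the function name `co_edit_edges` and the `(article, editor, words, day)` record layout.

```python
from collections import defaultdict
from itertools import combinations


def co_edit_edges(edits):
    """Build weighted co-editing links from (article, editor, words, day) records.

    Assumed weight per pair of interventions on the same article:
        words_1 * words_2 / (1 + |day_1 - day_2|)
    More words give a stronger link; a longer gap gives a weaker one.
    """
    by_article = defaultdict(list)
    for article, editor, words, day in edits:
        by_article[article].append((editor, words, day))

    weights = defaultdict(float)
    for rows in by_article.values():
        for (e1, w1, d1), (e2, w2, d2) in combinations(rows, 2):
            if e1 == e2:
                continue  # an editor is not linked to themselves
            key = tuple(sorted((e1, e2)))  # undirected edge
            weights[key] += (w1 * w2) / (1.0 + abs(d1 - d2))
    return dict(weights)


# Hypothetical toy data: two articles, three editors.
toy_edits = [
    ("A", "ann", 100, 0),
    ("A", "bob", 50, 0),
    ("A", "cat", 100, 9),
    ("B", "ann", 10, 0),
    ("B", "bob", 10, 1),
]
edges = co_edit_edges(toy_edits)
```

Comparing every pair of interventions within an article is quadratic in the number of edits per article, which is fine for a toy example but would need batching or sampling at the scale quoted on the slide (billions of edges).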

  4. Why the “social media genome”?
     • Does the network have a number of “bases” (i.e., subnetwork structures), similar to DNA?
     • Do these bases recombine to create “genes” (i.e., functional structures)?
     • Are genes organized into larger aggregates?
     • How can we tell where a “gene” starts and where it ends?
     • How can we tell what functions specific “genes” have?

  5. Research agenda
     • Can the complexity of a gigantic network be explained by a relatively simple “network alphabet”? (DNA-like “structure,” “chromosomes”)
     • What are the smallest structural units of this network? (bases, ATGC)
     • Are they restricted to a limited array of topologies?
     • Are these topologies associated with specific roles?
     • Do the roles/topologies combine and recombine at various scales to create more complex structures? (genes)
     • In brief, what is the:
       • Vocabulary
       • Grammar
       • Syntax

  6. Using entropy to measure the complexity of the social media genome
     • As networks get more complex, do they decrease the entropy of the social system?
     • If yes, by how much?
     • How does the complexity/entropy of the social network described by Wikipedia evolve over time?

  7. Wikipedia has become more top-heavy
     • As Wikipedia has added more and more users (13 million at the last count), its core group (<100,000) accounts, percentage-wise, for most of the total work
     • Wikipedia is not an expression of wise crowds, but of a clear and well-oiled ad hoc bureaucracy: an “ad-hocracy”
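One simple way to quantify the "top-heavy" claim is the share of all edits made by the k most active editors. The sketch below is illustrative only: the function name `top_share` and the edit counts are hypothetical, and the slides do not say which concentration measure the authors used.

```python
def top_share(edit_counts, k):
    """Fraction of all edits made by the k most active editors."""
    ranked = sorted(edit_counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:k]) / total


# Hypothetical per-editor edit counts for five editors.
counts = [50, 30, 10, 5, 5]
share_top1 = top_share(counts, 1)  # share of the single most active editor
share_top2 = top_share(counts, 2)  # share of the two most active editors
```

A value close to 1 for a small k relative to the number of editors would correspond to the core-group pattern described on this slide.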

  8. [Figure: intervention entropy (y-axis) plotted against intervention number, in events (x-axis), with an ln(x) reference curve]
     • As articles are edited more, the proportion of people who account for most of the work decreases
     • Orange line: entropy (unevenness) of actual contributions
     • Dotted line: wisdom-of-crowds ceiling, where all contributions would be equal
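The comparison in this figure can be sketched in Python, assuming the entropy in question is Shannon entropy (natural log) over each contributor's share of the work and that the ceiling for n equal contributions is ln(n). Both readings are assumptions drawn from the slide's labels, and the function names are hypothetical.

```python
import math


def contribution_entropy(counts):
    """Shannon entropy (in nats) of the shares implied by per-contributor counts."""
    total = sum(counts)
    shares = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in shares)


def entropy_ceiling(n):
    """Maximum entropy for n contributors: all shares equal to 1/n."""
    return math.log(n)


# Hypothetical article with four contributors, one of them dominant.
uneven = [70, 10, 10, 10]
observed = contribution_entropy(uneven)
ceiling = entropy_ceiling(len(uneven))
gap = ceiling - observed  # larger gap = more uneven contributions
```

Under these assumptions, an article whose contributors all did equal work would sit on the dotted line, and the distance below it measures how concentrated the work is.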
