Decoding the Social Media Genome
Sorin Adam Matei (COM), David Braun (ENVISION), Seungyoon Lee (COM), Lorraine Kisselburgh (COM), Brian Britt (COM)
Collaborators: Marc Smith (Connected Action / Microsoft Research / NodeXL); Horia Petrache, Physics, IUPUI; WikiTrust, UC Santa Cruz
A research opportunity for computational social science projects
• Wikipedia dataset
• 2001–2008 editorial histories
• 17 million articles
• Over 280 million edits
• Over 20 million unique editors
Social network representation of the data
• Nodes: editors (contributors)
• Edges (links): co-editorial contributions
• If two editors contributed to the same article, they are linked
• Gravitational model: the more words contributed, the stronger the link; the longer the time between interventions, the weaker the link
• Expected size
• Billions of edges
• >1 TB of data
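A minimal sketch of how such co-editing links could be weighted. The slides do not give the actual formula, so `gravity_weight`, its `decay` exponent, and the `(article, editor, words, day)` record layout are assumptions made for illustration only: the weight grows with the product of the words two editors contributed and shrinks as the time gap between their edits grows.

```python
from collections import defaultdict
from itertools import combinations


def gravity_weight(words_a, words_b, gap_days, decay=1.0):
    """Hypothetical gravity-style tie strength (not the authors' formula).

    More words on either side -> stronger link;
    longer gap between the two edits -> weaker link.
    """
    return (words_a * words_b) / (1.0 + gap_days) ** decay


def build_coedit_network(edits, decay=1.0):
    """Build an undirected weighted network from edit records.

    edits: iterable of (article, editor, words, day) tuples (assumed layout).
    Returns {(editor_a, editor_b): weight} with editor names sorted in the key.
    """
    by_article = defaultdict(list)
    for article, editor, words, day in edits:
        by_article[article].append((editor, words, day))

    weights = defaultdict(float)
    for revisions in by_article.values():
        # Every pair of edits to the same article contributes to a link.
        for (ed_a, w_a, t_a), (ed_b, w_b, t_b) in combinations(revisions, 2):
            if ed_a == ed_b:
                continue  # an editor is not linked to themselves
            key = tuple(sorted((ed_a, ed_b)))
            weights[key] += gravity_weight(w_a, w_b, abs(t_a - t_b), decay)
    return dict(weights)


# Toy data, invented for the example.
edits = [
    ("Entropy", "alice", 100, 0),
    ("Entropy", "bob", 50, 1),
    ("Entropy", "carol", 50, 9),
    ("Genome", "alice", 10, 0),
    ("Genome", "bob", 10, 0),
]
network = build_coedit_network(edits)
for pair, weight in sorted(network.items()):
    print(pair, round(weight, 2))
```

Pairwise comparison within each article is quadratic in the number of edits, so at the scale described above (billions of edges) a real pipeline would need a distributed or windowed variant; the sketch only shows the weighting idea.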
Why the “social media genome”?
• Does the network have a number of “bases” (i.e., subnetwork structures), similar to DNA?
• Do these bases recombine to create “genes” (i.e., functional structures)?
• Are genes organized into larger aggregates?
• How can we tell where one “gene” starts and another ends?
• How can we tell what functions specific “genes” have?
Research agenda
• Can the complexity of a gigantic network be explained by a relatively simple “network alphabet”? (DNA-like “structure,” “chromosomes”)
• What are the smallest structural units of this network? (bases, ATGC)
• Are they restricted to a limited array of topologies?
• Are these topologies associated with specific roles?
• Do the roles/topologies combine and recombine at various scales to create more complex structures? (genes)
• In brief, what is the:
• Vocabulary
• Grammar
• Syntax
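One possible starting point for the "smallest structural units" question is a census of three-node subgraphs. This is an illustration, not a method stated in the slides: the function name `triad_census` and the toy edge list are hypothetical. In an undirected network, three connected editors can form only two topologies, an open path or a closed triangle, and the sketch counts each.

```python
from itertools import combinations


def triad_census(edges):
    """Count connected 3-node topologies in an undirected graph.

    edges: iterable of (node_a, node_b) pairs.
    Returns {"open": paths of length two, "triangle": closed triads}.
    """
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    open_triads = 0
    triangle_corners = 0
    for center, nbrs in neighbors.items():
        # Each pair of neighbors forms a triad centered on this node.
        for u, v in combinations(nbrs, 2):
            if v in neighbors[u]:
                triangle_corners += 1  # a triangle is seen once per corner
            else:
                open_triads += 1
    return {"open": open_triads, "triangle": triangle_corners // 3}


# Toy co-editing network, invented for the example.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
census = triad_census(edges)
print(census)
```

Comparing such counts against a randomized network with the same degree distribution is a common way to judge whether a topology is over-represented; that comparison is left out here.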
Using entropy to measure complexity of the social media genome
• As networks get more complex, do they decrease the entropy of the social system?
• If yes, by how much?
• How does the complexity/entropy of the social network described by Wikipedia evolve over time?
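A minimal sketch of the entropy measure, assuming it is Shannon entropy (natural log) over each contributor's share of the work; the slides do not spell out the exact definition, and `contribution_entropy`, `max_entropy`, and the sample numbers are hypothetical. Entropy reaches its maximum, ln(n), when all n contributors do equal work, and falls as a few contributors come to dominate.

```python
import math


def contribution_entropy(contributions):
    """Shannon entropy (in nats) of the distribution of work.

    contributions: amounts of work per contributor (e.g. words or edits).
    """
    total = sum(contributions)
    if total <= 0:
        raise ValueError("need at least one positive contribution")
    shares = [c / total for c in contributions if c > 0]
    return -sum(p * math.log(p) for p in shares)


def max_entropy(n_contributors):
    """Ceiling reached when every contributor does an equal share."""
    return math.log(n_contributors)


# Invented examples: an even crowd versus a top-heavy one.
even = [25, 25, 25, 25]
top_heavy = [97, 1, 1, 1]

for label, work in (("even", even), ("top-heavy", top_heavy)):
    h = contribution_entropy(work)
    print(label, round(h, 3), "of a possible", round(max_entropy(len(work)), 3))
```

The gap between the observed entropy and the ln(n) ceiling gives one number for how uneven the contributions are, which can then be tracked as an article accumulates edits.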
Wikipedia has become more top-heavy
• As Wikipedia has added more and more users (13 million at the last count), its core group (<100,000) has come to account for most of the total work, percentage-wise
• Wikipedia is not an expression of wise crowds but of a clear and well-oiled ad hoc bureaucracy: an ad-hocracy
As articles are edited more, the proportion of people who account for most of the work decreases
[Figure: intervention entropy (y-axis) against intervention number, in events (x-axis); plot label: ln(x)]
• Orange line: entropy (unevenness) of actual contributions
• Dotted line: wisdom-of-crowds ceiling, where all contributions would be equal