
Sorin Adam Matei – Communication David Braun – ITaP Research/Envision

Visible Symbiosis: Leveraging the Purdue cyberinfrastructure for studying large-scale knowledge communities. Sorin Adam Matei – Communication; David Braun – ITaP Research/Envision; Seungyoon Lee – Communication; Lorraine Kisselburgh – Communication; Brian Britt – Communication.





Presentation Transcript


  1. Visible Symbiosis: Leveraging the Purdue cyberinfrastructure for studying large-scale knowledge communities
     Sorin Adam Matei – Communication
     David Braun – ITaP Research/Envision
     Seungyoon Lee – Communication
     Lorraine Kisselburgh – Communication
     Brian Britt – Communication

  2. A research opportunity for computational social science projects
     • Wikipedia dataset
         • 2001–2008 editorial histories
         • 17 million articles and over 280 million edits
         • Over 20 million unique editors
     • Wikipedia can help us study:
         • Knowledge emergence and creation
         • Collaboration dynamics
         • The genomics of social media

  3. Decoding the Social Media Genome
     Sorin Adam Matei – COM
     David Braun – ENVISION
     Seungyoon Lee – COM
     Lorraine Kisselburgh – COM
     Brian Britt – COM
     Collaborators:
     • Marc Smith (Connected Action / Microsoft Research / NodeXL)
     • Horia Petrache, Physics, IUPUI
     • WikiTrust, UC Santa Cruz

  4. Parsimony?
     • Can the bewildering complexity of social media datasets be described parsimoniously?
     [Figure: Editorial network for the Neutral Point of View talk page (2001-2008)]

  5. Representing Wikipedia as a network
     • Nodes: editors (contributors)
     • Edges (links): co-editorial contributions
         • If two editors contributed to the same article, they are linked
         • Gravitational model: the more words contributed, the stronger the link; the longer the time between interventions, the weaker the link
     • Expected size
         • Billions of edges
         • >1 TB of data
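The co-editing network described on this slide can be sketched in a few lines of Python. The slides do not give the actual weighting formula, so everything specific here is an assumption made for illustration: the record layout, the `link_weight` function (product of word counts, halved for every `half_life` days between the two edits), and the toy data. It is a minimal sketch of a gravity-style weighting, not the authors' implementation.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical edit records: (article, editor, words_contributed, day_of_edit)
EDITS = [
    ("NPOV", "alice", 120, 0),
    ("NPOV", "bob", 30, 2),
    ("NPOV", "carol", 60, 40),
    ("Entropy", "alice", 10, 5),
    ("Entropy", "bob", 200, 6),
]

def link_weight(words_a, words_b, days_apart, half_life=30.0):
    """Assumed gravity-style weight: more words give a stronger link,
    and a longer gap between the two edits gives a weaker one."""
    decay = 0.5 ** (days_apart / half_life)
    return words_a * words_b * decay

def build_coedit_network(edits, half_life=30.0):
    """Return {(editor_a, editor_b): weight}, summed over shared articles."""
    by_article = defaultdict(list)
    for article, editor, words, day in edits:
        by_article[article].append((editor, words, day))

    weights = defaultdict(float)
    for contributions in by_article.values():
        for (ed_a, w_a, d_a), (ed_b, w_b, d_b) in combinations(contributions, 2):
            if ed_a == ed_b:
                continue  # an editor's repeated edits do not link to themselves
            pair = tuple(sorted((ed_a, ed_b)))
            weights[pair] += link_weight(w_a, w_b, abs(d_a - d_b), half_life)
    return dict(weights)

network = build_coedit_network(EDITS)
for pair, weight in sorted(network.items()):
    print(pair, round(weight, 1))
```

Comparing every pair of edits within an article is quadratic in the number of edits, which is one reason a network of this kind reaches billions of edges at Wikipedia scale and calls for cyberinfrastructure rather than a single machine.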

  6. Why the “social media genome”?
     • Does the network have a number of “bases” (i.e., subnetwork structures), similar to DNA?
     • Do these bases recombine to create “genes” (i.e., functional structures)?
     • Are genes organized into larger aggregates?
     • How can we tell where a “gene” starts and where it ends?
     • How can we tell what functions specific “genes” have?

  7. Research agenda
     • Can the complexity of a gigantic network be explained by a relatively simple “network alphabet”? (DNA-like “structure,” “chromosomes”)
     • What are the smallest structural units of this network? (“bases,” similar to ATGC)
     • Are they restricted to a small set of topologies?
     • Are these topologies associated with specific roles?
     • Do the roles/topologies combine and recombine at various scales to create more complex structures? (“genes”)
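One common way to look for small structural units of the kind asked about above is to count how often each three-node pattern occurs in the network. The sketch below is an illustration only: the slides do not say which subnetwork structures the authors examine, and the function name `triad_census` and the toy edge list are assumptions. It classifies every trio of nodes in an undirected graph by how many of its three possible links are present.

```python
from itertools import combinations

def triad_census(edges):
    """Count 3-node subgraphs by the number of edges present (0, 1, 2 or 3)."""
    nodes = sorted({node for edge in edges for node in edge})
    edge_set = {frozenset(edge) for edge in edges}
    census = {0: 0, 1: 0, 2: 0, 3: 0}
    for trio in combinations(nodes, 3):
        present = sum(frozenset(pair) in edge_set for pair in combinations(trio, 2))
        census[present] += 1
    return census

# Toy co-editing network: a closed triangle (a, b, c) plus one pendant editor d.
toy_edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(triad_census(toy_edges))
```

This brute-force loop inspects every trio and is only practical for toy graphs; a census over billions of edges would need a sampling or distributed approach.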

  8. Using entropy to measure complexity of the social media genome
     • As networks get more complex, do they decrease the entropy of the social system?
         • Entropy: system evenness/diversity/complexity
     • If yes, by how much?
     • How does the complexity/entropy of the social network described by Wikipedia evolve over time?

  9. Wikipedia has become more top heavy
     • As Wikipedia has added more and more users (13 million at the last count), its core group (<100,000) accounts for most of the total work, percentage-wise
     • Wikipedia is not an expression of wise crowds, but of a clear and well-oiled ad hoc bureaucracy: an “ad-hocracy”

  10. [Figure: intervention entropy plotted against intervention number (events); curve labeled ln(x)]
     • As articles are edited more, the proportion of people who account for most of the work decreases
     • Orange line: entropy (unevenness) of actual contributions
     • Dotted line: “wisdom of crowds” ceiling, where all contributions would be equal
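The comparison in this figure can be illustrated with a short calculation. The slides do not state the exact entropy formula, so the sketch below assumes Shannon entropy with natural logarithms over each editor's share of the work, and it assumes the ceiling is the value reached when all contributors have equal shares, which is ln(n) for n contributors. The function names and sample counts are hypothetical.

```python
import math

def contribution_entropy(edit_counts):
    """Shannon entropy (natural log) of editors' shares of the total work."""
    total = sum(edit_counts)
    shares = [count / total for count in edit_counts if count > 0]
    return -sum(p * math.log(p) for p in shares)

def entropy_ceiling(n_editors):
    """Maximum possible entropy: every one of n editors contributes equally."""
    return math.log(n_editors)

even = [25, 25, 25, 25]    # "wisdom of crowds" case: equal contributions
top_heavy = [85, 5, 5, 5]  # one editor does most of the work

for label, counts in (("even", even), ("top heavy", top_heavy)):
    h = contribution_entropy(counts)
    ceiling = entropy_ceiling(len(counts))
    print(f"{label}: entropy={h:.3f}, ceiling={ceiling:.3f}, ratio={h / ceiling:.2f}")
```

The gap between the observed entropy and the ceiling is what indicates concentration: the further the actual value falls below ln(n), the more the work is dominated by a few contributors.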

  11. The emergence of knowledge in large-scale networks
     Seungyoon Lee, Lorraine Kisselburgh, Sorin Adam Matei
     Department of Communication

  12. Research agenda
     • The dynamic processes of knowledge production as a large-scale collaborative effort are embedded within a social community
     • Communicative relationships influence knowledge creation
     • Questions:
         • Is knowledge socially embedded?
         • How do knowledge networks emerge alongside social networks?

  13. Co-evolution of networks
     • Using network theory and large-scale data to extend understanding of how knowledge is created and produced by “crowds”
     • Few opportunities to analyze the processes of knowledge creation by large groups over long periods of time
     • Little research available to understand how social relationships influence the process of knowledge production

  14. Co-evolution of networks
     • Knowledge networks
         • Relationships between articles
         • Semantic ties among knowledge content and concepts
     • Social/communication networks
         • Relationships between contributors
         • Social ties among people
     [Figure: Sample visualization]
     • What are the patterns of emerging relationships, and how are they embedded within each other?

  15. Implications
     • Network science enables us to analyze patterns of behavior in human, biological, and technological systems
     • The structures of knowledge collaboration
     • What factors influence how knowledge is produced in collaborative networks
     • Cyberinfrastructures provide support for processing and visualization of large-scale data, allowing us to analyze both massive and longitudinal data systems to understand and model human behavior
