
The Structure of Scientific Collaboration Networks by M. E. J. Newman

The Structure of Scientific Collaboration Networks by M. E. J. Newman. CMSC 601 Paper Summary Marie desJardins January 27, 2009. Outline. Overview Social networks Scientific collaboration networks Properties Data sets Results Conclusions. Overview.


Presentation Transcript


  1. The Structure of Scientific Collaboration Networks, by M. E. J. Newman
     CMSC 601 Paper Summary
     Marie desJardins
     January 27, 2009

  2. Outline
     • Overview
     • Social networks
     • Scientific collaboration networks
     • Properties
     • Data sets
     • Results
     • Conclusions

  3. Overview
     • Computationally analyzes scientific collaboration networks
     • Uses actual data sets from online archives
     • Findings:
        • small-world property
        • presence of “clustering”
        • power-law distribution of #collaborators, #papers
        • different patterns in different fields

  4. Social Networks
     • Idea: Represent acquaintanceship relationships between individuals
     • Measure graph-theoretic properties
     • Widely studied in social science
     [Figure: example acquaintance graph with nodes Penny, David, Marie, Sergei, Lise, Peter]

  5. Properties of Social Networks
     [Figure: example acquaintance graph with nodes Penny, David, Marie, Sergei, Lise, Peter]
     • Degree (# edges)
        • z(Marie) = 4
        • mean z = 3
        • Degree distribution = [2, 2, 3, 3, 4, 4]
     • Clustering
        • C = probability(i and j are linked | i and k are linked, j and k are linked) = 12/20 = 0.6
     • Degree of separation (path length)
        • average = 1.47
        • random graph: ≈ log N / log z (typically 6)
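The three properties on this slide can be checked with a short sketch. The figure itself did not survive, so the edge list below is an assumption: it is one assignment of edges to the six names that reproduces the slide's degree sequence, clustering value, and average path length, and it may not be the graph the presenter actually drew. The function names are illustrative.

```python
from collections import deque
from itertools import combinations

# Hypothetical edge list (the original figure is lost). This particular
# assignment of names to vertices is an assumption chosen to match the
# numbers quoted on the slide.
EDGES = [
    ("Penny", "David"), ("Penny", "Marie"), ("David", "Marie"),
    ("David", "Sergei"), ("Marie", "Sergei"), ("Marie", "Lise"),
    ("Sergei", "Lise"), ("Sergei", "Peter"), ("Lise", "Peter"),
]

def build_adjacency(edges):
    """Undirected adjacency sets from a list of (a, b) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def clustering(adj):
    """Fraction of connected triples i-k-j that are closed by an i-j edge."""
    closed = triples = 0
    for k, nbrs in adj.items():
        for i, j in combinations(sorted(nbrs), 2):
            triples += 1
            closed += j in adj[i]
    return closed / triples

def bfs_distances(adj, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def average_path_length(adj):
    """Mean shortest-path length over all unordered pairs (graph must be connected)."""
    nodes = sorted(adj)
    total = pairs = 0
    for idx, s in enumerate(nodes):
        dist = bfs_distances(adj, s)
        for t in nodes[idx + 1:]:
            total += dist[t]
            pairs += 1
    return total / pairs

adj = build_adjacency(EDGES)
degrees = sorted(len(nbrs) for nbrs in adj.values())
mean_degree = sum(degrees) / len(degrees)
C = clustering(adj)
L = average_path_length(adj)
print(degrees, mean_degree, C, round(L, 2))
# prints: [2, 2, 3, 3, 4, 4] 3.0 0.6 1.47
```

With this edge list there are 20 connected triples, 12 of which are closed (4 triangles, each counted from its 3 corners), and the 15 node pairs have path lengths summing to 22, which gives 22/15 ≈ 1.47.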

  6. Scientific Collaboration Networks
     • Represent co-authorship relationships
     • Data sets:
        • Biomedical research (MEDLINE)
        • Theoretical physics (Los Alamos e-Print Archive, arXiv)
        • High-energy physics (SPIRES)
        • Computer science (NCSTRL)
     • Papers from 1995–1999
     • 13K to 2M papers
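As a rough illustration of how a co-authorship network can be derived from bibliographic records, the sketch below links every pair of authors who share a paper. The paper list and author names are hypothetical placeholders, not drawn from MEDLINE, arXiv, SPIRES, or NCSTRL, and the function name is illustrative.

```python
from itertools import combinations

# Hypothetical records: each paper is reduced to its list of author names.
papers = [
    ["A. Author", "B. Author", "C. Author"],
    ["A. Author", "D. Author"],
    ["E. Author"],
]

def coauthorship_network(papers):
    """Map each author to the set of distinct people they have co-authored with."""
    collaborators = {}
    for authors in papers:
        unique = sorted(set(authors))
        for name in unique:
            collaborators.setdefault(name, set())
        # Every pair of authors on the same paper gets an undirected edge.
        for a, b in combinations(unique, 2):
            collaborators[a].add(b)
            collaborators[b].add(a)
    return collaborators

network = coauthorship_network(papers)
authorships = sum(len(set(p)) for p in papers)

authors_per_paper = authorships / len(papers)
papers_per_author = authorships / len(network)
mean_collaborators = sum(len(c) for c in network.values()) / len(network)
```

A single-author paper adds a node with no edges, so it counts toward papers per author but not toward collaborators.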

  7. Erdős Number
     • Paul Erdős
        • Famous Hungarian mathematician
        • Published over 1400 papers!
     • Erdős Number = co-authorship distance to Erdős
     • Marie’s Erdős Number = ??

  8. Counting Authors
     • Ambiguity in names (first name vs. first initial vs. all initials)
     • Two counts: all initials vs. first initial only
     • ⇒ Upper/lower bounds on number of authors
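The two counting rules can be sketched as follows. Keying on surname plus all initials tends to split one person who is listed inconsistently into several, so it gives an upper bound; keying on surname plus first initial tends to merge distinct people, so it gives a lower bound. The name records below are hypothetical, and the helper names are illustrative.

```python
# Hypothetical (surname, given-name string) records as they might appear
# in a bibliographic database.
names = [
    ("Newman", "M. E. J."),
    ("Newman", "M."),
    ("Newman", "M. A."),
    ("Smith", "J. A."),
    ("Smith", "J."),
]

def initials(given):
    """Tuple of upper-case initials, e.g. 'M. E. J.' -> ('M', 'E', 'J')."""
    return tuple(part[0].upper() for part in given.replace(".", " ").split())

def count_authors(names):
    """Return (lower_bound, upper_bound) on the number of distinct authors."""
    by_all_initials = {(surname, initials(given)) for surname, given in names}
    by_first_initial = {(surname, initials(given)[:1]) for surname, given in names}
    return len(by_first_initial), len(by_all_initials)

lower, upper = count_authors(names)
print(lower, upper)  # prints: 2 5
```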

  9. General Properties
     • Average number of papers per author: 4
     • Average number of authors per paper: 3
        • Max: 1681!! (SPIRES)
     • Average number of collaborators:
        • Ranges from 4 (high-energy theory) to 173 (SPIRES)
     • Size of largest connected component:
        • Ranges from 60% (CS) to 90% (astrophysics)
     • Amount of clustering:
        • Ranges from 7% (MEDLINE) to 73% (SPIRES)

  10. Degree Distribution
     • Earlier work showed a power-law distribution of degree (a straight line on a log-log plot)
     • Here we see a power-law distribution with an exponential cutoff
     • Conjecture: a result of the limited time window and the limited publication life of scientists
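To see why a pure power law is a straight line on log-log axes while a cutoff bends it downward, the sketch below compares the local log-log slope of the two forms. The functional form k^(-tau) * exp(-k / k_c) is one common way to write a power law with an exponential cutoff and is assumed here; the parameter values TAU and K_C are hypothetical and are not fitted to any of the data sets.

```python
import math

TAU = 2.0   # hypothetical power-law exponent
K_C = 50.0  # hypothetical cutoff scale

def power_law(k):
    """Unnormalized pure power law: k^(-tau)."""
    return k ** -TAU

def power_law_with_cutoff(k):
    """Unnormalized power law with exponential cutoff: k^(-tau) * exp(-k / k_c)."""
    return k ** -TAU * math.exp(-k / K_C)

def loglog_slope(f, k, eps=1e-4):
    """Numerical estimate of d(log f) / d(log k) at degree k."""
    hi, lo = k * (1 + eps), k * (1 - eps)
    return (math.log(f(hi)) - math.log(f(lo))) / (math.log(hi) - math.log(lo))

for k in (1, 10, 100):
    print(k, round(loglog_slope(power_law, k), 3),
          round(loglog_slope(power_law_with_cutoff, k), 3))
```

Under this assumed form the slope of the pure power law stays at -tau for every k, while the slope with the cutoff is -tau - k / k_c, so the curve falls away from the straight line once k approaches k_c.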

  11. Degrees of Separation
     • Average degree of separation ≈ 6
     • “Small world” property: comparable to the distance in a random graph
     • Diameter (max distance) typically around 20 (for the largest connected component)
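The random-graph comparison rests on a simple estimate: if each node has about z neighbors, roughly z^d nodes lie within d steps, so the whole network of N nodes is reached when z^d ≈ N, that is, d ≈ log N / log z. A minimal sketch of that estimate follows; the values of N and z are hypothetical and chosen only to give a round answer.

```python
import math

def random_graph_separation(n, z):
    """Estimate the typical distance log N / log z in a random graph
    with n nodes and mean degree z."""
    if n <= 1 or z <= 1:
        raise ValueError("need n > 1 and z > 1")
    return math.log(n) / math.log(z)

# Hypothetical network: one million nodes, ten neighbors each.
estimate = random_graph_separation(1_000_000, 10)
print(round(estimate, 2))  # prints: 6.0
```

Because the estimate grows only logarithmically with N, multiplying the network size by ten adds just one step at z = 10.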

  12. Summary
     • Scientific collaboration networks
        • Social networks exhibiting interesting structure
        • Lots of available data
     • Key characteristics
        • High clustering
        • Small-world property
        • Power-law distribution of #authors, #papers
        • Properties vary across fields
