A Labeled LDA Approach to the Dynamics of Collaboration

A Labeled LDA Approach to the Dynamics of Collaboration. Nikhil Johri CS 224N. Motivating Questions. What is the value added from academic collaboration? Division of labor? Mixture of individual contributions? New, synergistic ideas? Can we identify different collaboration styles?


Presentation Transcript


  1. A Labeled LDA Approach to the Dynamics of Collaboration Nikhil Johri CS 224N

  2. Motivating Questions
     • What is the value added from academic collaboration?
       • Division of labor?
       • Mixture of individual contributions?
       • New, synergistic ideas?
     • Can we identify different collaboration styles?
       • Synergy between established authors
       • Ideas from newer vs. older authors
       • Advisor + apprentices
     • What are the characteristics of influential collaborations and collaborators?

  3. Dataset
     • ACL (Association for Computational Linguistics) Corpus
       • 16,000+ papers
       • Ranges from 1965 to 2009
     • Collaborations
       • 7,500+ papers with 2 or more authors

  4. Methodology • Labeled LDA • Cosine Similarities • Look for Significant Patterns

  5. Labeled LDA (Ramage et al.) • Variation of Latent Dirichlet Allocation (LDA) • Topics are constrained to be about specific tags associated with the documents • In this case, tags = authors • Result: a probabilistic term ‘signature’ for each author per year
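
As a rough sketch of the constraint described on this slide, the generative story can be written as below. The notation is assumed here and does not come from the slides: $\Lambda_d$ is the set of author labels on paper $d$, $\theta_d$ is that paper's mixture over its labels, $z_{d,n}$ is the label assigned to the $n$-th word $w_{d,n}$, and $\beta_a$ is the term distribution for label $a$, which plays the role of the author's term 'signature'.

```latex
\begin{align*}
\theta_d &\sim \mathrm{Dirichlet}(\alpha) \quad \text{restricted to the labels in } \Lambda_d \\
z_{d,n} &\sim \mathrm{Multinomial}(\theta_d), \qquad z_{d,n} \in \Lambda_d \\
w_{d,n} &\sim \mathrm{Multinomial}(\beta_{z_{d,n}})
\end{align*}
```

Under this reading, each word of a paper can only be attributed to one of that paper's own authors, which is what lets the learned $\beta_a$ be interpreted per author.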

  6. Methodology • Labeled LDA • Cosine Similarities • Look for Significant Patterns

  7. Cosine Similarity
     • cos(Author 1 Term-Signature, Document Term-Vector) = Similarity between author 1 and document
     • cos(Author 2 Term-Signature, Document Term-Vector) = Similarity between author 2 and document
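
The comparison on this slide could be computed roughly as follows. This is a minimal sketch, not the author's code: the sparse dictionary representation, the function name `cosine_similarity`, and the toy terms and weights are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two sparse term vectors (dict: term -> weight)."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for empty vectors
    return dot / (norm_u * norm_v)


# Hypothetical author term-signatures and document term-vector.
author1_signature = {"parsing": 0.6, "grammar": 0.3, "tree": 0.1}
author2_signature = {"translation": 0.7, "alignment": 0.3}
document_vector = {"parsing": 2, "tree": 1, "translation": 1}

sim1 = cosine_similarity(author1_signature, document_vector)
sim2 = cosine_similarity(author2_signature, document_vector)
print(f"author 1 vs. document: {sim1:.3f}")
print(f"author 2 vs. document: {sim2:.3f}")
```

In this toy example the document shares more weight with author 1's signature, so author 1 receives the higher similarity score.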

  8. Methodology • Labeled LDA • Cosine Similarities • Look for Significant Patterns

  9. Sample Results
     • Average established author similarity score to papers
     • Break down by subfield
       • High similarity = more rigid, formal, requires training
       • Low similarity = more flexible, less defined, open to novelty
     [Tables: High Similarity Scores; Low Similarity Scores]
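
The per-subfield breakdown could be produced with a simple grouped average, sketched below. The slides do not say how papers were assigned to subfields or how scores were stored, so the `(subfield, score)` record layout, the function name, and the sample values are hypothetical.

```python
from collections import defaultdict


def average_similarity_by_subfield(records):
    """Average the similarity scores for each subfield.

    `records` is an iterable of (subfield, similarity_score) pairs,
    one per (established author, paper) comparison.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for subfield, score in records:
        totals[subfield] += score
        counts[subfield] += 1
    return {subfield: totals[subfield] / counts[subfield] for subfield in totals}


# Hypothetical scores, not taken from the ACL corpus.
records = [
    ("parsing", 0.8),
    ("parsing", 0.6),
    ("dialogue", 0.3),
    ("dialogue", 0.1),
]

averages = average_similarity_by_subfield(records)
# Rank subfields from highest to lowest average similarity.
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
for subfield, avg in ranked:
    print(f"{subfield}: {avg:.2f}")
```

Sorting the averages gives the two ends of the ranking that the slide labels as high and low similarity subfields.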

  10. Sample Results
      • Identification of ‘hedgehogs’ and ‘foxes’
        • Hedgehogs specialize in a single area
        • Foxes dabble in several areas
      [Tables: Top ‘Fox’ Authors; Top ‘Hedgehog’ Authors]

  11. Conclusion
      • Suggested a system to determine author deviation from previous work on later papers
      • Tested the system on ACL collaborations
      • Presented preliminary results showing:
        • Hedgehog / fox style collaborators
        • Subfields that offer more flexibility for unestablished authors vs. those that require more training
      • Stated a theory of collaboration styles and described how to use the system to identify these
