Topics in Scholarly Networks Ying Ding Indiana University
Features of scholarly networks
• Nodes: journal, author, paper
• Edges have types: co-author, cite, co-word
• Each node has its own topic vector:
  • Author A = (t1, t2, t3, t6)
  • Journal B = (t4, t6, t9, t10)
  • Paper C = (t3, t2, t1, t9)
• These topical features are usually ignored in network analysis
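One way to make the topic vectors above concrete is to store them as sets and measure how much two nodes overlap. The sketch below is only an illustration: the node names and topic IDs come from the slide, while the two edges, their `writes` and `published_in` labels, and the helper names `jaccard` and `annotate_edges` are hypothetical and not part of the original talk.

```python
# Topic vectors from the slide, stored as sets of topic IDs.
topics = {
    "Author A": {"t1", "t2", "t3", "t6"},
    "Journal B": {"t4", "t6", "t9", "t10"},
    "Paper C": {"t3", "t2", "t1", "t9"},
}

# Typed edges as (source, target, edge_type).
# These links and labels are invented for illustration only.
edges = [
    ("Author A", "Paper C", "writes"),
    ("Paper C", "Journal B", "published_in"),
]


def jaccard(a, b):
    """Jaccard overlap of two topic sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def annotate_edges(edges, topics):
    """Attach the shared topics and their Jaccard overlap to every edge."""
    annotated = []
    for source, target, kind in edges:
        shared = sorted(topics[source] & topics[target])
        overlap = jaccard(topics[source], topics[target])
        annotated.append((source, target, kind, shared, overlap))
    return annotated


annotated_edges = annotate_edges(edges, topics)
for source, target, kind, shared, overlap in annotated_edges:
    print(f"{source} -[{kind}]-> {target}: shared={shared}, overlap={overlap:.2f}")
```

With topics attached, an edge is no longer just "connected or not": Author A and Paper C share three topics, whereas Paper C and Journal B share only one.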
Adding topic features to the network analysis
• We could see more:
  • How scholars collaborate at the topic level
  • How scholarly influence travels at the topic level
  • How to rank scholars at the topic level rather than in one whole field or one subject category
How to extract topics
• Many algorithms are available to automate this process
Topic Modeling
• The ACT model calculates the probability distributions of authors, journals, topics and documents simultaneously:
  • P(t|a): the research interests of the given author
  • P(a|t): the most productive authors for the given topic
  • P(t|j): the topic focus of the given journal
  • P(j|t): the most productive journals for the given topic
  • P(t|d): the topic distribution of the given document
  • P(d|t): the papers most related to the given topic
  • P(t|w): the distribution over the extracted topics for the given word
  • P(w|t): the words most related to the given topic
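The paired distributions above, such as P(t|a) and P(a|t), are two views of the same joint distribution. The sketch below is not the ACT model, which estimates these quantities jointly from data. It only shows how one direction relates to the other through Bayes' rule, assuming a prior P(a) is available. The function name `invert_conditional`, the author labels, and all numbers are made up for illustration.

```python
def invert_conditional(p_t_given_a, p_a):
    """Compute P(a|t) from P(t|a) and a prior P(a) using Bayes' rule."""
    # Joint probability P(a, t) = P(t|a) * P(a).
    joint = {
        a: {t: p * p_a[a] for t, p in dist.items()}
        for a, dist in p_t_given_a.items()
    }
    # Marginal P(t) = sum over authors of P(a, t).
    p_t = {}
    for dist in joint.values():
        for t, p in dist.items():
            p_t[t] = p_t.get(t, 0.0) + p
    # Posterior P(a|t) = P(a, t) / P(t); topics with zero mass are skipped.
    p_a_given_t = {}
    for a, dist in joint.items():
        for t, p in dist.items():
            if p_t[t] > 0:
                p_a_given_t.setdefault(t, {})[a] = p / p_t[t]
    return p_a_given_t


# Illustrative numbers only: two authors, two topics.
p_t_given_a = {
    "Author A": {"t1": 0.8, "t2": 0.2},
    "Author B": {"t1": 0.5, "t2": 0.5},
}
p_a = {"Author A": 0.5, "Author B": 0.5}

p_a_given_t = invert_conditional(p_t_given_a, p_a)
for topic, dist in sorted(p_a_given_t.items()):
    ranked = sorted(dist.items(), key=lambda item: item[1], reverse=True)
    print(topic, [(author, round(p, 3)) for author, p in ranked])
```

Sorting P(a|t) for a fixed topic gives the kind of per-topic author list that the slide describes as "the most productive authors for the given topic".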
What can we do with topics?
• Rank scholars by topic: topic-based PageRank
• Mine subgraphs to reveal the nuances of scientific collaboration or endorsement
• Overlay topics and communities
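The slides do not spell out how topic-based PageRank is formulated. One common reading is a personalized (topic-sensitive) PageRank in which the random surfer teleports to nodes in proportion to their weight on a topic, for example P(t|a). The sketch below follows that assumption. The function `topic_pagerank`, the toy citation graph, and the weights are all hypothetical.

```python
def topic_pagerank(out_links, topic_weight, damping=0.85, iterations=100):
    """Personalized PageRank whose teleport vector is biased toward one topic."""
    nodes = sorted(set(out_links) | {v for vs in out_links.values() for v in vs})
    total = sum(topic_weight.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("topic weights must have positive total mass")
    # Teleport distribution: normalized topic weight of each node.
    teleport = {n: topic_weight.get(n, 0.0) / total for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1 - damping) * teleport[n] for n in nodes}
        for u in nodes:
            targets = out_links.get(u, [])
            if targets:
                # Split the damped rank of u evenly over its out-links.
                share = damping * rank[u] / len(targets)
                for v in targets:
                    new_rank[v] += share
            else:
                # Dangling node: hand its rank back via the teleport vector.
                for n in nodes:
                    new_rank[n] += damping * rank[u] * teleport[n]
        rank = new_rank
    return rank


# Toy citation graph (who cites whom); purely illustrative.
citations = {"A": ["B"], "B": ["C"], "C": ["B"], "D": ["C"]}

# Hypothetical per-topic weights, e.g. taken from P(t|a) for each scholar.
weights_data_mining = {"A": 0.1, "B": 0.7, "C": 0.1, "D": 0.1}
weights_semantic_web = {"A": 0.1, "B": 0.1, "C": 0.7, "D": 0.1}

rank_dm = topic_pagerank(citations, weights_data_mining)
rank_sw = topic_pagerank(citations, weights_semantic_web)
print("data mining:", sorted(rank_dm, key=rank_dm.get, reverse=True))
print("semantic web:", sorted(rank_sw, key=rank_sw.get, reverse=True))
```

The same citation graph yields a different ordering for each topic, which is the point of ranking at the topic level rather than over a whole field.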
Diversity Subgraph Mining
• Diverse subgraph between Jiawei Han and James Hendler
• It sketches the various interdisciplinary collaborations that connect the two scholars, including data mining, machine learning, semantic web, and ontology engineering
Topics in scholarly evaluation
[Figure: evaluation approaches arranged along two axes]
• Vertical axis:
  • One dimension (e.g., citation rank)
  • Two dimensions (e.g., h-index, overlaying topics with community)
  • Multiple dimensions (e.g., heterogeneous scholarly networks, subgraph mining, diversity path ranking)
• Horizontal axis: prediction (e.g., time, topic, social media)
Challenges
• Networks become complex: dynamic, semantic (nodes and edges), large-scale
• Scholarly communication becomes diverse: traditional (co-authorship, citation) and nontraditional (downloading, following on Twitter, reading on Mendeley)
• Evaluation becomes meticulous: why is this group more successful than others?
• Facilitating scientific collaboration and innovation: how can indicators, classifications, or science mapping be used to model expertise and enable better scientific networking?
Thanks • Questions & Answers