Recommending Collaborators Using Keywords

SRS 2013: 4th International Workshop on Social Recommender Systems, co-located with WWW 2013, Rio de Janeiro, Brazil

Collaborator Recommendation
[Figure labels: 2012, 2013]
• Collaborator prediction/recommendation: recommend Julia for Alice and the specific topic
  Recommend(Alice, “Probability in Databases”) = {Julia}
• Classic link prediction/recommendation: recommend Julia for Alice
  Recommend(Alice) = {Julia}

Motivation
• A new researcher (such as myself) can benefit from recommended collaborators on a desired topic, chosen according to the social graph
• Important in cross-domain collaborations
• Grant regulations
• A foundation for the more generic context-based people recommendation / context-based link prediction problem: given a source s and a textual context k, recommend/predict target nodes t for a link with context k

Take Home Message
• A new problem variant
  • Define the collaborator recommendation problem
  • Not addressed before in the literature
• Scoring functions
  • Empirical results for several structural, textual and importance-based scoring functions
  • Two large real-world DBLP-based co-authorship networks
• Results
  • A sophisticated hybrid score function based on structural and textual measures outperforms the baseline
  • Our ranking function is effective

Problem Definition
• Author node attributes
  • Profile(v): a bag of words over all of v's publication titles
• Co-authorship edge attributes
  • Label(e): publication title
  • Time(e): publication year
  • Setting(e): publication venue/journal
• A query q = (s, k)
  • s: the source node in the network (the querying author)
  • k: a set of keywords (e.g. desired topic or future publication title)
• A function score(u, q)
  • score(u, q) > score(v, q) → u is more likely than v to form a collaboration with s described by k
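The definitions above can be pictured as a small data model. The following is a minimal sketch, not the authors' implementation: the class names `CoauthorEdge` and `Query`, the helper `build_profiles`, and the sample edges are all hypothetical. It also assumes that a title identifies a publication, so a paper shared across several edges is counted once per author.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CoauthorEdge:
    u: str        # first author
    v: str        # second author
    label: str    # Label(e): publication title
    time: int     # Time(e): publication year
    setting: str  # Setting(e): publication venue or journal


@dataclass(frozen=True)
class Query:
    s: str        # source node (the querying author)
    k: frozenset  # set of keywords


def build_profiles(edges):
    """Profile(v): bag of words over the titles of v's publications.

    Assumption: titles are unique per publication, so they are
    deduplicated per author before counting words.
    """
    titles = {}
    for e in edges:
        for author in (e.u, e.v):
            titles.setdefault(author, set()).add(e.label)
    return {
        author: Counter(w for t in ts for w in t.lower().split())
        for author, ts in titles.items()
    }


# Made-up sample data, for illustration only.
edges = [
    CoauthorEdge("alice", "bob", "Probabilistic Databases", 2010, "VLDB"),
    CoauthorEdge("bob", "julia", "Probability in Databases", 2012, "SIGMOD"),
]
profiles = build_profiles(edges)
query = Query(s="alice", k=frozenset({"probability", "databases"}))
```

Because profiles are derived from edges only, single-author publications would not appear in this sketch.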

Structural Scoring Functions
• Distance variants: Score(u,q) = 1/distance(s,u)
  • Simple distance
  • Weighted by time: weight(e) = 1/log(age(e)), where age(e) = current year – time(e)
  • Weighted by publication frequency: weight(e) = 1/Mutual(e), where Mutual(e) = # of mutual publications for the authors of e
• Adamic-Adar (Social Networks, 2001)
  • Score(u,q) = a weighted sum over the mutual neighbors of s and u
  • Each mutual neighbor v has weight 1/log(N(v)), where N(v) is the number of neighbors of v
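As an illustration of the simple-distance and Adamic-Adar scores listed above, here is a minimal sketch over an adjacency map. The function names and the toy graph are hypothetical, and scoring the source itself (or an unreachable node) as 0 is an assumption made here. The weighted distance variants would replace the breadth-first search with a weighted shortest-path search; how edge weights translate into path length is not spelled out in the slides, so it is left out.

```python
import math
from collections import deque


def distance_score(adj, s, u):
    """Score(u, q) = 1 / distance(s, u), with distance as hop count (BFS)."""
    if s == u:
        return 0.0  # assumption: the source is never recommended to itself
    hops = {s: 0}
    queue = deque([s])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                if nxt == u:
                    return 1.0 / hops[nxt]
                queue.append(nxt)
    return 0.0  # assumption: unreachable nodes score 0


def adamic_adar_score(adj, s, u):
    """Sum of 1 / log(N(v)) over the mutual neighbors v of s and u."""
    if s == u:
        return 0.0
    mutual = adj.get(s, set()) & adj.get(u, set())
    # A mutual neighbor of two distinct nodes has N(v) >= 2, so log(N(v)) > 0.
    return sum(1.0 / math.log(len(adj[v])) for v in mutual)


# Made-up undirected co-authorship graph, for illustration only.
adj = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "julia"},
    "carol": {"alice", "julia", "dave"},
    "julia": {"bob", "carol"},
    "dave": {"carol"},
}
```

In this toy graph Julia and Dave are both two hops from Alice, so simple distance cannot separate them, while Adamic-Adar favors Julia because she shares two neighbors with Alice.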

Textual Scoring Functions
• TF-IDF
  • Score(u,q) = tf-idf(k, profile(u))
• COLLAB (developed in this paper)
  • Step 1: Score(u,q) = a weighted sum over u's publications, considering:
    • Textual score of the publication for k
    • Publication age
    • Publication venue (did s publish in it?)
    • Publication participants (did s publish with them?)
  • Step 2: the unseen-bigrams approach on the results of step 1 (Kleinberg et al., 2007)
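The TF-IDF score above can be sketched as follows. The slides do not say which TF-IDF variant was used, so this assumes one common form: term frequency normalized by profile length, times log(N/df), summed over the query keywords. The function name and sample profiles are hypothetical. COLLAB is not sketched because its weights are not given here.

```python
import math
from collections import Counter


def tfidf_score(keywords, profile, all_profiles):
    """Sum over keywords of tf(w, profile) * idf(w).

    Assumed variant: tf = count / profile length, idf = log(N / df),
    where N is the number of author profiles and df is the number of
    profiles containing w.
    """
    n = len(all_profiles)
    total = sum(profile.values())
    if total == 0:
        return 0.0
    score = 0.0
    for w in keywords:
        df = sum(1 for p in all_profiles if p.get(w, 0) > 0)
        if df == 0:
            continue  # keyword appears in no profile
        tf = profile.get(w, 0) / total
        score += tf * math.log(n / df)
    return score


# Made-up author profiles, for illustration only.
sample_profiles = {
    "julia": Counter({"probability": 1, "databases": 1}),
    "bob": Counter({"databases": 2}),
    "dave": Counter({"graphs": 1}),
}
keywords = {"probability", "databases"}
text_scores = {
    author: tfidf_score(keywords, p, list(sample_profiles.values()))
    for author, p in sample_profiles.items()
}
```

With these profiles Julia ranks above Bob, since "probability" is rarer across authors than "databases", and Dave scores 0.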

Combining Scoring Functions
• Linear Combinations
• Re-ranking
• Borda Normalizations
• SocScore: a linear combination of re-ranking functions:
  • First ranking: structural (e.g. time-based distance, Adamic-Adar)
  • Second ranking: text-based
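To make the idea of combining a structural ranking with a textual ranking concrete, here is a minimal sketch using Borda points and a linear mix. It is an illustration of the general technique, not the definition of SocScore: the weight `alpha`, the tie-breaking rule, the function names, and the sample scores are all assumptions.

```python
def borda_points(scores):
    """Turn raw scores into Borda points: n-1 for the top candidate, 0 for the last.

    Assumption: ties are broken alphabetically by candidate name.
    """
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    n = len(ranked)
    return {c: n - 1 - i for i, c in enumerate(ranked)}


def combine(structural, textual, alpha=0.5):
    """Rank candidates by alpha * structural points + (1 - alpha) * textual points."""
    b_struct = borda_points(structural)
    b_text = borda_points(textual)
    candidates = set(b_struct) | set(b_text)
    combined = {
        c: alpha * b_struct.get(c, 0) + (1 - alpha) * b_text.get(c, 0)
        for c in candidates
    }
    return sorted(combined, key=lambda c: (-combined[c], c))


# Made-up scores, for illustration only.
structural = {"julia": 0.5, "bob": 1.0, "dave": 0.5}
textual = {"julia": 0.75, "bob": 0.41, "dave": 0.0}
ranking = combine(structural, textual)
text_only = combine(structural, textual, alpha=0.0)
```

Converting scores to rank points first puts both signals on the same scale, so a single weight can trade them off without either raw score dominating.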

Results – All Collaborators

Results – Only New Collaborators

Conclusion
• We presented a novel problem definition
• We examined scoring functions and their combinations, and developed an effective function
• Future work:
  • Incorporating abstracts
  • Incorporating machine learning
  • Most promising: the generic context-based link prediction/person recommendation problem

Thank you! Questions?
