Suggesting Friends Using the Implicit Social Graph. Maayan Roth 1 , Assaf Ben-David 1 , David Deutscher 2 , Guy Fisher 1 , Ilan Horn 2 , Ari Leichtberg 2 , Naty Leiser 2 , Yossi Matias 1 , Ron Merom 1 1 Google, Inc., Tel Aviv, Israel 2 Google, Inc., Haifa, Israel SIGKDD 2010
Summarized and presented by Kim Chung Rim, IDS Lab., Seoul National University, 2010. 11. 01.
Contents • Introduction • Problem Definition • Concept Definition • Goal • Various Score Measuring Algorithms • Experiment • Applications • Don’t Forget Bob! • Got the Wrong Bob? • Conclusion & Discussion
Introduction • Group communication is prevalent • 10% of e-mails are sent to more than one recipient, and 4% of e-mails are sent to 5 or more recipients. • Within enterprise domains, 40% of e-mails are sent to more than one recipient, and nearly 10% of e-mails are sent to 5 or more recipients. • User studies show that users tend to communicate repeatedly with the same groups of contacts.
Problem Definition • However, users do not take the time to create and maintain custom contact groups. • Creating groups manually is tedious and time-consuming. • Even when users do create contact groups, the groups are likely to change over time.
Goal • The goals of this paper are to • Introduce the concept of the implicit social graph • Propose a metric that quantifies the interaction between a user and a contact group • Present a friend-suggestion algorithm that assists users in creating custom contact groups • Evaluate the friend-suggestion algorithm • Apply the friend-suggestion algorithm in practical applications.
Concept Definition – Implicit Social Graph [Figure: example hypergraph over nodes v1 to v6] • A directed, weighted hypergraph, in which • each node is an email address • each edge is a set of nodes (a group of contacts) • each edge has a weight and a direction (incoming or outgoing mail)
Concept Definition – Egocentric Network [Figure: example egocentric network] • The hypergraph composed of all the edges leading into or out of a single user node • No friend-of-friend hyperedges are considered • Each hyperedge is called an implicit group
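The two definitions above can be illustrated with a small data structure. The sketch below is only one possible reading: it assumes a hypothetical mail log of `(sender, recipients, timestamp)` tuples and treats everyone on a message other than the user as one implicit group. The function name and the log format are illustrative and are not taken from the paper.

```python
from collections import defaultdict


def build_egocentric_network(user, messages):
    """Collect a user's implicit groups (hyperedges) from a mail log.

    `messages` is a hypothetical list of (sender, recipients, timestamp).
    Returns {frozenset(contacts): {"out": [timestamps], "in": [timestamps]}}.
    """
    groups = defaultdict(lambda: {"out": [], "in": []})
    for sender, recipients, timestamp in messages:
        participants = set(recipients) | {sender}
        if user not in participants:
            continue  # friend-of-friend traffic is not part of the network
        group = frozenset(participants - {user})
        if not group:
            continue  # a note to self involves no contacts
        direction = "out" if sender == user else "in"
        groups[group][direction].append(timestamp)
    return dict(groups)


messages = [
    ("u", ["a", "b"], 10),   # outgoing mail to the group {a, b}
    ("a", ["u", "b"], 12),   # incoming mail, same group {a, b}
    ("a", ["c"], 13),        # does not involve u, so it is ignored
    ("u", ["c"], 15),        # outgoing mail to the group {c}
]
network = build_egocentric_network("u", messages)
```

Keying the dictionary by `frozenset` means the same set of contacts always maps to the same hyperedge, regardless of the order of addresses on a message.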
Concept Definition – Interactions Rank • Interactions Rank • A metric for computing the weight of a hyperedge • The weight has to reflect the following criteria • Frequency • Groups with frequent interactions are more important • Recency • Interactions Rank changes dynamically over time • Direction • Interactions that the user initiates are more significant than interactions that the user does not initiate
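Concept Definition – Interactions Rank • Interactions Rank (IR) • $IR = \omega_{out} \sum_{i \in I_{out}} \left(\frac{1}{2}\right)^{(t_{now} - t(i))/\lambda} + \sum_{i \in I_{in}} \left(\frac{1}{2}\right)^{(t_{now} - t(i))/\lambda}$ • $I_{out}$: the set of outgoing interactions • $I_{in}$: the set of incoming interactions • $t_{now}$: current time • $t(i)$: timestamp of an interaction $i$ • $\lambda$: half-life • $\omega_{out}$: relative importance of outgoing vs. incoming interactions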
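A minimal sketch of how such a weight could be computed, assuming that each interaction's contribution halves with every half-life that has passed and that outgoing interactions are scaled by the outgoing-importance factor. The parameter names and the example values are illustrative only.

```python
def interactions_rank(out_times, in_times, now, half_life, w_out):
    """Decayed interaction weight of one implicit group.

    out_times / in_times: timestamps of outgoing / incoming interactions.
    now, half_life: same time unit as the timestamps.
    w_out: relative importance of outgoing vs. incoming interactions.
    """
    def decayed(times):
        # Each interaction counts 1.0 when fresh and halves every half_life.
        return sum(0.5 ** ((now - t) / half_life) for t in times)

    return w_out * decayed(out_times) + decayed(in_times)


# Two outgoing mails (now and one half-life ago), one incoming mail
# two half-lives ago, with outgoing mail weighted twice as heavily.
example_ir = interactions_rank(
    out_times=[100, 90], in_times=[80], now=100, half_life=10, w_out=2.0
)
# 2.0 * (1 + 0.5) + 0.25 = 3.25
```

The three criteria show up directly: more interactions add more terms (frequency), older interactions contribute less (recency), and `w_out` favours interactions the user initiated (direction).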
Core Routine of Friend Suggest • S: the seed, a small set of contacts • G: the set of contact groups of user u • g: a group in G, a set of contacts with whom u has interactions • F: a set of scores, one per contact [0,1] • Returns the set of scores for the contacts
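The routine can be sketched as a loop over the user's groups that accumulates a score for every contact outside the seed. This is a sketch under assumptions: the groups are given as a mapping from a set of contacts to its IR weight, the scoring rule is passed in as a function, and the scores are left unnormalized. The names `expand_seed` and `seed_overlap_score` are illustrative.

```python
def expand_seed(groups, seed, update_score):
    """Accumulate a score F[c] for every contact c that is not in the seed.

    groups: {frozenset(contacts): ir_weight} for one user (G).
    seed: the contacts already chosen (S).
    update_score: function(contact, seed, group, ir_weight) -> float.
    """
    seed = frozenset(seed)
    scores = {}
    for group, ir_weight in groups.items():
        for contact in group:
            if contact in seed:
                continue
            increment = update_score(contact, seed, group, ir_weight)
            scores[contact] = scores.get(contact, 0.0) + increment
    return scores


def seed_overlap_score(contact, seed, group, ir_weight):
    # Placeholder rule: a group counts only if it shares a member with the seed.
    return ir_weight if seed & group else 0.0


groups = {
    frozenset({"a", "b", "c"}): 3.0,
    frozenset({"a", "d"}): 1.0,
    frozenset({"e", "f"}): 5.0,
}
scores = expand_seed(groups, {"a"}, seed_overlap_score)
# b and c score 3.0, d scores 1.0, e and f score 0.0
```

Passing the scoring rule as a parameter keeps the loop unchanged while different scoring functions are swapped in.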
Scoring Functions – base functions • Intersecting Group Count • Counts the groups that both contain contact c and intersect the seed S • Does not consider the IR value of the groups
Scoring Functions – base functions • Top Contact Score • Sums the IR values of all the implicit groups that contain contact c • Ignores the seed and always suggests the top-ranked contacts
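The two base functions can be written out as follows. This is a sketch, assuming the groups are stored as a mapping from a set of contacts to its IR value; the function signatures and the example weights are illustrative.

```python
def intersecting_group_count(groups, seed, contact):
    """Number of groups that contain `contact` and share a member with the seed."""
    seed = set(seed)
    return sum(1 for group in groups if contact in group and seed & group)


def top_contact_score(groups, seed, contact):
    """Sum of IR over every group containing `contact`; the seed is ignored."""
    return sum(ir for group, ir in groups.items() if contact in group)


groups = {
    frozenset({"a", "b", "c"}): 3.0,
    frozenset({"a", "b"}): 2.0,
    frozenset({"b", "d"}): 4.0,
    frozenset({"d", "e"}): 5.0,
}
seed = {"a"}

count_b = intersecting_group_count(groups, seed, "b")  # 2 groups: {a,b,c}, {a,b}
count_d = intersecting_group_count(groups, seed, "d")  # 0: d never appears with a
top_b = top_contact_score(groups, seed, "b")           # 3.0 + 2.0 + 4.0 = 9.0
top_d = top_contact_score(groups, seed, "d")           # 4.0 + 5.0 = 9.0
```

In this example `d` ties with `b` on Top Contact Score even though it never appears together with the seed, which shows why a seed-independent score can suggest contacts unrelated to the current context.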
Scoring Functions • Intersecting Group Score • Sums the IR values of all the implicit groups that both contain contact c and have a non-empty intersection with the seed set • Finds all the contexts in which contact c exchanged emails with, or was a co-recipient with, at least one member of the seed group
Scoring Functions • Intersection Weighted Score • The more contacts of g that also belong to S, the more similar g is to the seed • Following this intuition, Intersection Weighted Score returns the IR of g multiplied by a constant k and by the size of the intersection of g and S
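Both seed-aware scores can be sketched in a few lines. The code assumes the groups are stored as a mapping from a set of contacts to its IR value, and it uses `k = 1.0` purely as a placeholder because no value for the constant is given here.

```python
def intersecting_group_score(groups, seed, contact):
    """Sum of IR over groups that contain `contact` and intersect the seed."""
    seed = set(seed)
    return sum(
        ir for group, ir in groups.items() if contact in group and seed & group
    )


def intersection_weighted_score(groups, seed, contact, k=1.0):
    """Like the score above, but each group counts IR * k * |g ∩ S|."""
    seed = set(seed)
    return sum(
        ir * k * len(seed & group)
        for group, ir in groups.items()
        if contact in group
    )


groups = {
    frozenset({"a", "b", "c"}): 3.0,
    frozenset({"a", "b"}): 2.0,
    frozenset({"b", "d"}): 4.0,
}
seed = {"a", "c"}

igs_b = intersecting_group_score(groups, seed, "b")     # 3.0 + 2.0 = 5.0
iws_b = intersection_weighted_score(groups, seed, "b")  # 3.0*2 + 2.0*1 + 4.0*0 = 8.0
igs_d = intersecting_group_score(groups, seed, "d")     # 0: {b,d} misses the seed
```

The group `{a, b, c}` overlaps the seed in two contacts, so the weighted variant counts it twice as heavily as the plain sum does.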
Evaluation • Methodology • 10,000 email interactions with between 3 and 25 recipients are randomly sampled • Every sampled interaction comes from an active user • An active user has at least 5 implicit groups and sent at least one email in the 7 days before the sampled interaction • Each recipient list is a group of contacts that the user implicitly clustered • A few addresses are sampled from each recipient list and used as the seed, to test how well the remaining addresses are recovered
Evaluation metric • Precision & Recall • Precision is the percent of correct suggestions out of the total number of contacts suggested for each seed group • Recall is the percent of correct suggestions out of the total number of email recipients who were not already members of the seed group
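The two metrics can be computed per seed group as in the sketch below. The function name and the choice to return 0.0 when nothing is suggested are assumptions made for illustration.

```python
def precision_recall(suggested, recipients, seed):
    """Precision and recall of suggestions for one sampled recipient list.

    suggested: contacts returned by the suggestion algorithm.
    recipients: the full original recipient list.
    seed: the recipients that were given to the algorithm as input.
    """
    seed = set(seed)
    held_out = set(recipients) - seed   # recipients the algorithm must recover
    suggested = set(suggested) - seed   # seed members are not suggestions
    correct = suggested & held_out
    precision = len(correct) / len(suggested) if suggested else 0.0
    recall = len(correct) / len(held_out) if held_out else 0.0
    return precision, recall


precision, recall = precision_recall(
    suggested=["c", "d", "x", "y"],
    recipients=["a", "b", "c", "d", "e"],
    seed=["a", "b"],
)
# 2 of 4 suggestions are correct; 2 of 3 held-out recipients are recovered
```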
Applications: Don’t Forget Bob! • Don’t Forget Bob uses the Friend Suggest algorithm • Once the user has added at least two contact addresses, that user’s egocentric network is fetched from the implicit social graph • Friend Suggest generates up to 4 contacts that best expand the seed set of existing contacts.
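The selection step described above can be sketched as follows, assuming the Friend Suggest scores for the user have already been computed. The function name, the tie-breaking by name, and the example scores are illustrative.

```python
def dont_forget_bob(recipients, scores, max_suggestions=4, min_seed=2):
    """Pick the best-scoring contacts that are not yet on the recipient list.

    recipients: addresses the user has typed so far (the seed).
    scores: {contact: score}, assumed to come from Friend Suggest.
    """
    if len(recipients) < min_seed:
        return []  # too few recipients to serve as a meaningful seed
    present = set(recipients)
    candidates = {c: s for c, s in scores.items() if c not in present and s > 0}
    # Highest score first; ties are broken by name so the order is stable.
    ranked = sorted(candidates, key=lambda c: (-candidates[c], c))
    return ranked[:max_suggestions]


scores = {"alice": 10.0, "bob": 9.0, "carol": 7.0, "dave": 5.0,
          "erin": 3.0, "frank": 1.0}
suggestions = dont_forget_bob(["alice", "zed"], scores)
# ['bob', 'carol', 'dave', 'erin']
```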
Applications: Got The Wrong Bob? • Got The Wrong Bob is implemented to fix auto-completion errors • For each contact $c_i$ in the current recipient list L, Wrong Bob excludes $c_i$ and builds a new seed set from the remaining contacts • When Friend Suggest can restore $c_i$ from that seed, Wrong Bob does not look for a replacement • However, when $c_i$ cannot be restored, Wrong Bob searches for a replacement of $c_i$
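The leave-one-out check can be sketched as below. Several pieces are assumptions made for illustration: `suggest` is a simple stand-in that sums IR over groups intersecting the seed, `same_first_name` is a hypothetical similarity test standing in for whatever makes two addresses confusable during auto-completion, and the highest-scoring similar contact is chosen as the replacement.

```python
def suggest(groups, seed):
    """Stand-in for Friend Suggest: sum IR over groups that intersect the seed."""
    seed = set(seed)
    scores = {}
    for group, ir in groups.items():
        if seed & group:
            for contact in group - seed:
                scores[contact] = scores.get(contact, 0.0) + ir
    return scores


def same_first_name(a, b):
    # Hypothetical similarity test: addresses of the form "first.last".
    return a != b and a.split(".")[0] == b.split(".")[0]


def wrong_bob(groups, recipients, is_similar=same_first_name):
    """Return (wrong_recipient, replacement, score) or None."""
    best = None
    for contact in recipients:
        seed = [c for c in recipients if c != contact]
        scores = suggest(groups, seed)
        if contact in scores:
            continue  # the other recipients predict this contact: keep it
        for candidate, score in scores.items():
            if is_similar(contact, candidate) and (best is None or score > best[2]):
                best = (contact, candidate, score)
    return best


groups = {
    frozenset({"bob.smith", "alice", "carol"}): 5.0,
    frozenset({"bob.jones", "dave"}): 1.0,
}
result = wrong_bob(groups, ["alice", "carol", "bob.jones"])
# ('bob.jones', 'bob.smith', 5.0)
```

In the example, `alice` and `carol` are each restored from the remaining recipients, while `bob.jones` is not, and `bob.smith` is the similar contact that fits the group.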
Conclusion & Discussion • Introduces the implicit social graph and Interactions Rank • Defines the Friend Suggest algorithm • Proposes two applications of the Friend Suggest algorithm • Applicable to other types of communication