User Profiling in Ego-network: Co-profiling Attributes and Relationships
Rui Li, Chi Wang, Kevin Chen-Chuan Chang
University of Illinois at Urbana-Champaign
User Profiling, which infers users’ attributes, is important for Personalized Services
• Personalized Search (Search Engines)
• Targeted Advertisement (Advertisers)
• and many others.
[Figure: a user, Richard, with the attributes College: UIUC and Location: Champaign.]
User Profiling is crucial for Social Analysis: the ability to survey the world
Surveying people for behavior:
• How do college students like iPad vs. Galaxy?
• How do California males aged 50+ like ObamaCare?
Surveying behavior for people:
• What demographics of users like Samsung more than Apple?
• What communities of people support ObamaCare?
Can we profile users’ missing attributes in a social network?
• Some users provide attributes in their online profiles.
• Some users’ attributes are missing.
[Figure: an example network. Users with known attributes include Employer: Yahoo!, College: Berkeley; Employer: Twitter, College: UIUC; Employer: Google, College: UIUC; Employer: Yahoo!, College: Stanford; Employer: Twitter, College: Berkeley; Employer: JP Morgan, College: UIUC. The other users are shown as Employer: ?, College: ?.]
Thus, we abstract our problem as profiling users’ attributes based on their friends’ attributes
• Input: a network G(V, E) and some users’ attributes.
• Output: the users’ missing attributes.
[Figure: the example network, with some users’ attributes known (e.g., Employer: Yahoo!, College: UIUC; Employer: Yahoo!, College: Berkeley; Employer: Twitter, College: UIUC) and others shown as Employer: ?, College: ?.]
While attributes may “propagate” across links, links are very noisy.
Existing methods simply assume that two connected users share the same value for any attribute. However, users connect to friends with different values for an attribute:
• About 11% of friends share the same employer and 18% of friends share the same college.
• Only 20% of users may have attributes.
[Figure: the example network, with known attributes such as Employer: Yahoo!, College: Berkeley and Employer: Twitter, College: UIUC next to users shown as Employer: ?, College: ?.]
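To make the "noisy links" point concrete, the sketch below computes how often two linked users agree on an attribute. The names, the toy profiles, and the choice to count only links where both endpoints list the attribute are assumptions for illustration; the slide does not state how its 11% and 18% figures were computed.

```python
from typing import Dict, List, Optional, Tuple

Profile = Dict[str, Optional[str]]

def share_rate(
    edges: List[Tuple[str, str]],
    profiles: Dict[str, Profile],
    attribute: str,
) -> Optional[float]:
    """Fraction of links whose endpoints both list `attribute` and agree on it.

    Assumption: links with a missing value on either side are skipped.
    Returns None when no link can be compared.
    """
    comparable = 0
    agree = 0
    for u, v in edges:
        a = profiles.get(u, {}).get(attribute)
        b = profiles.get(v, {}).get(attribute)
        if a is None or b is None:
            continue
        comparable += 1
        if a == b:
            agree += 1
    return agree / comparable if comparable else None

# Hypothetical toy data, loosely named after the users on the slides.
profiles = {
    "richard": {"employer": "Yahoo!", "college": "UIUC"},
    "bob": {"employer": "Yahoo!", "college": "Berkeley"},
    "cindy": {"employer": "Twitter", "college": "UIUC"},
    "peter": {"employer": "Google", "college": None},
}
edges = [("richard", "bob"), ("richard", "cindy"),
         ("richard", "peter"), ("bob", "cindy")]

employer_rate = share_rate(edges, profiles, "employer")
college_rate = share_rate(edges, profiles, "college")
```

On this toy network only one link in four agrees on employer, so copying a value across an arbitrary link would usually be wrong.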
Why noisy? Every link is for a (different) relationship! Users have different types of relationships in real life.
• Colleagues: Richard and Bob share the same employer, but may have different values for other attributes.
• College classmates: Richard and Cindy share the same college, but may have different values for other attributes.
• Club friends: Richard and Peter share the same interests, but may have different values for other attributes.
On the other hand, Relationship Profiling is necessary by itself, and similarly challenged!
• Link: Why does a link happen? Given a link, what friendship does it represent?
• Circle: Who forms what circles? Where are my circles? What does each circle represent?
• Challenge: Detecting and explaining links and circles depends on attributes, but attributes are often unknown.
Proposal: Co-profiling Attributes and Relationships
• Attributes: properties of nodes.
• Relationships: properties of links.
• Together: understanding both nodes and links.
Why together?
1. Necessity: each depends on the other to decide.
2. Benefit: it is useful to know both!
[Figure: an example with the labels Employer: Yahoo!, College: Berkeley; Missing, Missing; Employer: Google, College: UIUC; Employer: Yahoo! (colleagues), College: UIUC (classmates).]
But how? Observing how attributes and relationships relate.
Insight: Correlation between attributes and connections through relationship
Discriminative Correlation Insight: Attributes and connections are discriminatively correlated via a hidden factor, the relationship.
To concretize our insight, we explore two dependencies based on a real-world user study.
• Attribute-Relationship Dependency: How are users’ attributes related to hidden relationship types?
• Connection-Relationship Dependency: How are connections related to hidden relationship types?
Observation #1: Attribute-Relationship Dependency
• Friends do not share all attributes.
• Which attributes they share depends on the relationship.
[Figure: the percentages of friends sharing the same value with the ego for different attributes, overall and for different relationship types.]
Observation #2: Connection-Relationship Dependency
• Friends do not connect to all friends.
• Which friends they connect to depends on the relationship.
[Figure: the average connections per user within and across three different relationship types.]
Specifically, we focus on co-profiling upon each user’s ego-network
• Ego-network: a subnetwork around an individual user.
• Circles: Circle 1: friends likely to share employer; Circle 2: friends likely to share college; Circle 3: friends likely to share another attribute.
• Association Vector: w1=<1, 0, 0, 0, 0, 0, 0>, w2=<0, 1, 0, 0, 0, 0, 0>
• Attribute Vector: f1=<1, 0, 0, 1, 0, 0, 0.1>, f3=<1, 0, 0, 0, 1, 0, 0.1>, f4=<0, 1, 0, 0, 0, 1, 0 0.1>
• Circle Assignment: x1=1, x3=1, x4=2
[Figure: an example ego-network whose friends carry labels such as Employer: Yahoo!, College: Berkeley; Employer: Twitter, College: UIUC; Employer: Yahoo, College: UIUC; Employer: Google, College: UIUC; Employer: Yahoo!, College: Stanford; Employer: Twitter, College: Berkeley; and Employer: ?, College: ?.]
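As a rough illustration of the attribute vectors on this slide, the sketch below one-hot encodes a profile over a fixed list of attribute values. The vocabulary order is an assumption chosen so that the result matches the slide's f1 for Employer: Yahoo, College: UIUC; the trailing 0.1 is copied from the slide's vectors, and its meaning is not stated there.

```python
from typing import Dict, List, Optional, Tuple

# Assumed vocabulary order; the slide does not say which position
# corresponds to which attribute value.
VOCAB: List[Tuple[str, str]] = [
    ("employer", "Yahoo!"),
    ("employer", "Twitter"),
    ("employer", "Google"),
    ("college", "UIUC"),
    ("college", "Berkeley"),
    ("college", "Stanford"),
]

# Copied from the last component of the slide's vectors; its role
# (for example a constant or bias term) is not explained on the slide.
TRAILING = 0.1

def attribute_vector(profile: Dict[str, Optional[str]]) -> List[float]:
    """One-hot encode known attribute values; missing values stay 0."""
    vec = [1.0 if profile.get(attr) == value else 0.0 for attr, value in VOCAB]
    return vec + [TRAILING]

f1 = attribute_vector({"employer": "Yahoo!", "college": "UIUC"})
unknown = attribute_vector({"employer": None, "college": None})
```

Under this assumed ordering, a friend with no known attributes is encoded as all zeros plus the trailing component, which is the part of the vector that co-profiling would try to fill in.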
Solution Overview: we realize co-profiling in an optimization framework
• Variables: partially observed user attributes, observed user connections, and unobserved friends’ circles.
• Cost Function: captures the dependencies between the variables based on the insight.
• Algorithm: finds the unknown variables that best satisfy the dependencies.
Cost Function: we design a cost function to model the dependencies between variables
[Equation: a cost function with two terms, one for the Attribute-Relationship (circle) Dependency and one for the Connection-Relationship Type (circle) Dependency.]
However, the function cannot be optimized directly, as there are both discrete and continuous variables.
There are other formulas to model the dependencies.
Algorithm: we minimize the function by updating each group of variables
• Update User Attribute Vectors F: only propagate values from friends in the same circles, and only propagate the attribute value associated with the circle.
• Update User Circle Assignments X: consider both the user’s attributes and connections.
• Update Circle Association Vectors W: make the association vector sparse.
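The attribute-update rule above can be sketched with a simple majority vote. This is a simplified stand-in and not the authors' update formula: it fills a missing value only from neighbours in the same circle, and only for the attribute tied to that circle. All names and the toy data are hypothetical, and the circle assignments and circle-to-attribute mapping are taken as given here, whereas the algorithm on the slide also updates them.

```python
from collections import Counter
from typing import Dict, Optional, Set

Profile = Dict[str, Optional[str]]

def propagate_within_circles(
    profiles: Dict[str, Profile],
    friends_of: Dict[str, Set[str]],
    circle_of: Dict[str, int],
    circle_attribute: Dict[int, str],
) -> Dict[str, Profile]:
    """One attribute-update pass restricted to same-circle neighbours."""
    updated = {user: dict(p) for user, p in profiles.items()}
    for user, profile in profiles.items():
        circle = circle_of[user]
        attribute = circle_attribute[circle]
        if profile.get(attribute) is not None:
            continue  # observed values are kept as they are
        votes: Counter = Counter()
        # Sorted iteration keeps tie-breaking deterministic.
        for n in sorted(friends_of.get(user, set())):
            value = profiles.get(n, {}).get(attribute)
            if circle_of.get(n) == circle and value is not None:
                votes[value] += 1
        if votes:
            updated[user][attribute] = votes.most_common(1)[0][0]
    return updated

# Hypothetical friends inside one ego-network.
profiles = {
    "bob": {"employer": "Yahoo!", "college": "Berkeley"},
    "dan": {"employer": None, "college": None},
    "cindy": {"employer": "Twitter", "college": "UIUC"},
    "eve": {"employer": None, "college": None},
}
friends_of = {
    "bob": {"dan", "eve"},
    "dan": {"bob", "cindy"},
    "cindy": {"dan", "eve"},
    "eve": {"bob", "cindy"},
}
circle_of = {"bob": 1, "dan": 1, "cindy": 2, "eve": 2}
circle_attribute = {1: "employer", 2: "college"}

updated = propagate_within_circles(profiles, friends_of, circle_of, circle_attribute)
```

In this toy run dan picks up an employer from bob, his circle-1 neighbour, and ignores the link to cindy, who is in a different circle; his college stays unknown because circle 1 is tied to employer only.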
Experiment: we first collect real-world ego-networks as our evaluation data set
• We conduct user studies to collect users’ attributes and relationship types (circles) from LinkedIn.
• Most users have three attributes.
• 8K connections are labeled.
• We share the data online: https://wiki.engr.illinois.edu/display/forward/Dataset-EgoNetUIUC-LinkedinCrawl-Jan2014
Experiment: we evaluate our algorithm on both attribute and relationship type profiling
Attribute Profiling
• APw: a classic collective classification approach, which profiles a node’s label using weighted votes from its neighbors.
• APi: another collective classification (semi-supervised learning) approach, which iteratively profiles nodes’ labels with APw.
• APc: a state-of-the-art method, which profiles users’ attributes based on clustering the network.
Relationship Type (circle) Profiling
• RPa: profiles friends’ circles based on their attributes.
• RPn: profiles friends’ circles based on network structure.
• RPan: profiles friends’ circles based on network and attributes, but assumes attributes are known.
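For readers unfamiliar with the APw-style baseline, a weighted neighbor vote can be sketched as follows. The function name, the caller-supplied edge weights, and the alphabetical tie-break are assumptions for illustration; the slide does not specify how APw weights its votes.

```python
from collections import defaultdict
from typing import Dict, Optional

def weighted_vote(
    neighbour_weights: Dict[str, float],
    labels: Dict[str, Optional[str]],
) -> Optional[str]:
    """Return the label with the largest total weight among labeled neighbours.

    Returns None when no neighbour has a known label. Ties are broken
    alphabetically (an arbitrary choice made for determinism).
    """
    scores: Dict[str, float] = defaultdict(float)
    for neighbour, weight in neighbour_weights.items():
        label = labels.get(neighbour)
        if label is not None:
            scores[label] += weight
    if not scores:
        return None
    return max(sorted(scores), key=lambda label: scores[label])

# Hypothetical neighbours of one unlabeled user and their link weights.
neighbour_weights = {"bob": 1.0, "cindy": 0.5, "peter": 0.75, "dan": 2.0}
employers = {"bob": "Yahoo!", "cindy": "Twitter", "peter": "Twitter", "dan": None}

predicted = weighted_vote(neighbour_weights, employers)
```

Note that this vote treats every link as evidence for every attribute, which is exactly the assumption the talk argues against.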
CP (co-profiling) is not only capable of both attribute profiling (AP) and relationship profiling (RP), but also outperforms the baselines for both.
Summary: we made the following contributions to this problem
• We propose a co-profiling approach that jointly profiles users’ attributes and relationship types (circles) in ego networks.
• We present the discriminative correlation insight to capture the correlation between attributes and social connections.
• We conduct extensive experiments to evaluate our algorithms on two tasks based on real-world ego networks.