Social Network Analysis BY Hani Maher Ahmad
What is a Social Network? • A social network is a heterogeneous, multirelational data set represented by a graph • Social networks need not be social in context • Examples: • Electrical power grids • The web • Coauthorship networks
Why do we study Social Networks? • Small-world effect and universal behavior • Hundredth-monkey effect and tipping behavior • A social network is a complex dynamical system • More information for data mining • Link information and the structure of the data are involved in the mining process • More realistic applications • New types of patterns (e.g. link prediction)
What do you think: the small-world experiment • People in city X are asked to get a message to a stranger in city Y • Each person forwards it to a friend who they think is more likely to know the stranger • How many intermediate links does the message pass through before it is received?
Small World • A small-world graph has a high degree of local clustering and short paths between most pairs of nodes • Six degrees of separation • E.g. the science coauthorship graph
What do you think • Why are there sudden events in our lives? • How does a product, movie, or idea spread all at once? • Why do the smartest students seem to become smart suddenly? • Why do we change our minds suddenly?
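The "degrees of separation" between two people can be measured as the length of the shortest chain of acquaintances linking them. The following is a minimal sketch using breadth-first search; the function name `degrees_of_separation` and the dictionary-of-friends representation are assumptions made for illustration, not part of the slides.

```python
from collections import deque


def degrees_of_separation(friends, source, target):
    """Return the number of links on a shortest path from source to target.

    `friends` maps each person to an iterable of direct acquaintances
    (a hypothetical representation chosen for this sketch).
    Returns None when the target cannot be reached.
    """
    if source == target:
        return 0
    distance = {source: 0}
    queue = deque([source])
    while queue:
        person = queue.popleft()
        for friend in friends.get(person, ()):
            if friend not in distance:
                distance[friend] = distance[person] + 1
                if friend == target:
                    return distance[friend]
                queue.append(friend)
    return None


if __name__ == "__main__":
    chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
    print(degrees_of_separation(chain, "a", "d"))
```

Breadth-first search visits people in order of distance from the source, so the first time the target is seen its distance is the shortest one.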
Evolution of a Random Network • We have a large number n of vertices • We start randomly adding edges one at a time • At what time t will the network: • have at least one “large” connected component? • have a single connected component? • have “small” diameter?
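The questions above can be explored by simulation. The sketch below adds random edges one at a time to n isolated vertices and records the size of the largest connected component after each edge, using a union-find structure. The function name `largest_component_growth`, the seed handling, and the choice to allow repeated edges are assumptions of this illustration.

```python
import random


def largest_component_growth(n, num_edges, seed=0):
    """Add random edges one at a time and track the largest component size.

    Returns a list whose t-th entry is the size of the largest connected
    component after t + 1 edges have been added. Repeated edges are allowed
    (a simplification of this sketch).
    """
    rng = random.Random(seed)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    largest = 1
    history = []
    for _ in range(num_edges):
        u, v = rng.sample(range(n), 2)  # two distinct endpoints
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            if size[root_u] < size[root_v]:
                root_u, root_v = root_v, root_u
            parent[root_v] = root_u
            size[root_u] += size[root_v]
            largest = max(largest, size[root_u])
        history.append(largest)
    return history


if __name__ == "__main__":
    n = 1000
    history = largest_component_growth(n, 2 * n)
    for edges in (n // 4, n // 2, n, 2 * n):
        print(edges, "edges -> largest component fraction",
              history[edges - 1] / n)
```

Printing the fraction at a few edge counts is one way to look for the sudden appearance of a "large" component as edges accumulate.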
Formalizing Familiar Ideas • Explaining universal behavior through statistical models • our models will always generate many networks • almost all of them will share certain properties (universals) • Explaining tipping through incremental growth • we gradually add edges • many properties will emerge very suddenly during this process • [Figures: crime rate vs. size of police force; probability that the network is connected vs. number of edges]
How to study SN • Random graph generation models • E.g. the Forest Fire model: 1. A new node v chooses an ambassador node w. 2. It selects x links incident to w at random; let w1, w2, …, wx denote the nodes at the other end of the selected edges. 3. The new node v forms out-links to w1, w2, …, wx and then applies step 2 recursively to w1, w2, …, wx. The process continues until it dies out.
How to study SN • The models are realistic and indicate how the real network will behave • They show that the rich get richer • But they are blind: they cannot tell exactly how things happen • Exact prediction is very hard, since most of the problems are NP-hard
Dynamical System • It is a state together with a rule that changes that state • E.g. population size is a state and logistic growth is a rule
Dynamical Systems • Dynamical systems have the important property of attractors (points of stability)
Dynamical Systems • But sometimes chaotic behavior or divergence occurs, as when a traffic network becomes gridlocked • We study social networks to control their stability and prevent chaos
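The three steps above can be sketched as follows. Several details are assumptions of this illustration rather than statements from the slides: the new node also links to the ambassador itself, x is drawn from a geometric distribution controlled by a hypothetical burning probability `p`, and only out-links are followed during the recursion.

```python
import random


def forest_fire_graph(num_nodes, p=0.35, seed=0):
    """Grow a directed graph with a simplified Forest Fire process.

    Returns a dict mapping each node to the set of nodes it links to.
    `p` is an assumed "burning" probability controlling how many links
    are followed from each visited node.
    """
    rng = random.Random(seed)
    out_links = {0: set()}
    for v in range(1, num_nodes):
        ambassador = rng.randrange(v)       # step 1: pick an ambassador w
        visited = {ambassador}
        frontier = [ambassador]
        while frontier:                     # the fire spreads until it dies out
            w = frontier.pop()
            x = 0
            while rng.random() < p:         # geometric draw for x (assumption)
                x += 1
            candidates = [u for u in sorted(out_links[w]) if u not in visited]
            rng.shuffle(candidates)
            for u in candidates[:x]:        # step 2: select up to x links of w
                visited.add(u)
                frontier.append(u)          # step 3: recurse on w1, ..., wx
        out_links[v] = visited              # v links to every node it reached
    return out_links


if __name__ == "__main__":
    graph = forest_fire_graph(20)
    for node, targets in graph.items():
        print(node, sorted(targets))
```

Because every new node links only to nodes that already exist, older nodes tend to accumulate more in-links over time.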
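The ideas of a state, a rule, an attractor, and chaotic behavior can be illustrated with a few lines of code. This is a minimal sketch that assumes the discrete logistic map x(t+1) = r · x(t) · (1 − x(t)) as the "logistic growth" rule; the slides do not say which form of logistic growth is meant, and the parameter values below are chosen only for illustration.

```python
def logistic_trajectory(r, x0, steps):
    """Iterate the logistic map x -> r * x * (1 - x) starting from x0.

    The state is the (normalized) population x; the rule is the map itself.
    Returns the list of states [x0, x1, ..., x_steps].
    """
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


if __name__ == "__main__":
    # With r = 2.5 the state settles on the fixed point 1 - 1/r (an attractor).
    print(logistic_trajectory(2.5, 0.1, 50)[-1])

    # With r = 4.0 two almost identical starting states drift far apart.
    a = logistic_trajectory(4.0, 0.1, 60)
    b = logistic_trajectory(4.0, 0.1 + 1e-10, 60)
    print(max(abs(x - y) for x, y in zip(a, b)))
```

The same rule therefore produces either stable or chaotic behavior depending on a single parameter, which is the kind of sensitivity the slides point to.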
Social Network Characteristics • Densification power law • The number of edges grows superlinearly, as a power of the number of nodes: e(t) ∝ n(t)^a with 1 < a < 2 • Shrinking diameter • The effective diameter of the network shrinks as the network grows • Heavy-tailed out-degree and in-degree distributions • The out-degrees and in-degrees of the nodes follow heavy-tailed distributions
What do you think • What things can be mined from a social network? • What are the differences and similarities between data mining and network mining? • Does the graph need to be labeled or not? • Does the graph need to be directed or not? • Can we mine the graph for all patterns, exactly?
Link Mining Tasks • Link-based object classification • An object's category is predicted from its links as well as its attributes (generalizes ordinary data classification) • Object type prediction • Link type prediction • Predicting link existence • Link cardinality estimation
Link Mining Tasks • Object reconciliation • To detect whether two objects are the same • E.g. whether two disease strains are the same, or whether two citations refer to the same paper • Group detection • Subgraph detection • What is the difference between tasks 7 and 8 (group detection and subgraph detection)?
Are you looking for her • Who is the most perfect woman? • In other words, how can we find the most valuable object in the network, and how can we rank an object? • How can we find her prestige?
Representing the Network in a Suitable Way for Computation • We can represent the graph with matrices • Adjacency matrix • rows and columns both represent nodes; an entry is 1 if there is an edge between the two nodes and 0 otherwise • Incidence matrix • rows represent nodes and columns represent edges; an entry is 1 if the edge is incident to the node and 0 otherwise
Three algorithms • Prestige algorithm • PageRank • HITS (authorities and hubs) • Note: the first two compute a prestige vector holding the prestige of each node; the third computes a hub score vector and an authority score vector
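Both representations can be built directly from an edge list. The sketch below is illustrative: the function names and the convention that nodes are numbered 0 to n − 1 and edges are (source, target) pairs are assumptions, not part of the slides.

```python
def adjacency_matrix(n, edges):
    """Return an n x n matrix with A[u][v] = 1 if there is an edge u -> v."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[u][v] = 1
    return matrix


def incidence_matrix(n, edges):
    """Return an n x len(edges) matrix with B[node][e] = 1 if edge e touches node."""
    matrix = [[0] * len(edges) for _ in range(n)]
    for e, (u, v) in enumerate(edges):
        matrix[u][e] = 1
        matrix[v][e] = 1
    return matrix


if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 0)]
    print(adjacency_matrix(3, edges))   # [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
    print(incidence_matrix(3, edges))   # [[1, 0, 1], [1, 1, 0], [0, 1, 1]]
```

For an undirected graph, the adjacency matrix would additionally set `matrix[v][u] = 1` for each edge.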
Prestige algorithm • The prestige of a node depends on the prestige of the nodes pointing to it • That is, for node i: $P[i] = A^T[i] \cdot P = \sum_j A[j][i]\, P[j]$ • i.e. the sum of the prestige of the nodes pointing to it • For all nodes: $P = A^T P$ • Start with the prestige of every node equal to 1 • Apply the multiplication repeatedly until it converges, rescaling P after each step so the values stay bounded • i.e. $P_{t+1} = A^T P_t$
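The iteration can be sketched in plain Python as below. The rescaling of the vector to sum to 1 after every step, the stopping tolerance, and the function name `prestige` are assumptions of this sketch; without some rescaling the raw values generally grow or shrink without bound even when the ranking has stabilized.

```python
def prestige(adjacency, iterations=100, tol=1e-10):
    """Power iteration for P = A^T P, rescaled to sum to 1 at every step.

    `adjacency[j][i]` is 1 when node j points to node i.
    """
    n = len(adjacency)
    p = [1.0] * n                       # start with every prestige equal to 1
    for _ in range(iterations):
        # New prestige of i = sum of the prestige of the nodes pointing to i.
        new_p = [sum(adjacency[j][i] * p[j] for j in range(n)) for i in range(n)]
        total = sum(new_p)
        if total == 0:                  # no links at all: nothing to rank
            return new_p
        new_p = [value / total for value in new_p]
        if max(abs(a - b) for a, b in zip(new_p, p)) < tol:
            return new_p
        p = new_p
    return p


if __name__ == "__main__":
    # Edges: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
    A = [[0, 1, 1],
         [0, 0, 1],
         [1, 0, 0]]
    print(prestige(A))
```

In this small example node 2 is pointed to by both other nodes and ends up with the highest score.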
PageRank Algorithm • A node's prestige does not depend only on the prestige of the nodes pointing to it, but also on the chance of reaching it by a random jump • Random surfing model: • At any page, • with a small probability, jump to a randomly chosen page • otherwise, pick one of the page's links at random and follow it • PageRank = prestige + random walk
PageRank Algorithm • Note that the adjacency matrix is normalized (each row is divided by the node's out-degree) • This is the main algorithm behind Google search
HITS Algorithm • This algorithm gives each node two scores • A node is a good authority if it is pointed to by many good hubs • and a good hub if it points to many good authorities
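The random surfing model can be sketched as a power iteration over the normalized adjacency matrix. The jump probability of 0.15, the treatment of pages without out-links (their rank is spread evenly over all pages), and the function name `pagerank` are assumptions made for this illustration; the slides do not give these details.

```python
def pagerank(adjacency, jump_prob=0.15, iterations=100, tol=1e-12):
    """Sketch of PageRank by power iteration.

    `adjacency[j][i]` is 1 when page j links to page i.
    `jump_prob` is the assumed probability of jumping to a random page.
    """
    n = len(adjacency)
    out_degree = [sum(row) for row in adjacency]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = [jump_prob / n] * n          # contribution of random jumps
        for j in range(n):
            if out_degree[j] == 0:
                # Dangling page: spread its rank evenly (assumption).
                share = (1.0 - jump_prob) * rank[j] / n
                for i in range(n):
                    new_rank[i] += share
            else:
                # Normalized row: each out-link gets an equal share.
                share = (1.0 - jump_prob) * rank[j] / out_degree[j]
                for i in range(n):
                    if adjacency[j][i]:
                        new_rank[i] += share
        if max(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Edges: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
    A = [[0, 1, 1],
         [0, 0, 1],
         [1, 0, 0]]
    print(pagerank(A))
```

The random-jump term guarantees that every page receives some rank, so the iteration converges even on graphs where the plain prestige iteration would not.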
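The mutual reinforcement between hubs and authorities can be sketched as an alternating update. The fixed number of iterations, the rescaling of both vectors to sum to 1, and the function name `hits` are assumptions of this sketch.

```python
def hits(adjacency, iterations=50):
    """Return (hub, authority) score lists for a directed graph.

    `adjacency[i][j]` is 1 when node i points to node j.
    """
    n = len(adjacency)

    def rescale(values):
        total = sum(values)
        return [v / total for v in values] if total else values

    hub = [1.0] * n
    authority = [1.0] * n
    for _ in range(iterations):
        # A good authority is pointed to by good hubs.
        authority = [sum(adjacency[j][i] * hub[j] for j in range(n))
                     for i in range(n)]
        # A good hub points to good authorities.
        hub = [sum(adjacency[i][j] * authority[j] for j in range(n))
               for i in range(n)]
        authority = rescale(authority)
        hub = rescale(hub)
    return hub, authority


if __name__ == "__main__":
    # Edges: 0 -> 2, 0 -> 3, 1 -> 2
    A = [[0, 0, 1, 1],
         [0, 0, 1, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
    hub, authority = hits(A)
    print("hubs:", hub)
    print("authorities:", authority)
```

In this example node 0 is the strongest hub because it points to both authorities, and node 2 is the strongest authority because both hubs point to it.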
Application: Viral Marketing • Marketing has several models • Direct marketing: • based on customer attributes • a classification problem • Mass marketing: • based on the population segment the person belongs to • a clustering problem • has the advantage that it captures indirect customers • Viral marketing: • mass marketing + optimizing the word-of-mouth effect
Application: Viral Marketing • E.g. a person who buys a car motivates his friends to buy a car • The aim is to find the network value of a person • If the person is a good hub, he is a potential customer who can maximize the network profit, so spend more money marketing the product to him • If the person has a negative effect, don't market to him
Application: Viral Marketing • Viral marketing can be used in non-marketing tasks • E.g. • Fighting teenage smoking • Stopping the spread of a virus • Spreading an idea • Campaigning for a politician (e.g. in an election)
So what do you think • She has the best authority score: all the “hubs” are pointing to her • Is it a good idea to marry her? Yes or no?
What do you think • She may have the best authority score because: • The rich get richer • Some tipping phenomenon • She is “modda” (in fashion) • She has more hubs pointing to her • The butterfly effect and divergence • She may stand out due to marketing effort • She may also genuinely be a good authority
So what do you think • Search ranking relies on link-based algorithms such as PageRank (Google) and HITS: do you think the results are the best, or just the most popular? • Does that make sense when working with real people in the real world? • So for me it is a big NO
Social Networks out of control • If the social network is not controlled: • The rich will get richer and all the capital will accumulate with them • Most people will like the wrong things because of the thrill of adrenaline and self-prodding • Many relationships will get stuck as bad ideas, drugs, and bad practices spread • Many silly people will appear as authorities because of their strange or bad ideas
Social Networks out of control • The number of links will become extremely large, making life harder and noisier and wasting much time • The diameter will shrink, making spying and crime easier • Some tipping events will destroy society • More effort will go into marketing than into industry • Civilization will stall and we will focus only on communication
Social Networks out of control • Hidden persons can control the network and affect others by adjusting links and spreading ideas (“programming the SN”) for their own benefit • It is not proven, but I guess a sudden death of the network will occur: “we are running into chaos”
References • The textbook • Other slides • Dr. Mohammed Zaki's lectures (one of the leading data mining researchers) • http://www.cs.rpi.edu/~zaki/www-new/pmwiki.php/Dmcourse/Main • Satnam Alag: Collective Intelligence in Action • Wikipedia: articles on small worlds and social networks • Kathleen T. Alligood: Chaos: An Introduction to Dynamical Systems
Thank you • Questions?