Identity and search in social networks. Duncan J. Watts, Peter Sheridan Dodds and M. E. J. Newman. Presented by Pooja Deodhar. Presentation Outline. Introduction Contentions – Social Networks Algorithm explanation Our model and Milgram’s findings Further Extensions Applications.
Identity and search in social networks Duncan J. Watts, Peter Sheridan Dodds and M. E. J. Newman Presented by Pooja Deodhar
Presentation Outline • Introduction • Contentions – Social Networks • Algorithm explanation • Our model and Milgram’s findings • Further Extensions • Applications
Introduction • Social Networks are “Searchable” • Our model offers explanation of searchability in terms of recognizable personal identities • Personal identities - sets of characteristics in different social dimensions • Class of searchable networks and method for searching them applicable to many real world problems
Introduction • Small World Network • Network in which most nodes are not neighbors of each other, but most nodes can be reached from every other node in a small number of hops
Introduction • Milgram’s Experiment • Short paths exist between individuals in large social networks • Ordinary people can find these short paths • People rarely have more than local knowledge about the network
Introduction • Searchability • Property of being able to find a target quickly • Shown to exist in networks • With certain fraction of hubs (highly connected nodes which once reached can distribute messages to all parts of the network) • Built upon underlying geometric lattice
Introduction • Limited hubs in social networks • Social networks are more like a peer-to-peer network • Need for a hierarchical model • Some measure of distance between individuals • Can be based on the target’s identity, friends’ identities, friends’ popularity
Contentions – Social Networks • Individual identities – sets of characteristics attributed to individuals by virtue of their association with, and participation in, social groups • Groups – collections of individuals with a well-defined set of social characteristics
Contentions – Social Networks • Breaking down of world into set of layers • Top layer – whole population • Lower layers – specific division into groups
Contentions – Social Networks • Similarity x_ij between individuals i and j • x_ij – height of the lowest common ancestor level of i and j • Individuals in the same group are at distance one from each other
Contentions – Social Networks • Combined social distance y_ij = min_h x^h_ij (the minimum over all hierarchies h) • In the above figure H = 2 • In the 1st hierarchy y_ij = 1, and in the 2nd y_jk = 1 • But y_ik = 4 > y_ij + y_jk = 2, so social distance violates the triangle inequality
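The definitions on this slide can be checked with a small sketch. It assumes, for illustration only, that each hierarchy is a complete b-ary tree whose leaf groups are numbered 0, 1, 2, ... from left to right, and that an identity is a tuple of leaf-group indices, one per hierarchy. The function names and the three example individuals are hypothetical and are not taken from the figure.

```python
def group_distance(g1, g2, b=2):
    """Distance x between two leaf groups in one complete b-ary hierarchy.

    Assumption: groups are indexed 0, 1, 2, ... along the bottom level.
    Same group -> 1; otherwise 1 + number of levels up to the lowest
    common ancestor.
    """
    levels_up = 0
    while g1 != g2:
        g1 //= b  # move both groups to their parent
        g2 //= b
        levels_up += 1
    return levels_up + 1


def social_distance(v_i, v_j, b=2):
    """y_ij: the minimum of x^h_ij over all hierarchies h."""
    return min(group_distance(a, c, b) for a, c in zip(v_i, v_j))


# Hypothetical identities: (group in hierarchy 1, group in hierarchy 2).
i, j, k = (0, 0), (0, 7), (7, 7)
y_ij = social_distance(i, j)
y_jk = social_distance(j, k)
y_ik = social_distance(i, k)
print(y_ij, y_jk, y_ik)  # 1 1 4
```

Here i and j share a group in the first hierarchy, j and k share a group in the second, yet i and k are far apart in both, which reproduces y_ik = 4 > y_ij + y_jk = 2.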
Contentions – Social Networks • Probability of acquaintance between i and j decreases with decreasing similarity of the groups to which they belong • A link distance x for individual i is chosen with probability p(x) = c e^(-αx), where c is a normalizing constant and α is a tunable parameter • α is a measure of homophily – the tendency of like to associate with like
Contentions – Social Networks • Individuals hierarchically partition the social world in more than one way • h = 1, …, H hierarchies • A node’s identity is the H-dimensional vector v_i • v_i^h is the position of node i in hierarchy h • Social distance y_ij = min_h x^h_ij
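A minimal sketch of the link-distance distribution, assuming distances take the integer values 1 to `levels` (the number of levels in a hierarchy) and that c is chosen so the probabilities sum to one. The function names are illustrative.

```python
import math
import random


def link_distance_probs(alpha, levels):
    """Return [p(1), ..., p(levels)] with p(x) = c * exp(-alpha * x)."""
    weights = [math.exp(-alpha * x) for x in range(1, levels + 1)]
    c = 1.0 / sum(weights)  # normalizing constant
    return [c * w for w in weights]


def sample_link_distance(alpha, levels, rng=random):
    """Draw one link distance x in 1..levels according to p(x)."""
    probs = link_distance_probs(alpha, levels)
    return rng.choices(range(1, levels + 1), weights=probs, k=1)[0]


# alpha = 0 gives a uniform choice of distance (no homophily);
# larger alpha concentrates links on nearby groups.
print(link_distance_probs(0.0, 4))
print(link_distance_probs(2.0, 4))
```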
Contentions – Social Networks • At each step the holder i of the message passes it to one of its friends who is closest to the target t in terms of social distance • Individuals know the identity vectors of: • themselves • their friends, • the target • Two kinds of partial information – social distance and network paths
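One forwarding step of this greedy rule can be sketched as follows. The sketch assumes complete b-ary hierarchies with leaf groups indexed from 0, identities stored as tuples of group indices, and ties broken by dictionary order; the names and example data are hypothetical.

```python
def group_distance(g1, g2, b=2):
    """1 if same leaf group, else 1 + levels up to the common ancestor."""
    levels_up = 0
    while g1 != g2:
        g1, g2 = g1 // b, g2 // b
        levels_up += 1
    return levels_up + 1


def social_distance(v_i, v_j, b=2):
    """Minimum distance over all hierarchies."""
    return min(group_distance(a, c, b) for a, c in zip(v_i, v_j))


def next_holder(friends, target, b=2):
    """Pick the friend whose identity is socially closest to the target.

    friends: dict mapping a friend's name to their identity vector.
    Only local information is used: the friends' and the target's identities.
    """
    return min(friends, key=lambda name: social_distance(friends[name], target, b))


# Hypothetical holder with three friends, two hierarchies of 8 groups each.
friends = {"ana": (0, 5), "bo": (3, 6), "cy": (4, 1)}
target = (7, 7)
print(next_holder(friends, target))  # bo
```

The holder never needs a global view of the network: "bo" is chosen because one of bo's group memberships sits next to the target's group in the second hierarchy.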
Algorithm Explanation • Principal objective – determine the conditions under which the average length L of a message chain is small • Define q as the probability of an arbitrary message chain reaching its target • Searchable network – any network for which q ≥ r for a desired r
Searchability • Searchable networks occupy a broad region of the parameter space <α, H>, which corresponds to sociologically plausible values • Searchability is a generic property of social networks
Algorithm Explanation • In terms of chain length L, q = (1 - p)^L ≥ r, where L = length of the message chain and p = probability that the message fails at each step • From the above, the maximum L is given by the approximate inequality L ≤ ln r / ln (1 - p)
Our model and Milgram’s findings • All searchable networks have α > 0, H > 1 • Individuals are essentially homophilous but judge similarity along more than one social dimension • Best performance is achieved for H = 2 or 3 • This is consistent with individuals in small-world experiments using 2 or 3 dimensions when forwarding a message
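The bound follows from taking logarithms of (1 - p)^L ≥ r and dividing by ln(1 - p), which is negative and so flips the inequality. A short numerical sketch, with illustrative function names and an illustrative threshold r = 0.05 chosen here as an assumption:

```python
import math


def completion_probability(length, p):
    """q = (1 - p)^L: chance a chain of the given length survives every step."""
    return (1.0 - p) ** length


def max_chain_length(r, p):
    """Largest L with (1 - p)^L >= r, i.e. L <= ln r / ln(1 - p)."""
    if not (0.0 < r < 1.0 and 0.0 < p < 1.0):
        raise ValueError("r and p must lie strictly between 0 and 1")
    return math.log(r) / math.log(1.0 - p)


p = 0.25  # per-step failure probability
r = 0.05  # illustrative completion threshold (an assumption of this sketch)
bound = max_chain_length(r, p)
print(round(bound, 2))  # 10.41
print(completion_probability(10, p) >= r, completion_probability(11, p) >= r)
```

With these values a chain of length 10 still meets the threshold while a chain of length 11 does not, matching the bound of about 10.4.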
Searchable Networks • Solid boundary – N = 102,400 • Dot-dash – N = 204,800 • Dash – N = 409,600 • p = 0.25, b = 2, g = 100; searchability requires q to be at least r = 0.05
Our model and Milgram’s findings • Increasing the number of independent dimensions from H = 1 yields a dramatic reduction in delivery time for α > 0 • This improvement is lost as H is increased further, because network ties become less correlated as H increases • For large H, the network becomes a random graph and the search algorithm becomes a random walk
Searchable Networks • Probability of message completion q for α = 0 (squares) and for α = 2 (circles), with N = 102,400 • Horizontal line – position of the threshold r • Open symbols indicate that the network is searchable (q ≥ r)
Our model and Milgram’s data • n(L) – number of completed chains of length L • Bars – data from the original small-world experiment • Filled circles – an example of our model with N = 10^8 individuals and 42 completed chains
Our model and Milgram’s findings • Comparison of the distribution of chain lengths in our model with that of Travers and Milgram • Average chain length for Milgram’s experiment = 6.5 • Average chain length for our model = 6.7
Summary • Simple greedy algorithm. • Represents properties present in real social networks: • Considers local clustering. • Reflects the notion of locality. • High-level structure + random links.
Further Extensions • Should we consider other parameters such as friend’s popularity information in addition to homophily? • Allow variation in node degrees? • Assume correlation between hierarchies? • Are all hierarchies equally important?
Applications • Broad class of decentralized problems • Peer-to-peer networking • Any data structure in which data elements can be judged along more than one dimension • Database design • E.g. music files – same genre / same year