Analysis of Social Media MLD 10-802, LTI 11-772. William Cohen 1-25-010. Recap: What are we trying to do? Like the normal curve: Fit real-world data. Find an underlying process that “explains” the data. Enable mathematical understanding (closed-form?)
Analysis of Social Media • MLD 10-802, LTI 11-772 • William Cohen • 1-25-010
Recap: What are we trying to do? • Like the normal curve: • Fit real-world data • Find an underlying process that “explains” the data • Enable mathematical understanding (closed-form?) • Model some small but interesting part of the data
Graphs • Some common properties of graphs: • Distribution of node degrees • Distribution of cliques (e.g., triangles) • Distribution of paths • Diameter (max shortest-path) • Effective diameter (90th percentile) • Connected components • … • Some types of graphs to consider: • Real graphs (social & otherwise) • Generated graphs: • Erdos-Renyi “Bernoulli” or “Poisson” • Watts-Strogatz “small world” graphs • Barabasi-Albert “preferential attachment” • …
Graphs • Some types of graphs to consider: • Real graphs (social & otherwise) • Generated graphs: • Erdos-Renyi “Bernoulli” or “Poisson”: each pair of nodes is connected independently with probability p • Watts-Strogatz “small world” graphs • Barabasi-Albert “preferential attachment” • …
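The Erdos-Renyi rule above (every pair connected with probability p) can be sketched in a few lines. This is a minimal illustration, not code from the lecture: the function names `erdos_renyi` and `mean_degree`, the adjacency-dict representation, and the `seed` parameter are all assumptions made for the example.

```python
import itertools
import random


def erdos_renyi(n, p, seed=None):
    """Return an undirected G(n, p) graph as a dict: node -> set of neighbors."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Visit every unordered pair once and flip a biased coin for it.
    for u, v in itertools.combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def mean_degree(adj):
    """Average number of neighbors per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


g = erdos_renyi(n=200, p=0.05, seed=0)
# The mean degree should land near p * (n - 1) = 9.95, up to sampling noise.
print(mean_degree(g))
```

Looping over all pairs costs O(n^2) time, which is fine for a classroom-sized graph but would need a smarter sampler for very large n.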
Graphs • Some types of graphs to consider: • Real graphs (social & otherwise) • Generated graphs: • Erdos-Renyi “Bernoulli” or “Poisson” • Watts-Strogatz “small world” graphs: a regular, high-homophily lattice plus random “shortcut” links • Barabasi-Albert “preferential attachment” • …
Graphs • Some types of graphs to consider: • Real graphs (social & otherwise) • Generated graphs: • Erdos-Renyi “Bernoulli” or “Poisson” • Watts-Strogatz “small world” graphs • Barabasi-Albert “preferential attachment”: each new node gets m neighbors, and high-degree nodes are preferred (“rich get richer”) • …
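The preferential-attachment rule (new nodes get m neighbors, high-degree nodes are preferred) can be sketched as below. This is a simplified illustration under stated assumptions: the seed graph is a clique on m + 1 nodes, targets are drawn by picking a random edge endpoint (so a node is chosen in proportion to its degree), and the name `preferential_attachment` is hypothetical.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow an undirected graph to n nodes; each new node links to m distinct targets."""
    if m < 1 or n <= m:
        raise ValueError("need m >= 1 and n > m")
    rng = random.Random(seed)
    # Assumed seed graph: a clique on m + 1 nodes, so every node starts with degree m.
    adj = {v: set() for v in range(m + 1)}
    endpoints = []  # each node appears once per incident edge
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            adj[u].add(v)
            adj[v].add(u)
            endpoints += [u, v]
    for new in range(m + 1, n):
        targets = set()
        # Choosing a uniform entry of `endpoints` picks a node with
        # probability proportional to its current degree.
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            endpoints += [new, t]
    return adj


g = preferential_attachment(n=500, m=2, seed=0)
degrees = sorted((len(nbrs) for nbrs in g.values()), reverse=True)
print(degrees[:5], degrees[-5:])  # a few hubs versus many low-degree nodes
```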
Graphs • Some common properties of graphs: • Distribution of node degrees • Distribution of cliques (e.g., triangles) • Distribution of paths • Diameter (max shortest-path) • Effective diameter (90th percentile) • Connected components • …
Graphs • Some common properties of graphs: • Distribution of node degrees • Distribution of cliques (e.g., triangles) • Distribution of paths • Diameter (max shortest-path) • Effective diameter (90th percentile) • Connected components • … • Triangles: in a big Erdos-Renyi graph the chance that two neighbors of a node are themselves linked is very small (about 1/n) • In social graphs, not so much • More later…
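One common way to quantify how many triangles a graph has is the clustering coefficient: for each node, the fraction of pairs of its neighbors that are linked to each other. The sketch below is an illustration only; the function names and the adjacency-dict representation (node -> set of neighbors) are assumptions, and nodes with fewer than two neighbors are scored 0 by convention.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves linked."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here: no pairs, so score 0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# A triangle (0, 1, 2) with one extra node 3 hanging off node 0.
example = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(average_clustering(example))
```

Running this on a generated Erdos-Renyi graph and on a real social graph of similar size and density is one way to see the gap the slide describes.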
Graphs • Some common properties of graphs: • Distribution of node degrees • Distribution of cliques (e.g., triangles) • Distribution of paths • Diameter (max shortest-path) • Effective diameter (90th percentile) • Mean diameter • Connected components • … • Mean shortest-path length: in a big Erdos-Renyi graph this is small (about log n / log z, where z is the mean degree) • In social graphs, it is also small (“6 degrees”)
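The path-based properties in this list (diameter, effective diameter, mean shortest-path length) can all be read off the multiset of shortest-path lengths. The sketch below computes them by breadth-first search from every node. It is illustrative only: the function names are assumptions, the graph is an adjacency dict (node -> set of neighbors), unreachable pairs are simply ignored, and "effective diameter" is taken to be the smallest distance that covers at least 90% of connected pairs.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop counts from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def path_length_stats(adj, percentile=0.9):
    """Diameter, effective diameter, and mean over all connected ordered pairs."""
    lengths = sorted(
        d for s in adj for d in bfs_distances(adj, s).values() if d > 0
    )
    if not lengths:
        raise ValueError("graph has no connected pairs")
    idx = max(0, math.ceil(percentile * len(lengths)) - 1)
    return {
        "diameter": lengths[-1],
        "effective_diameter": lengths[idx],
        "mean": sum(lengths) / len(lengths),
    }


# A ring of six nodes, used only as a tiny worked example.
ring = {v: {(v - 1) % 6, (v + 1) % 6} for v in range(6)}
print(path_length_stats(ring))
```

All-pairs BFS takes O(n * (n + edges)) time, so on large graphs these statistics are usually estimated from a sample of source nodes.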
Graphs • Some common properties of graphs: • Distribution of node degrees • Distribution of cliques (e.g., triangles) • Distribution of paths • Diameter (max shortest-path) • Effective diameter (90th percentile) • Mean diameter • Connected components • … • In a big Erdos-Renyi graph there is one “giant connected component”… • … because two giant connected components cannot co-exist for long.
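To check for a giant connected component in a particular graph, it is enough to list the component sizes and compare the largest with the rest. A minimal sketch, assuming the graph is an adjacency dict (node -> set of neighbors); the name `component_sizes` is hypothetical.

```python
from collections import deque


def component_sizes(adj):
    """Return the sizes of all connected components, largest first."""
    seen = set()
    sizes = []
    for start in adj:
        if start in seen:
            continue
        # Flood-fill one component with breadth-first search.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)


# Three components: a path of 3 nodes, a single edge, and an isolated node.
example = {0: {1}, 1: {0, 2}, 2: {1}, 3: {4}, 4: {3}, 5: set()}
print(component_sizes(example))
```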
More terms • Centrality and betweenness: how does your position in a network affect what you do and how you do it? • And how can we define these precisely? • High centrality: ringleaders? • High betweenness: go-between, conduit between different groups? • “Structural hole” • Group cohesiveness: number of edges within a (sub)group
More terms • Association network: bipartite network where nodes are people or organizations
Triads and clustering coefficients • In a random Erdos-Renyi graph: two of your friends are linked with probability p, the same as any other pair of nodes • In natural graphs two of your friends might well be friends with each other: • Like you, they are both in the same class (club, field of CS, …) • You introduced them
Watts-Strogatz model • Start with a ring • Connect each node to its k nearest neighbors (this gives homophily) • Add some random shortcuts from one point to another (this gives small diameter) • Degree distribution not scale-free • Generalizes to d dimensions
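The construction on this slide (ring, k nearest neighbors, then random shortcuts) can be sketched as follows. This follows the slide's "add shortcuts" wording rather than rewiring existing edges. The function name `small_world`, the explicit `shortcuts` count, and the requirement that k be even (k/2 neighbors on each side) are assumptions made for the example.

```python
import random


def small_world(n, k, shortcuts, seed=None):
    """Ring lattice with k nearest neighbors per node, plus random shortcut edges."""
    if k % 2 or k < 2 or k >= n:
        raise ValueError("k must be even with 2 <= k < n")
    lattice_edges = n * k // 2
    if shortcuts > n * (n - 1) // 2 - lattice_edges:
        raise ValueError("not enough free node pairs for that many shortcuts")
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Regular ring lattice: link each node to k/2 neighbors on each side.
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    # Shortcuts: extra edges between uniformly chosen, not-yet-linked pairs.
    added = 0
    while added < shortcuts:
        u, v = rng.sample(range(n), 2)
        if v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            added += 1
    return adj


g = small_world(n=100, k=4, shortcuts=10, seed=0)
print(sum(len(nbrs) for nbrs in g.values()) // 2)  # total number of edges
```

Because every node keeps its k lattice neighbors and only a few shortcuts are added, the degrees stay close to k, which matches the slide's point that the degree distribution is not scale-free.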
Even more terms • Homophily: tendency for connected nodes to have similar properties • Social contagion: connected nodes become similar over time • Associative sorting: similar nodes tend to connect • Disassociative sorting: dissimilar nodes tend to connect • Association network: bipartite network where nodes are people or organizations
A big question • Homophily: similar nodes ~= connected nodes • Which is cause and which is effect? • Do birds of a feather flock together? • Do you change your behavior based on the behavior of your peers? • Do both happen in different graphs? Can there be a combination of associative sorting and social contagion in the same graph?
A big question about homophily • Which is cause and which is effect? • Do birds of a feather flock together? • Do you change your behavior based on the behavior of your peers? • How can you tell? • Look at when links are added and see what patterns emerge (triadic closure): Pr(new link between u and v | #common friends)
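Triadic closure • Pr(new link between u and v | #common friends) • Two baseline curves for k common friends, if each common friend independently gives u and v probability p of forming a link: • T(k) = 1 - (1-p)^k • T(k) = 1 - (1-p)^(k-1)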
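The two T(k) curves are easy to tabulate. The sketch below reads them as "at least one of k (or k - 1) independent chances succeeds, each with probability p"; that reading, the function name `closure_baseline`, the `offset` parameter, and the value p = 0.1 are assumptions for illustration, not values from the lecture.

```python
def closure_baseline(k, p, offset=0):
    """Probability that at least one of (k - offset) independent chances succeeds.

    offset=0 gives T(k) = 1 - (1-p)^k; offset=1 gives T(k) = 1 - (1-p)^(k-1).
    """
    trials = max(k - offset, 0)
    return 1.0 - (1.0 - p) ** trials


# Tabulate both curves for a hypothetical p.
for k in range(6):
    first = closure_baseline(k, 0.1)
    second = closure_baseline(k, 0.1, offset=1)
    print(k, round(first, 4), round(second, 4))
```

Comparing an empirical estimate of Pr(new link | k common friends) against these curves shows whether common friends add up roughly independently or reinforce one another.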
Final example: spatial segregation • How picky do people have to be about their neighbors for homophily to arise? • Imagine a grid world where • Agents are red or blue • Agents move to a random location if they are unhappy • Agents are happy unless <k neighbors are the same color they are (k= • i.e., they prefer not to be in a small minority • What’s the result over time? • http://cs.gmu.edu/~eclab/projects/mason/projects/schelling/
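The grid world described above can be sketched in a few dozen lines. This is a rough illustration, not the MASON model linked on the slide. The threshold k is left as a parameter because its value is not given here; the grid size, the share of empty cells, the 8-cell neighborhood, and all function names are assumptions made for the example.

```python
import random

EMPTY, RED, BLUE = ".", "R", "B"


def make_grid(size, empty_frac, rng):
    """Random size x size grid of red agents, blue agents, and empty cells."""
    cells = [
        EMPTY if rng.random() < empty_frac else rng.choice([RED, BLUE])
        for _ in range(size * size)
    ]
    return [cells[i * size:(i + 1) * size] for i in range(size)]


def same_color_neighbors(grid, r, c):
    """Count the (up to 8) adjacent cells holding the same color as cell (r, c)."""
    size = len(grid)
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < size and 0 <= cc < size and grid[rr][cc] == grid[r][c]:
                count += 1
    return count


def step(grid, k, rng):
    """Move each agent that was unhappy at the start of the step; return how many moved."""
    size = len(grid)
    cells = [(r, c) for r in range(size) for c in range(size)]
    unhappy = [
        (r, c) for r, c in cells
        if grid[r][c] != EMPTY and same_color_neighbors(grid, r, c) < k
    ]
    empties = [(r, c) for r, c in cells if grid[r][c] == EMPTY]
    rng.shuffle(unhappy)
    moved = 0
    for r, c in unhappy:
        if not empties:
            break
        i = rng.randrange(len(empties))
        er, ec = empties[i]
        grid[er][ec] = grid[r][c]  # agent moves to a random empty cell
        grid[r][c] = EMPTY
        empties[i] = (r, c)        # its old cell becomes available
        moved += 1
    return moved


rng = random.Random(0)
grid = make_grid(size=20, empty_frac=0.2, rng=rng)
for _ in range(30):
    if step(grid, k=3, rng=rng) == 0:  # k=3 is an arbitrary illustrative threshold
        break
print("\n".join("".join(row) for row in grid))
```

Printing the grid after a few dozen steps for different values of k is a quick way to explore the slide's question of how picky agents must be before same-color clusters appear.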