
Section 8 – Ec1818 Jeremy Barofsky jbarofsk@hsph.harvard.edu March 31st and April 1st, 2010

Section 8 Outline (lectures 15, 16) • Social Network Introduction • Types of Networks / Graphs • Random • Regular • Small-world • Erdos / Bacon Numbers • Review Questions? • Evaluations • Office Hours - Thursday, 4/1/10 10-11am, outside 320 CGIS North.

Small World Phenomenon - Milgram • Question: What is the probability that two randomly selected people know each other? • Design: In 1967, Milgram sent packages to 160 random people living in Omaha, NE, asking each to forward the package to a friend or acquaintance who they thought might know, or be connected to, the final target: a stockbroker in Boston. (Postcards were also sent back to Harvard to track progress.) • Results: 1) Of the letters that reached their destination, the average path length was 5.5-6. 2) Significant selection bias: in one experiment, 232 of 296 letters were not sent on. 3) Most of the cards reached the target through a few people. In the experiment with 160 packages, 24 reached the target at his home, and 16 of these were handed to him by one person (a single node in the network). • Reasons for under- or over-estimating the average path length?

Social networks • A graph G consists of a set V(G) of vertices (or nodes) together with a set E(G) of edges (or links) that connect vertices. • Degree: the number of edges connected to a given vertex. • Order: the number of vertices in V(G). • Size: the number of edges in E(G). • Directed graph / undirected graph: a graph is directed if all its edges are directional, i.e., the network tells us not just whether two people are friends but whether each person considers the other a friend. If none of the edges are directional, then G is undirected.
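The definitions above can be made concrete with a small sketch. The friendship data and the helper names `build_undirected` and `build_directed` are hypothetical, chosen only for illustration; a graph is stored as a dictionary mapping each vertex to the set of vertices it links to.

```python
# Hypothetical toy data; the names are illustrative only.
friendships = [("Ann", "Bob"), ("Ann", "Cat"), ("Bob", "Cat"), ("Cat", "Dan")]

def build_undirected(edge_list):
    """Adjacency dict {vertex: set of neighbours}; every edge runs both ways."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def build_directed(edge_list):
    """Adjacency dict where (u, v) only records that u names v as a friend."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())  # make sure v appears as a vertex
    return adj

adj = build_undirected(friendships)
order = len(adj)                                      # number of vertices
size = sum(len(nbrs) for nbrs in adj.values()) // 2   # each edge is counted twice
degree = {v: len(nbrs) for v, nbrs in adj.items()}

# Directed version: Dan names Cat as a friend, but Cat does not name Dan.
nominations = [("Ann", "Bob"), ("Bob", "Ann"), ("Dan", "Cat")]
directed = build_directed(nominations)
```

In the undirected graph, halving the sum of degrees gives the size because each edge touches two vertices.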

Networks / Graphs and 3 elements

Social Network Metrics • Characteristic path length L(G, p): measures the average distance between pairs of vertices, where the distance between v and v’ is the length of the shortest path connecting them. • Clustering coefficient C(G, p): measures a vertex’s / person’s level of cliquishness within its neighborhood. Answers: are the friends of my friends also my friends? • Formally, for a single vertex, C = actual edges among its neighbors / maximum possible edges among those neighbors; C(G, p) averages this over all vertices. • Maximum number of edges (connections) in a network: n(n-1)/2, where n = number of vertices.
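A minimal sketch of both metrics, assuming an undirected graph stored as an adjacency dictionary. The function names and the four-person example are hypothetical. Two conventions here are assumptions rather than anything stated on the slide: a vertex with fewer than two neighbours gets a clustering value of 0, and path lengths are averaged only over pairs that are actually connected.

```python
from collections import deque
from itertools import combinations

def shortest_path_lengths(adj, source):
    """Breadth-first search: edges on the shortest path from source to each reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def characteristic_path_length(adj):
    """Mean shortest-path distance over all connected pairs of distinct vertices."""
    total, pairs = 0, 0
    for v in adj:
        for w, d in shortest_path_lengths(adj, v).items():
            if w != v:
                total += d
                pairs += 1
    return total / pairs

def local_clustering(adj, v):
    """Actual edges among v's neighbours divided by the maximum possible k(k-1)/2."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # assumed convention for vertices with fewer than two neighbours
    actual = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return actual / (k * (k - 1) / 2)

def clustering_coefficient(adj):
    """Average of the local clustering values over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# Hypothetical network: a triangle Ann-Bob-Cat plus Dan attached to Cat.
adj = {
    "Ann": {"Bob", "Cat"},
    "Bob": {"Ann", "Cat"},
    "Cat": {"Ann", "Bob", "Dan"},
    "Dan": {"Cat"},
}
L = characteristic_path_length(adj)
C = clustering_coefficient(adj)
```

In this example the six pairwise distances sum to 8, so L = 8/6, and the local clustering values are 1, 1, 1/3 and 0, so C = 7/12.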

Types of Graphs • Regular Network: each vertex is connected only to its k nearest neighbors, so all vertices have the same degree. Long characteristic path length, because it takes many steps to get from one vertex to a distant one; large clustering coefficient, because nearby vertices are connected to one another. • Random Network: each possible edge between vertices is present independently with some probability. Connectedness emerges non-linearly: a giant connected component appears abruptly once Pr(connection) exceeds roughly 1/n, where n is the number of vertices in V(G). Small characteristic path length and small clustering coefficient. • Adjacency matrix: a way to represent network data in which each row and column corresponds to a vertex and entry (i, j) indicates whether vertices i and j are connected.
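As an illustration, both graph types can be written down as adjacency matrices. This is a sketch under stated assumptions: the function names `ring_lattice` and `random_graph` are hypothetical, the regular network is taken to be a ring in which "nearest" means closest around the circle, and k is assumed even and smaller than n.

```python
import random

def ring_lattice(n, k):
    """Adjacency matrix of a ring where each vertex links to its k nearest neighbours.

    Assumes k is even and k < n, so each vertex gets k/2 neighbours on each side.
    """
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            A[i][j] = A[j][i] = 1  # undirected: the matrix is symmetric
    return A

def random_graph(n, p, seed=None):
    """Adjacency matrix where each of the n(n-1)/2 possible edges appears independently with probability p."""
    rng = random.Random(seed)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                A[i][j] = A[j][i] = 1
    return A

regular = ring_lattice(8, 4)
regular_degrees = [sum(row) for row in regular]   # row sums give vertex degrees

rand = random_graph(8, 0.5, seed=1)
rand_edges = sum(sum(row) for row in rand) // 2   # each edge appears twice in the matrix
```

Row sums of the regular matrix are all equal to k, which is the "same degree" property; in the random matrix they vary from vertex to vertex.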

From Regular to Random Graphs via Small Worlds

Regular -> Small World -> Random Graphs • Rewire: Start with a regular graph with vertices arranged in a circle, each connected to its 4 closest neighbors. Rewire each edge at random with probability p. • Changing p tunes the graph: p = 0 gives a regular graph, p = 1 a random graph. • Watts and Strogatz define small-world networks by two characteristics: • Large clustering coefficient C(G, p): most of my friends are also friends with each other. • Small characteristic path length L(G, p): the presence of random, long-distance connections means that moving from one part of the graph to another can be done quickly.
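The rewiring procedure can be sketched as follows. This is an illustrative reading of the description above, not a reference implementation: the function name `rewire_ring` is hypothetical, only the far end of each edge is moved, and self-loops and duplicate edges are avoided by assumption. k is assumed even and smaller than n.

```python
import random

def rewire_ring(n, k, p, seed=None):
    """Ring of n vertices, each joined to its k nearest neighbours, with each edge rewired with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Step 1: build the regular ring lattice.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Step 2: visit each original lattice edge (i, i + step) once.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if rng.random() < p:
                # New endpoint must not be i itself or an existing neighbour of i.
                candidates = [w for w in range(n) if w != i and w not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)
    return adj

regular = rewire_ring(20, 4, 0.0, seed=0)       # p = 0: nothing is rewired
small_world = rewire_ring(20, 4, 0.1, seed=0)   # a few long-distance shortcuts
randomised = rewire_ring(20, 4, 1.0, seed=0)    # p = 1: every edge is rewired
```

Rewiring moves edges rather than adding or deleting them, so every graph produced has the same number of edges, n·k/2; only where those edges point changes with p.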

Characteristic path length L(p) and clustering coefficient C(p) for rewired graphs as p varies. (Watts and Strogatz, 1998)

Empirical Examples (Watts and Strogatz, 1998)

Power Laws Again? Are you serious? (Random means normal distribution and small world means power law)

Erdos and Bacon Numbers • Small-world networks exhibit strong connections between neighbors (cliques), but information can still travel quickly because of random connections to other highly connected groups of vertices. • Erdos / Bacon numbers: degree of separation from Paul Erdos through co-authorship of peer-reviewed journal articles, or from Kevin Bacon through shared movie credits. A Bacon number of 1 means the individual acted in the same movie as Kevin Bacon. Nearly all actors are connected in this way, so the network exhibits the characteristics of a small-world network.

Is Bacon Best? • “By processing all of the 1.6 million people in the Internet Movie Database I discovered that there are currently 506 people who are better centers than Kevin Bacon!” –Oracle of Bacon website. • Compute average Bacon number and compare to others.
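One way to carry out the comparison is sketched below, under stated assumptions. The cast lists are made up, the function names are hypothetical, and the average is taken only over actors reachable from the chosen center. Two actors are linked if they appear in the same movie, and an actor's number relative to a center is the breadth-first-search distance in that co-star graph.

```python
from collections import deque

# Hypothetical cast lists; titles and actor labels are made up for illustration.
casts = {
    "Movie 1": ["A", "B"],
    "Movie 2": ["B", "C"],
    "Movie 3": ["C", "D"],
}

def costar_graph(casts):
    """Link every pair of actors who share at least one movie."""
    adj = {}
    for cast in casts.values():
        for actor in cast:
            adj.setdefault(actor, set()).update(a for a in cast if a != actor)
    return adj

def numbers_from(adj, center):
    """Bacon-style number of every reachable actor: shortest co-star chain to the center."""
    dist = {center: 0}
    queue = deque([center])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def average_number(adj, center):
    """Mean number over all reachable actors other than the center itself."""
    others = [d for actor, d in numbers_from(adj, center).items() if actor != center]
    return sum(others) / len(others)

adj = costar_graph(casts)
averages = {actor: average_number(adj, actor) for actor in adj}
best_center = min(averages, key=averages.get)
```

A lower average means a better center. In this toy chain A-B-C-D, the end actors average 2.0 while the two middle actors average 4/3, so a middle actor beats an end actor in the same sense that the quoted 506 people beat Kevin Bacon.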

Degree distribution of Bacon / Connery Numbers for Actors in IMDb.

Midterm Questions?
