Agenda: Thursday, Feb 3 • Midterm date: Thursday, March 3 • New readings in Watts • Our navigation experiment: some analysis • Brief introduction to graph theory
News and Notes: Tuesday Feb 8 • From the Field: NY Times article 2/8 on hate groups on Orkut • Duncan Watts talk Friday Feb 11 at noon! • No MK office hours tomorrow • Return of NW Construction, Task 1: • first of all, staple your own work • grading: • 2/2: proceed as described • 1/2: some problems, usually of specificity • 0/2: fundamental flaw or lack of clarity • if you received 2/2: leave your assignment here • if you received 1/2: leave your assignment here, or revise and return on Thursday • if you received 0/2: revise and return on Thursday • Next Tuesday’s class: • MK out of town, but mandatory class experiment • once again, print and bring your Lifester neighbor profiles • Today’s agenda: • further analysis of Lifester NW navigation experiment • quick review and completion of Intro to Graph Theory • start on Social Network Theory
Description of the Experiment • Participation is mandatory and for credit • If you don’t have your Lifester neighbor profiles, you cannot participate • unless you have memorized your neighbor info • We will play two rounds • In each round, each of you will be the source of a navigation chain • You will be given a destination user to route a form to • Give the form to one of your Lifester neighbors who you think is “closer” to the target • Write your Lifester UserID on forms you receive, and continue to forward them towards their destinations • Points will be deducted for violations of the neighborhood structure • In one round, you will be given the Lifester profile of the destination • In the other round, you will not be given the destination profile • Then we’ll do some brief analysis with more detail to follow
diameter: worst-case: 5 average: 2.86
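The worst-case and average figures above are shortest-path statistics, and both can be computed by running breadth-first search from every vertex. The sketch below assumes a small hypothetical graph stored as an adjacency dict; it is not the Lifester network, and the function names are illustrative only.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop counts from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def diameter_stats(adj):
    """Worst-case and average distance over all connected pairs of distinct vertices."""
    lengths = []
    for s in adj:
        dist = bfs_distances(adj, s)
        lengths.extend(d for t, d in dist.items() if t != s)
    return max(lengths), sum(lengths) / len(lengths)

# Hypothetical example: path 1-2-3-4 plus the extra edge 2-4.
example = {1: [2], 2: [1, 3, 4], 3: [2, 4], 4: [2, 3]}
worst, average = diameter_stats(example)
print(worst, round(average, 2))
```

Each pair is counted once in each direction, which leaves the maximum and the mean unchanged for an undirected graph.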
With destination profile: • optimal mean = 3.67 • class mean = 5.18 • delta = 1.51 • 2 cycles
Without destination profile: • optimal mean = 3.6 • class mean = 5.48 • delta = 1.86 • 4 cycles
[Figure: degree vs. betweenness, class chains; axes: degree of user, number of chains]
[Figure: degree vs. betweenness, optimal chains; axes: degree of user, number of chains]
A Brief Introduction to Graph Theory Networked Life CSE 112 Spring 2005 Prof. Michael Kearns
Undirected Graphs • Recall our basic definitions: • set of vertices denoted 1,…N; size of graph is N • edge is an (unordered) pair (i,j) • (i,j) is the same as (j,i) • indicates that i and j are directly connected • a graph G consists of the vertices and edges • maximum number of edges: N(N-1)/2 (order N^2) • i and j connected if there is a path of edges between them • all-pairs shortest paths: efficient computation, e.g. by running Dijkstra's algorithm from each vertex • Subgraph of G: • restrict attention to certain vertices and edges between them • Connected components of G: • subgraphs determined by mutual connectivity • connected graph: only one connected component • complete graph: edge between all pairs of vertices
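As a rough illustration of these definitions, the sketch below stores an undirected graph on vertices 1,…,N as adjacency sets and finds its connected components by depth-first search. The function names and the example edge list are hypothetical.

```python
def max_edges(n):
    """Number of unordered pairs of n vertices: N(N-1)/2."""
    return n * (n - 1) // 2

def connected_components(n, edges):
    """Return the connected components of the graph on vertices 1..n as a list of sets."""
    adj = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        # (i, j) is the same edge as (j, i), so record both directions.
        adj[i].add(j)
        adj[j].add(i)
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:
            u = stack.pop()
            component.add(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        components.append(component)
    return components

# Hypothetical example: 5 vertices, two components.
components = connected_components(5, [(1, 2), (2, 3), (4, 5)])
print(components, max_edges(5))
```

A graph is connected exactly when this returns a single component.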
Complexity Theory in One Slide • [Plots: computation time vs. size of graph, for N^2, N^3, and 2^N] • linear functions: tractable • polynomials (N^2, N^3): tractable • exponential (2^N): intractable • 1000^2 = 1 million • 2^1000: not that many atoms! • most known problems: • either low-degree polynomial… • … or exponential
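The gap between polynomial and exponential growth is easy to check with exact integer arithmetic. The comparison value of 10^80 below is an assumption: it is a commonly quoted rough estimate of the number of atoms in the observable universe, not a figure from the slide.

```python
n = 1000

quadratic = n ** 2      # polynomial growth
exponential = 2 ** n    # exponential growth

# Python integers are arbitrary precision, so 2**1000 is computed exactly.
digits = len(str(exponential))

# Assumed rough estimate of the number of atoms in the observable universe.
atoms_estimate = 10 ** 80

print(quadratic)
print(digits)
print(exponential > atoms_estimate)
```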
Cliques and Independent Sets • A clique in a graph G is a set of vertices: • informal: that are all directly connected to each other • formal: whose induced subgraph is complete • all vertices in direct communication, exchange, competition, etc. • the tightest possible “social structure” • an edge is a clique of just 2 vertices • generally interested in large cliques • Independent set: • set of vertices whose induced subgraph is empty (no edges) • vertices entirely isolated from each other without help of others • Maximum clique or independent set: largest in the graph • Maximal clique or independent set: can’t grow any larger
Some Interesting Properties • Computation of cliques and independent sets: • maximal: easy, can just be greedy • maximum: difficult; believed to be intractable (NP-hard) • running time of the best known exact algorithms scales exponentially with graph size • however, approximations are possible • Social design and Ramsey theory: • suppose large cliques or independent sets are viewed as “bad” • e.g. in trade: • large clique: too much collusion possible • large independent set: impoverished subpopulation • would be natural to want to find networks with neither • Ramsey theory: may not be possible! • Any graph with N vertices will have either a clique or an independent set of size on the order of log(N) (at least (1/2) log2(N)) • A nontrivial “accounting identity”; more later
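The "maximal is easy, just be greedy" remark can be sketched as follows: scan the vertices in some order and keep each one that is compatible with everything kept so far. The graph and the scan orders below are hypothetical; the example shows that the result is maximal (it cannot grow) but depends on the order and need not be maximum.

```python
def greedy_maximal_clique(adj, order=None):
    """Keep each vertex that is adjacent to every vertex already kept."""
    clique = []
    for v in (order or sorted(adj)):
        if all(v in adj[u] for u in clique):
            clique.append(v)
    return clique

def greedy_maximal_independent_set(adj, order=None):
    """Keep each vertex that is adjacent to no vertex already kept."""
    chosen = []
    for v in (order or sorted(adj)):
        if all(v not in adj[u] for u in chosen):
            chosen.append(v)
    return chosen

# Hypothetical example: triangle 2-3-4 with a pendant edge 1-2.
adj = {1: {2}, 2: {1, 3, 4}, 3: {2, 4}, 4: {2, 3}}

small = greedy_maximal_clique(adj)                # scan order 1, 2, 3, 4
large = greedy_maximal_clique(adj, [2, 3, 4, 1])  # a luckier scan order
independent = greedy_maximal_independent_set(adj)
print(small, large, independent)
```

Starting from vertex 1 the greedy scan gets stuck at a clique of size 2, while the triangle has size 3; finding the largest one in general is the hard problem.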
Graph Colorings • A coloring of an undirected graph is: • an assignment of a color (label) to each vertex • such that no two vertices connected by an edge have the same color • chromatic number of graph G: fewest colors needed • Example application: • classes and exam slots • chromatic number determines length of exam period • Here’s a coloring demo • Computation of chromatic numbers is hard • (poor) approximations are possible • Interesting fact: the four-color theorem for planar graphs (every planar graph can be colored with at most four colors)
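For the exam-slot application, a simple greedy heuristic gives each class the lowest-numbered slot not already used by a conflicting class. This is a sketch under stated assumptions: the class names and conflicts are hypothetical, and greedy coloring always produces a valid coloring but is not guaranteed to use as few colors as the chromatic number.

```python
def greedy_coloring(adj):
    """Assign each vertex the smallest color not used by an already-colored neighbor."""
    color = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        c = 0
        while c in used:
            c += 1
        color[v] = c
    return color

# Hypothetical conflict graph: an edge means the two classes share a student.
conflicts = {
    "Art": {"CS"},
    "CS": {"Art", "Math", "Physics"},
    "Math": {"CS", "Physics"},
    "Physics": {"CS", "Math"},
}

slots = greedy_coloring(conflicts)
print(slots, len(set(slots.values())))
```

Here CS, Math, and Physics conflict pairwise, so at least three slots are needed, and the greedy result happens to match that.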
Matchings in Graphs • A matching of an undirected graph is: • a subset of the edges • such that no vertex is “touched” more than once • perfect matching: every vertex touched exactly once • perfect matchings may not always exist (e.g. N odd) • maximum matching: largest number of edges • Can be found efficiently; here is a perfect matching demo • Example applications: • pairing of compatible partners • perfect matching: nobody “left out” • jobs and qualified workers • perfect matching: full employment, and all jobs filled • clients and servers • perfect matching: all clients served, and no server idle
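For the jobs-and-workers application, one standard way to find a maximum matching in a bipartite graph is to search for augmenting paths. The slide does not name an algorithm, so this choice is an assumption, and the worker and job names are hypothetical.

```python
def maximum_bipartite_matching(qualified):
    """qualified maps each worker to the list of jobs that worker can do.

    Returns a dict worker -> job describing a maximum matching.
    """
    job_holder = {}  # job -> worker currently assigned to it

    def try_assign(worker, visited):
        for job in qualified[worker]:
            if job in visited:
                continue
            visited.add(job)
            # Take the job if it is free, or if its holder can move elsewhere.
            if job not in job_holder or try_assign(job_holder[job], visited):
                job_holder[job] = worker
                return True
        return False

    for worker in qualified:
        try_assign(worker, set())
    return {worker: job for job, worker in job_holder.items()}

# Hypothetical example: three workers, three jobs.
qualified = {
    "ann": ["cook", "drive"],
    "bob": ["cook"],
    "cat": ["drive", "edit"],
}
matching = maximum_bipartite_matching(qualified)
print(matching)
```

In this example the matching is perfect: ann is moved from "cook" to "drive" so that bob, who can only cook, is not left out.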
Cuts in Graphs • A cut of a (connected) undirected graph is: • a subset of the edges (edge cut) or vertices (vertex cut) • such that the removal of this set would disconnect the graph • min/maximum cut: smallest/largest (minimal) number • computation can be done efficiently • Often related to robustness of the network • small cuts ~ vulnerability • edge cut: failure of links • vertex cut: failure of “individuals” • random versus maliciously chosen failures (terrorism)
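To make the edge-cut definition concrete, the sketch below removes candidate sets of edges and checks whether the graph stays connected. The brute-force search over subsets is for illustration on tiny graphs only and is an assumption of this example; it is not one of the efficient methods the slide refers to. The graph is hypothetical.

```python
from itertools import combinations

def is_connected(vertices, edges):
    """Check connectivity of an undirected graph by depth-first search."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)

def min_edge_cut_bruteforce(vertices, edges):
    """Try edge subsets in order of increasing size; return the first that disconnects."""
    for k in range(1, len(edges) + 1):
        for removed in combinations(edges, k):
            remaining = [e for e in edges if e not in removed]
            if not is_connected(vertices, remaining):
                return list(removed)
    return None

# Hypothetical example: two triangles joined by the single link (3, 4).
vertices = [1, 2, 3, 4, 5, 6]
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5), (5, 6), (4, 6)]
cut = min_edge_cut_bruteforce(vertices, edges)
print(cut)
```

The single link between the two triangles is the vulnerable point: its failure alone disconnects the network.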
Spanning Trees • A spanning tree of a (connected) undirected graph is: • a subgraph G’ of the original graph G • such that G’ contains every vertex of G, is connected, and has no cycles (a tree) • every spanning tree has exactly N-1 edges • minimum spanning tree: smallest total edge weight • computation: can be done efficiently • Minimal subgraphs needed for complete communication • Different spanning trees provide different solutions • Applications: • minimizing wire usage in circuit design
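One efficient way to compute a spanning tree of minimum total weight is Kruskal's algorithm with a union-find structure. The slide does not name an algorithm, so this is an assumed choice, and the edge weights (think wire lengths) are hypothetical.

```python
def kruskal(n, weighted_edges):
    """Return a minimum-weight spanning tree of a connected graph on vertices 1..n.

    weighted_edges is a list of (weight, u, v) tuples.
    """
    parent = list(range(n + 1))

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # adding this edge creates no cycle
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree

# Hypothetical example: 4 vertices, 5 candidate wires with lengths.
wires = [(1, 1, 2), (4, 1, 3), (3, 2, 3), (2, 3, 4), (5, 2, 4)]
tree = kruskal(4, wires)
total = sum(weight for _, _, weight in tree)
print(tree, total)
```

The result uses N-1 = 3 edges, as any spanning tree on 4 vertices must.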
Directed Graphs • Graphs in which the edges have a direction • Edge (u,v) means u → v; may also have (v,u) • Common for capturing asymmetric relations • Common examples: • the web • reporting/subordinate relationships • corporate org charts • code block diagrams • causality diagrams
Weighted Graphs • Each edge/vertex annotated by a weight or capacity • Directed or undirected • Used to model • cost of transmission, latency • capacity of link • hubs and authorities (Google PageRank algorithm) • Common problem: network flow, efficiently solvable
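When edge weights model latency or transmission cost, the cheapest route from a source can be found with Dijkstra's algorithm. The sketch below assumes non-negative weights and a hypothetical directed network stored as lists of (neighbor, weight) pairs.

```python
import heapq

def dijkstra(adj, source):
    """Return the minimum total weight from source to each reachable vertex.

    adj maps each vertex to a list of (neighbor, weight) pairs; weights must be >= 0.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter route to u was already found
        for v, weight in adj[u]:
            candidate = d + weight
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist

# Hypothetical latencies on a small directed network.
latency = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 1)],
    "c": [("b", 2), ("d", 5)],
    "d": [],
}
costs = dijkstra(latency, "a")
print(costs)
```

Note that the cheapest route from a to b goes through c (cost 3) rather than over the direct link (cost 4).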
Planar Graphs • Graphs which can be drawn in the plane with no edges crossing (except at vertices) • Of interest for • maps of the physical world • circuit/VLSI design • data visualization • Graphs of higher genus • Planarity testing efficiently solvable
Bipartite Graphs • Vertices divided into two sets • Edges only between the two sets • Example: affiliation networks • vertices are individuals and organizations • edge if an individual belongs to an organization • Men and women, servers and clients, jobs and workers • Some problems easier to compute on bipartite graphs
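Whether a graph is bipartite can be tested by trying to split its vertices into two sides with breadth-first search: every edge must cross between the sides. The affiliation network below is a hypothetical example.

```python
from collections import deque

def two_sides(adj):
    """Return a dict vertex -> 0 or 1 if the graph is bipartite, otherwise None."""
    side = {}
    for start in adj:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]  # neighbors go on the opposite side
                    queue.append(v)
                elif side[v] == side[u]:
                    return None  # an edge inside one side: not bipartite
    return side

# Hypothetical affiliation network: individuals and organizations.
affiliation = {
    "alice": ["chess", "choir"],
    "bob": ["choir"],
    "chess": ["alice"],
    "choir": ["alice", "bob"],
}
triangle = {1: [2, 3], 2: [1, 3], 3: [1, 2]}

sides = two_sides(affiliation)
print(sides)
print(two_sides(triangle))
```

The affiliation network splits cleanly into people and organizations, while a triangle cannot be split this way.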
We’ll make use of these graph types… but will generally be looking at classes of graphs generated according to a probability distribution, rather than obeying some fixed set of deterministic properties.
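One simple way to generate a graph from a probability distribution is to include each possible edge independently with the same probability p. This particular model is an assumption chosen for illustration; the slide does not say which distributions will be used, and the function name and parameters are hypothetical.

```python
import random
from itertools import combinations

def random_graph(n, p, seed=None):
    """Return the edge list of a random graph on vertices 1..n.

    Each of the N(N-1)/2 possible edges is included independently with probability p.
    """
    rng = random.Random(seed)
    return [
        (i, j)
        for i, j in combinations(range(1, n + 1), 2)
        if rng.random() < p
    ]

empty = random_graph(6, 0.0)
complete = random_graph(6, 1.0)
sample = random_graph(6, 0.5, seed=1)
print(len(empty), len(complete), len(sample))
```

Setting p = 0 gives the empty graph and p = 1 the complete graph; values in between give a different graph on each draw unless a seed is fixed.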