630 likes | 1.13k Views
Theory!. Engage brain ;). Network Analysis: Social Networks. Network Analysis. ‘Network’ is a heavily overloaded term- >‘network analysis’ means different things to different people. Specific forms of network analysis are used in the study of diverse structures such as:
E N D
Theory! Engage brain ;)
Network Analysis • ‘Network’ is a heavily overloaded term- >‘network analysis’ means different things to different people. • Specific forms of network analysis are used in the study of diverse structures such as: • aspects of the Internet, • interlocking directorates, • transportation systems, • epidemic spreading, • metabolic pathways, • the Web graph, • electrical circuits, etc. • There is, however, a broad methodological foundation which is quickly becoming a prerequisite for researchers and practitioners working with network models. - “Network Analysis” -Brandes, Ulrik; Erlebach, Thomas (Eds.)
Hint: how to make this theory easier ... • Understand first – get the idea straight. Then look for the formal definitions and terms. • Draw a mind map during the lectures. • Keep track of all new terminology. Learn it! • Organize your mental framework.
Networks and Graphs • Mathematically, social (and other) networks can be represented as graphs or matrices (different to the charts and graphs you created in Excel) • Representing a problem as a graph: • can provide a different point of view • can make a problem much simpler and provide the appropriate tools for solving the problem http://www.instituteforadvancedstudies.org.uk/Portals/50/ComplexNetworks_Jan/Estrada_tutorial1.ppt
Terminology • There are two components to a graph: • Nodes and edges • Nodes also called points, vertices, actors (only in social networks) • Edges also called ties, links, arcs (digraphs)) • As a broad generalisation, in social networks: • People are nodes and interactions between people are edges
Formal definition • A graph is defined as a set of nodes (a,b,c,d,e) and a set of links ({a,b}, {b,c}, {c,d}, {d,a}, {a,e}) that connect the nodes. • This is sometimes written mathematically as G=(V,E) or G(V,E). The V and the E stand for vertices (nodes) and edges (links/ties) http://www.analytictech.com/mb021/graphtheory.htm
Social Network Analysis • Social network analysis [SNA] is the mapping and measuring of relationships and flows between people, groups, organizations, computers, URLs, and other connected information/knowledge entities. • The nodes in the network are the people and groups while the edges show relationships or flows between the nodes. • SNA provides both a visual and a mathematical analysis of human relationships. Management consultants use this methodology with their business clients and call it Organizational Network Analysis [ONA]. http://www.orgnet.com/sna.html
Social Network Analysis • Extremely fast growing field: • "supplier" to sociology, anthropology, psychology, management, health, etc • "client" of graph theory, algebra, statistics, sociometry, psychometry • Most complex systems are graph-like http://www.analytictech.com/networks/whatis.htm
Social Network Examples:Friendship http://www.instituteforadvancedstudies.org.uk/Portals/50/ComplexNetworks_Jan/Estrada_tutorial1.ppt
Social Network Examples:Scientific Collaboration http://www.instituteforadvancedstudies.org.uk/Portals/50/ComplexNetworks_Jan/Estrada_tutorial1.ppt
Social Network Examples:Business ties in US biotech-industry http://www.instituteforadvancedstudies.org.uk/Portals/50/ComplexNetworks_Jan/Estrada_tutorial1.ppt
Social Network Analysis • Problems: • IRL - lack of sufficient computing resources to handle large datasets. Problematic to bound a social network IRL, E.G. If we are looking at needle-sharing among drug users, we can artificially bound the network at some arbitrary boundary, such as city or neighbourhood, but this distorts the data • Online – SNSs allow us to use groups and communities membership lists to create boundaries • criticised as too theoretical but a proper understanding gives enormous power to control people and events, as well as understanding of history (e.g. rise of Moscow in 12th century Russia in terms of betweenness centrality) http://www.analytictech.com/networks/whatis.htm
Basic Graph Theory • Graph theory gives us the tools to analyse social networks • The length of the lines and the shape of the graph doesn’t usually mean anything because all it is representing is that there is or is not a relationship • These three graphs show the exact same network:
Example A network depicting the sites on the Internet, then known as the Arpanet, inDecember 1970. (Image from F. Heart, A. McKenzie, J. McQuillian, and D. Walden [215]; on-line at http://som.csudh.edu/cis/lpress/history/arpamaps/.)
Examples An alternate drawing of the 13-node Internet graph on the previous slide – looks different but expresses the same thing.
More on graphs • The nodes in a graph represent persons (or animals, organizations, cities, countries, etc) and the lines (edges) represent relationships among them. The edge between persons a and b is represented mathematically like this: (a,b). • The network drawn below contains these edges: (a,b), (a,e), (a,d), (b,c), and (d,c). http://www.analytictech.com/mb021/graphtheory.htm
Mathematically: • V is a finite set, called the vertices of G, • E is a finite set, called the edges of G, and • φ is a function with domain E and codomain P2(V ).
Degree, & the handshaking lemma • The degree of a node is defined as the number of lines incident upon that node. The degree of B is 6 because it has 6 links • The handshaking lemma is the statement that every finite undirected graph has an even number of nodes/vertices with odd degree Name comes from mathematical proof that in a party of people some of whom shake hands, an even number of people must have shaken an odd number of other people's hands (think about it ;) ).
Isolates and pendants • How would you add an isolate to this graph? • In a SNS on the web, when would you typically be an isolate and for how long? • What is a pendant? Locate one on the graph.
Undirected graphs • There are different kinds of graphs, e.g. directed and undirected • undirected graphs: edges have no direction. For example below, there is a relationship between a and b, and this is the same thing as saying there is a relationship between b and a. We could refer to the edge as (a,b) or (b,a) -- it makes no difference.
Directed graphs • In directed graphs (also known as digraphs), the edges do have direction. In such cases, we typically draw the graph with arrowheads, and refer to the lines as "arcs". • Example below records the social relation "who likes whom". Persons b, d and e all say they like person a. Note that person does not say they like d or e, but they do reciprocate with b. Nobody says they like e.
More on graphs • In a directed graph, a point has both indegree and outdegree. The outdegree is the number of arcs from that point to other points. E.G., below the outdegree of node a is 1. • The indegree is the number of arcs coming in to the point from other points. E.G., below the indegree of node a in is 3.
Paths • A path is an alternating sequence of points and lines, beginning at a point and ending at a point, and which does not visit any point more than once
Walks • A Walk is an alternating sequence of nodes and edges, beginning and ending with a vertex • A walk is closed if its first and last vertices are the same, and open if they are different. An open walk is also called a path.
Examples (a) www.airlineroutemaps.com/USA/Northwest Airlines asiapacic.shtml, (b) www.wmata.com/metrorail/systemmap.cfm, (c) www.cs.cornell.edu/ugrad/owchart.htm
Examples • The depictions of airline and subway systems in (a) and (b) are examples of transportation networks, in which nodes are destinations and edges represent direct connections. Much of the terminology surrounding graphs derives from metaphors based on transportation through a network of roads, rail lines, or airline flights
Examples • The prerequisites among college courses in (c) is an example of a dependency network, in which nodes are tasks and directed edges indicate that one task must be performed before another. • The design of complex software systems and industrial processes often requires the analysis of enormous dependency networks, with important consequences for efficient scheduling in these settings.
Examples • The Tank Street Bridge from Brisbane, Australia shown in (d) is an example of a structural network, with joints as nodes and physical linkages as edges. • The internal frameworks of mechanical structures like buildings, vehicles, or human bodies are based on such networks
Connectedness and reachability • A graph is connected if there exists a path (of any length) from every node to every other node. The longest possible path between any two points in a connected graph is n-1, where n is the number of nodes in the graph • A node is reachable from another node if there exists a path of any length from one to the other
Examples • A graph with 3 connected components. The graph as a whole is NOT connected. • a connected component of a graph is a subset of the nodes such that: • (i) every node in the subset has a path to every other; and • (ii) the subset is not part of some larger set with the property that every node can reach every other
Examples • A network in which the nodes are students in a large American high school, and an edge joins two who had a romantic relationship at some point during the 18-month period in which the study was conducted
Geodesic distance • The graph-theoretic or geodesic distance between two points is defined as the length of the shortest path between them. • Thus, a geodesic path is the shortest path between 2 points • If something is flowing through a network (such as gossip, or a disease), the time that it takes to get from one point to another is partly a function of the graph-theoretic distance/length of the geodesic path between them.
Pivotal nodes • We say that a node X is pivotal for a pair of distinct nodes Y and Z if X lies on every geodesic (shortest) path between Y and Z (and X is not equal to either Y or Z) • E.G. in graph below, node B is pivotal for two pairs: the pair consisting of A and C, and the pair consisting of A and D. • B is not pivotal for the pair consisting of D and E – why? • Is node D is not pivotal for any pairs?
Exercises • Give an example of a graph in which every node is pivotal for at least one pair of nodes. Explain your answer. • Give an example of a graph in which every node is pivotal for at least two different pairs of nodes. Explain your answer. • Give an example of a graph having at least four nodes in which there is a single node X that is pivotal for every pair of nodes (not counting pairs that include X). Explain your answer.
Centrality • The centrality of a node in a network is a measure of the structural importance of the node. • A person's centrality in a social network affects the opportunities and constraints that they face. • There are three important aspects of centrality: • degree/activity, • betweenness, and • closeness.
Centrality • Degree/Activity – the number of direct connections a node has, i.e. who knows whom. • the greater a person's degree, the more potential influence they have on the network, and vice-versa (good/bad -> gossip, viruses)
Centrality • Here Diane has the most direct connections, making hers the most active node in the network. She is a 'connector' or 'hub' in this network.
Centrality • Degree/Activity • Often "the more connections, the better”, but not always. • Where do those connections lead, how do they they connect the otherwise unconnected? • Diane has connections only to others in her immediate cluster -- her clique. She connects only those who are already connected to each other.
Centrality • Betweenness • Loosely speaking, betweenness centrality is defined as the number of geodesic paths that pass through a node (remember that a geodesic path is the shortest path between 2 points) • It is the number of "times" that any route needs go through a given node to reach any other node by the shortest path, i.e. for a given node it is the number of shortest paths in the network that pass through that node
Calculating betweenness • Betweenness with respect to specific nodes = #geodesic paths through node/total #geodesic paths Olaf’s Betweenness Score with respect to Eliza and Latisha: • Geodesic paths from Eliza to Latisha: EJOBL and EAOBL • Geodesic paths from Eliza to Latisha through Olaf: EJOBL and EAOBL • Olaf’s betweeness score with respect to Eliza and Latisha is 1.
Calculating betweenness • Betweenness with respect to specific nodes = #geodesic paths through node/total #geodesic paths Anna’s Betweenness Score with respect to Eliza and Latisha: • Geodesic paths from Eliza to Latisha: EJOBL and EAOBL • Geodesic paths from Eliza to Latisha through Anna: EAOBL • Anna’s betweeness score with respect to Eliza and Latisha is 1/2.
Centrality • Betweenness • A node that has high betweenness can control the flow of information, acting as a gatekeeper. (e.g., executive secretaries, power) • benefits to being in the middle: • the information benefit from being plugged into different camps or regions of the network – e.g. hear different versions of the story, more data to be able to predict outcomes • the control benefit of being able to play one person against the other.
Centrality • Betweenness • Diane has many direct ties. Heather has few direct fewer than the average in the network but has one of the best locations in the network -- she is between two important constituencies, plays a 'broker' role in the network. • Also single point of failure -great influence over what flows and does not in the network Location, Location, Location."
Betweenness: this example shades nodes with hues from Red (least) to Blue (most betweenness). [Wikipedia]
Centrality • Closeness • Closeness centrality is defined the inverse of the sum of the shortest distances between each individual and every other person in the network
Centrality • Closeness of Olaf d(O,B) = 1 d(O,T) = 2 d(O,L) = 2 d(O,A) = 1 d(O,J) = 1 d(O,D) = 1 d(O,G) = 2 d(O,E) = 2 d(O,R) = 2
Centrality Sum the reciprocals: – 1/1 + 1/2 + 1/2 + 1/1 + 1/1 + 1/1 + 1/2 + 1/2 + 1/2 – Olaf’s closeness score is 6.5. • Closeness of Olaf d(O,B) = 1 d(O,T) = 2 d(O,L) = 2 d(O,A) = 1 d(O,J) = 1 d(O,D) = 1 d(O,G) = 2 d(O,E) = 2 d(O,R) = 2