This article introduces the concept of cohesion in social networks, discussing the factors that contribute to cohesion and different ways to measure it. Examples from a study in Costa Rica are provided to illustrate the importance of community cohesion in society.
Selected Topics in Data Networking Explore Social Networks: Cohesion
Introduction to Cohesion • Social Network Analysis: • Investigating who is related and who is not. • Why are some people or organizations related, whereas others are not? • People who match on social characteristics will interact more often and people who interact regularly will foster a common attitude or identity.
Introduction to Cohesion • Social interaction • Basis for solidarity, shared norms, identity, and collective behavior • People who interact intensively are likely to consider themselves a social group. • We expect similar people to interact a lot, at least more often than with dissimilar people. • This phenomenon is called homophily: love of the same (the tendency of individuals to associate and bond with similar others) • Does the homophily principle work?
Introduction to Cohesion • Study in the Turrialba region, which is a rural area in Costa Rica (Latin America). • Visual impression of the kin visits network and the family–friendship groupings, which are identified by the colors and numbers within the vertices
Meaning: Cohesion • Cohesion means a social network contains many ties. • Community cohesion refers to the aspect of togetherness and bonding exhibited by members of a community, the “glue” that holds a community together. • More ties between people yield a tighter structure, which is more cohesive. • The density of a network captures this idea. Source: https://digestiblepolitics.wordpress.com/2013/01/04/the-importance-of-community-cohesion-in-society-in-2013/
Meaning: Cohesion • Review • Multiple lines between vertices and higher line values indicate more cohesive ties. • Density = the number of edges divided by the number possible. • If self-loops are excluded, then the number possible is n(n-1)/2. • If self-loops are allowed, then the number possible is n(n+1)/2.
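The density definition above can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names `max_lines` and `density` are hypothetical, and the directed counts (n(n-1) without loops, n*n with loops) are added here for completeness because the slide states only the undirected case.

```python
def max_lines(n, directed=False, loops=False):
    """Number of possible lines among n vertices (helper name is hypothetical)."""
    if directed:
        # Directed case is an addition: every ordered pair can carry an arc.
        return n * n if loops else n * (n - 1)
    # Undirected case, as stated on the slide.
    return n * (n + 1) // 2 if loops else n * (n - 1) // 2


def density(n, num_lines, directed=False, loops=False):
    """Density = number of lines present divided by the number possible."""
    possible = max_lines(n, directed, loops)
    return num_lines / possible if possible else 0.0


print(max_lines(5))               # 10 possible edges without self-loops
print(max_lines(5, loops=True))   # 15 possible edges with self-loops
print(density(5, 4))              # 4 of 10 possible edges: 0.4
```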
Cohesion: Indicated by Density • In the kin visiting relation network, density is 0.045, which means that only 4.5 percent of all possible arcs are present. • Density is inversely related to network size: • The larger the social network, the lower the density because the number of possible lines increases rapidly with the number of vertices, whereas the number of ties which each person can maintain is limited. • Discussion: Why? • Network density is therefore not very useful for comparing the cohesion of networks of different sizes.
Cohesion: Indicated by Degree • The number of ties in which each vertex is involved. • Degree of a vertex. • Vertices with high degree are more likely to be found in dense sections of the network. • Review (Undirected Graph) Cohesion: Comparing between Density and Degree
Cohesion: Indicated by Degree • A higher degree of vertices yields a denser network • because vertices entertain more ties. • Average degree of all vertices: Measuring the structural cohesion of a network. • This is a better measure of overall cohesion than density • It does not depend on network size, so average degree can be compared between networks of different sizes.
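To illustrate why average degree is easier to compare across networks than density, here is a small sketch. It assumes a made-up family of networks (rings, in which every vertex has exactly two neighbors); the helper names `ring` and `average_degree` are hypothetical and not taken from the slides.

```python
def ring(n):
    """Edges of an undirected ring on n vertices (illustrative network only)."""
    return [(i, (i + 1) % n) for i in range(n)]


def average_degree(edges, n):
    """Mean degree of a simple undirected network: each edge adds 2 to the degree sum."""
    return 2 * len(edges) / n


for n in (10, 100):
    edges = ring(n)
    dens = len(edges) / (n * (n - 1) / 2)
    print(n, average_degree(edges, n), round(dens, 3))
# 10 2.0 0.222
# 100 2.0 0.02
```

Every person maintains the same number of ties in both rings, yet density drops by a factor of about eleven when the network grows from 10 to 100 vertices, while average degree stays at 2.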
Cohesion: Indicated by Degree • NOTE • Directed Graph: the sum of the indegree and the outdegree of a vertex does not necessarily equal the number of its neighbors
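The note above can be checked on a tiny directed network. The three arcs below are invented for illustration, and `degree_summary` is a hypothetical helper: vertex a and vertex b choose each other, so the pair contributes two arcs to a's degree but only one neighbor.

```python
# Invented example: a and b choose each other, a also chooses c.
arcs = [("a", "b"), ("b", "a"), ("a", "c")]


def degree_summary(arcs, v):
    """Return (indegree, outdegree, number of distinct neighbors) of vertex v."""
    indegree = sum(1 for s, t in arcs if t == v)
    outdegree = sum(1 for s, t in arcs if s == v)
    neighbors = {t for s, t in arcs if s == v} | {s for s, t in arcs if t == v}
    neighbors.discard(v)  # a self-loop does not make a vertex its own neighbor
    return indegree, outdegree, len(neighbors)


print(degree_summary(arcs, "a"))  # (1, 2, 2): indegree + outdegree = 3, but only 2 neighbors
```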
Cohesion: Indicated by Component • Vertices with a degree of one or higher are connected to at least one neighbor, so they are not isolated. • Even so, the network may be cut up into pieces. • Isolated sections of the network may be regarded as cohesive subgroups • because the vertices within a section are connected, whereas vertices in different sections are not. • The connected parts of a network are called components.
Cohesion: Indicated by Component • “Singletons,” who have no connections and are least central • The “giant component,” which is the largest group of nodes tightly connected to the central nodes and to each other • The “middle region,” which represents isolated groups which interact amongst themselves but not with the rest of the network, forming isolated stars. Source: http://boxesandarrows.com/social-networks-and-group-formation/
Cohesion: Indicated by Component • A network is weakly connected if each pair of vertices is connected by a semipath. • In a (weakly) connected network, we can “walk” from each vertex to all other vertices if we neglect the direction of the arcs.
Cohesion: Indicated by Component • In directed networks, there is a second type of connectedness: a network is strongly connected if each pair of vertices is connected by a path. • In a strongly connected network, we can travel from each vertex to any other vertex obeying the direction of the arcs.
Cohesion: Indicated by Component • Strong connectedness is more restricted than weak connectedness: • Each strongly connected network is also weakly connected but a weakly connected network is not necessarily strongly connected.
Cohesion: Indicated by Component • Vertices v1, v3, v4, and v5 constitute a (weak) component • because they are connected by semipaths and there is no other vertex in the network which is also connected to them by a semipath.
Cohesion: Indicated by Component • A (weak) component is a maximal (weakly) connected subnetwork. • A strong component is a maximal strongly connected subnetwork.
Cohesion: Indicated by Component • The example network contains three strong components. • The largest strong component is composed of vertices v3, v4, and v5, which are connected by paths in both directions.
Cohesion: Indicated by Component • There are two strong components consisting of one vertex each, namely vertices v1 and v2. • Vertex v2 is isolated. • There are paths from vertex v1 but no paths to it, so v1 is not strongly connected to any other vertex. • Vertex v1 is asymmetrically linked to the larger strong component.
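Weak and strong components can be computed with simple reachability searches, as in the sketch below. The arc list is an assumption: the figure from the slides is not reproduced here, so the arcs were chosen only to match the description (v1 points into a cycle v3, v4, v5; v2 is isolated). The function names are hypothetical, and the per-vertex search is meant for small examples rather than large networks.

```python
from collections import defaultdict


def reachable(adj, start):
    """All vertices that can be reached from start by following adj."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen


def components(vertices, arcs, strong=False):
    """Weak components ignore arc direction; strong components obey it."""
    fwd, bwd, both = defaultdict(set), defaultdict(set), defaultdict(set)
    for s, t in arcs:
        fwd[s].add(t)
        bwd[t].add(s)
        both[s].add(t)
        both[t].add(s)
    result, assigned = [], set()
    for v in vertices:
        if v in assigned:
            continue
        if strong:
            # u is strongly connected to v if there are paths v -> u and u -> v.
            comp = reachable(fwd, v) & reachable(bwd, v)
        else:
            comp = reachable(both, v)
        result.append(sorted(comp))
        assigned |= comp
    return result


# Assumed arcs, chosen to match the description of the example network.
vertices = ["v1", "v2", "v3", "v4", "v5"]
arcs = [("v1", "v3"), ("v3", "v4"), ("v4", "v5"), ("v5", "v3")]

print(components(vertices, arcs))               # [['v1', 'v3', 'v4', 'v5'], ['v2']]
print(components(vertices, arcs, strong=True))  # [['v1'], ['v2'], ['v3', 'v4', 'v5']]
```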
Cohesion: Indicated by Component • In an undirected network, lines have no direction. • Each semiwalk is also a walk and each semipath is also a path. • Components are isolated from one another: there are no lines between vertices of different components. • This is similar to weak components in directed networks.
Cohesion: Indicated by Component • Components can be split up further into denser parts by considering the number of distinct, that is, noncrossing, paths or semipaths that connect the vertices. • Within a weak component, one semipath between each pair of vertices suffices but there must be at least two different semipaths in a bi-component. • k-connected components: maximal subnetworks in which each pair of vertices is connected by at least k distinct semipaths or paths.
Cohesion: Indicated by Core • The distribution of degree reveals local concentrations of ties around individual vertices • but it does not tell us whether vertices with a high degree are clustered or scattered all over the network. • Degree can also be used to identify clusters of vertices that are tightly connected • because each vertex has a particular minimum degree within the cluster.
Cohesion: Indicated by Core • We pay attention not to the degree of one vertex but to the degree of all vertices within a cluster. • These clusters are called k-cores. • k indicates the minimum degree of each vertex within the core. • Example: a 2-core contains all vertices that are connected by degree two or more to other vertices within the core.
Cohesion: Indicated by Core • k-cores identify relatively dense subnetworks • so they help to find cohesive subgroups.
Cohesion: Indicated by Core • Undirected network: • the degree of a vertex is equal to the number of its neighbors, • k-core contains the vertices that have at least k neighbors within the core
Cohesion: Indicated by Core • All vertices belong to the 1-core, which is drawn in black • One vertex, v5, has only one neighbor, so it is not part of the 2-core • Vertex v6 has a degree of 2, so it does not belong to the 3-core • k-cores are nested: a vertex in a 3-core is also part of a 2-core, but not all members of a 2-core belong to a 3-core.
Cohesion: Indicated by Core • Different cohesive subgroups within a k-core are usually connected by vertices that belong to lower cores. • Vertex v6, which is part of the 2-core, connects the two segments of the 3-core. • Eliminating the vertices belonging to cores below the 3-core, • we obtain a network consisting of two components, which identify the cohesive subgroups within the 3-core.
Cohesion: Indicated by Core • How do k-cores help to detect cohesive subgroups? • Remove the lowest k-cores from the network until the network breaks up into relatively dense components. • Each component is considered to be a cohesive subgroup • because each of its vertices has at least k neighbors within the component.
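The k-core procedure described above can be sketched as follows. The network is invented for illustration and is not the figure from the slides: two fully connected groups of four vertices, a "bridge" vertex with one tie into each group (degree 2), and a "pendant" vertex with a single tie. The function names are hypothetical.

```python
from itertools import combinations


def k_core(adj, k):
    """Vertices of the k-core: repeatedly drop vertices with fewer than k neighbors left."""
    core = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(core):
            if len(adj[v] & core) < k:
                core.discard(v)
                changed = True
    return core


def components(adj, vertices):
    """Connected components of the subnetwork induced by the given vertices."""
    remaining, found = set(vertices), []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            u = stack.pop()
            for w in adj[u] & remaining:
                remaining.discard(w)
                comp.add(w)
                stack.append(w)
        found.append(sorted(comp))
    return sorted(found)


# Invented network: two 4-cliques joined by a degree-2 bridge, plus one pendant vertex.
group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]
edges = list(combinations(group_a, 2)) + list(combinations(group_b, 2))
edges += [("bridge", "a1"), ("bridge", "b1"), ("pendant", "a2")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

print(len(k_core(adj, 1)))               # 10: every vertex is in the 1-core
print(len(k_core(adj, 2)))               # 9: the pendant vertex drops out
print(components(adj, k_core(adj, 3)))   # the two 4-cliques, now separate components
```

Removing the vertices below the 3-core (the pendant and the bridge) splits the network into two components, each of which is a candidate cohesive subgroup.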
Selected Topics in Data Networking Explore Social Networks: Prestige and Ranking
Introduction • Prestige is conceptualized as a particular pattern of social ties. • In directed networks, people who receive many positive choices are considered to be prestigious. • For example, everybody likes to play with the most popular girl or boy in a group, but he or she does not play with all of them. Source: http://pursuitist.com/lady-gaga-rules-twitter/
Introduction • Popularity and Indegree: Prestige • When ties are associated with some positive aspects such as friendship or collaboration, • indegree is often interpreted as a form of popularity, • outdegree is interpreted as gregariousness. • A prestigious art museum receives more attention from art critics than less prestigious ones. Source: http://en.wikipedia.org/wiki/Centrality
Introduction • The simplest measure of structural prestige is called popularity and it is measured by the number of choices a vertex receives: its indegree • In undirected networks, we cannot measure prestige; instead, we use degree as a simple measure of centrality.
Introduction • For a relation that points away from the prestigious actor (e.g., “lend money to”), outdegree rather than indegree reflects prestige. • In such a network, indegree does reflect prestige if we transpose the arcs, that is, if we reverse the direction of arcs. • Several structural properties of a network do not change when the arcs are transposed.
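Popularity as indegree, and the effect of transposing arcs, can be sketched with a handful of invented choices. The names and arcs below are made up for illustration only. After transposition, each vertex's indegree equals its original outdegree.

```python
from collections import Counter

# Invented choices: (chooser, chosen).
choices = [("ann", "eve"), ("bob", "eve"), ("cat", "eve"), ("eve", "bob"), ("dan", "bob")]

indegree = Counter(chosen for _, chosen in choices)
outdegree = Counter(chooser for chooser, _ in choices)
print(indegree.most_common(1))  # [('eve', 3)]: eve receives the most choices

# Transposing the network reverses every arc.
transposed = [(chosen, chooser) for chooser, chosen in choices]
indegree_transposed = Counter(chosen for _, chosen in transposed)
print(indegree_transposed == outdegree)  # True
```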
Correlation • Structural prestige scores: Correlation coefficients range from -1 to 1 • A positive coefficient indicates that a high score on one feature is associated with a high score on the other (e.g., high structural prestige occurs in families with high social status). • A negative coefficient points toward a negative or inverse relation: a high score on one characteristic combines with a low score on the other (e.g., high structural prestige is found predominantly with low social status families).
Correlation • There is no correlation if the absolute value of the coefficient is less than 0.05. • If the absolute value of a coefficient is between 0.05 and 0.25 (that is, from 0.05 to 0.25 or from −0.05 to −0.25), association is weak (positive or negative). • Coefficients from 0.25 to 0.60 (or from −0.25 to −0.60) indicate moderate association (positive or negative). • Coefficients from 0.60 to 1.00 (or from −0.60 to −1.00) are interpreted as strong association (positive or negative). • A coefficient of 1 or −1 is said to display perfect association (positive or negative).
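The rule of thumb above translates into a small lookup function. This is a sketch with a hypothetical function name. The ranges on the slide share their end points, so assigning each boundary value (0.05, 0.25, 0.60) to the stronger category is an assumption made here, not something the slides specify.

```python
def association_strength(r):
    """Label a correlation coefficient using the thresholds from the slide.

    Boundary values go to the stronger category (an assumption).
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = abs(r)
    if size < 0.05:
        return "none"
    if size < 0.25:
        label = "weak"
    elif size < 0.60:
        label = "moderate"
    elif size < 1.0:
        label = "strong"
    else:
        label = "perfect"
    direction = "positive" if r > 0 else "negative"
    return f"{label} {direction}"


print(association_strength(0.03))   # none
print(association_strength(-0.40))  # moderate negative
print(association_strength(0.70))   # strong positive
```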
Domains • Popularity is a very restricted measure of prestige because it takes only direct choices into account. • Counting indirect choices as well gives the input domain of an actor: all vertices that are connected to the actor by a path. • The input domain has been called the influence domain because structurally prestigious people are thought to influence people who regard them as their leaders. • The larger the input domain of a person, the higher his or her structural prestige. • The output domain is more likely to reflect prestige in the case of a relation such as “lend money to”.
Proximity Prestige • Limit the input domain to direct neighbors or to neighbors at maximum distance two on the assumption that nominations by close neighbors are more important than nominations by distant neighbors. • An indirect choice contributes less to prestige if it is mediated by a longer chain of intermediaries.
Proximity Prestige • Proximity Prestige: This index of prestige considers all vertices within the input domain of a vertex but it attaches more importance to a nomination if it is expressed by a closer neighbor. • A nomination by a close neighbor contributes more to the proximity prestige of an actor than a nomination by a distant neighbor, but many “distant nominations” may contribute as much as one “close nomination.”
Proximity Prestige • To allow direct choices to contribute more to the prestige of a vertex than indirect choices, proximity prestige weights each choice by its path distance to the vertex. • A higher distance yields a lower contribution to the proximity prestige of a vertex, but each choice contributes something.
Proximity Prestige • A larger input domain (larger numerator) yields a higher proximity prestige because more vertices are choosing an actor directly or indirectly. • A smaller average distance (smaller denominator) yields a higher proximity prestige score because there are more nominations by close neighbors. • Maximum proximity prestige is achieved if a vertex is directly chosen by all other vertices. • The proportion of vertices in the input domain is 1 and the mean distance from these vertices is 1, so proximity prestige is 1 divided by 1. • Vertices without input domain get minimum proximity prestige by definition, which is zero.
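The numerator and denominator mentioned above can be written as one ratio. The notation is an assumption introduced here, since the slide text does not fix symbols: I_v is the input domain of vertex v, n is the number of vertices in the network, and d(u, v) is the path distance from u to v.

```latex
PP(v) \;=\; \frac{\,|I_v| \,/\, (n - 1)\,}{\,\sum_{u \in I_v} d(u, v) \,/\, |I_v|\,},
\qquad PP(v) = 0 \ \text{if } I_v = \emptyset .
```

The numerator is the proportion of all other vertices that lie in the input domain; the denominator is their mean distance to v. When every other vertex chooses v directly, both equal 1 and PP(v) = 1.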
Proximity Prestige • All vertices at the extremes of the network (v2, v4, v5, v6, and v10) have empty input domains, hence they have a proximity prestige score of zero. • The input domain of vertex v9 contains vertex v10 only, so its input domain size is 1 out of 9 (0.11). • Average distance within the input domain of vertex v9 is one, so the proximity prestige of vertex v9 is 0.11 divided by 1 = 0.11. • Vertex v1 has a maximal input domain (9 out of 9 = 1). • Its average distance is 2.0, so proximity prestige amounts to 1.00 divided by 2.0, which is 0.5. • Avg. dist. = (4+3+2+1+1+1+2+2+2)/9 = 2
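The worked example can be reproduced with a breadth-first search that follows arcs backwards from each vertex. One caveat: the figure for this example is not available in the text, so the arc list below is an assumption, constructed only so that the input domains and distances match the numbers reported above (five vertices with empty input domains, v10 as the only chooser of v9, and distances 4+3+2+1+1+1+2+2+2 to v1). The function name `proximity_prestige` is hypothetical.

```python
from collections import deque


def proximity_prestige(vertices, arcs):
    """Proportion of vertices in the input domain divided by their mean distance."""
    choosers = {v: [] for v in vertices}  # choosers[v]: vertices with an arc to v
    for s, t in arcs:
        choosers[t].append(s)
    n = len(vertices)
    scores = {}
    for v in vertices:
        # Breadth-first search against arc direction gives distances to v.
        dist = {v: 0}
        queue = deque([v])
        while queue:
            u = queue.popleft()
            for p in choosers[u]:
                if p not in dist:
                    dist[p] = dist[u] + 1
                    queue.append(p)
        del dist[v]  # the vertex itself is not part of its input domain
        if not dist:
            scores[v] = 0.0  # empty input domain: minimum prestige by definition
            continue
        proportion = len(dist) / (n - 1)
        mean_distance = sum(dist.values()) / len(dist)
        scores[v] = proportion / mean_distance
    return scores


# Assumed arcs (chooser, chosen), built to match the figures in the worked example.
vertices = [f"v{i}" for i in range(1, 11)]
arcs = [("v2", "v1"), ("v3", "v1"), ("v7", "v1"), ("v4", "v3"), ("v5", "v3"),
        ("v6", "v7"), ("v8", "v7"), ("v9", "v8"), ("v10", "v9")]

scores = proximity_prestige(vertices, arcs)
print(scores["v1"])             # 0.5
print(round(scores["v9"], 2))   # 0.11
print(scores["v10"])            # 0.0
```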
References • Wouter de Nooy, Andrej Mrvar, and Vladimir Batagelj, Exploratory Social Network Analysis with Pajek, Cambridge