OPTIMAL CONNECTIONS: STRENGTH AND DISTANCE IN VALUED GRAPHS Song Yang and David Knoke SOCI 5013: Advanced Social Research, Spring 2004
RESEARCH QUESTION • How can we identify optimal connections, that is, direct or indirect paths between dyads that permit the highest exchange volumes while taking into account the actors’ costs of interaction?
CONNECTIONS IN BINARY GRAPHS • A graph is depicted in two dimensions by a set of nodes representing actors and a set of lines representing the direct ties between pairs (dyads) of actors. • We concentrate on undirected, symmetric graphs that reflect mutual interactions. Marriages between persons and contracts between corporations are two good cases in point. If A is married to B, B must be married to A as well.
BINARY GRAPHS • In binary graphs, the presence of a connection between a pair of nodes is indicated by a constant value of 1. In contrast, the absence of a connection is indicated by a value of 0. • In a graph, a path is a set of distinct nodes and lines that connect a specific pair of nodes. The length of a path is the number of lines in it. The path distance between two nodes is defined as the length of the shortest path between them.
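As an illustration of path distance in a binary graph, the following minimal sketch uses breadth-first search to count the lines on a shortest path. The five-node graph `ADJACENCY` and the function name `path_distance` are hypothetical and do not come from the slides.

```python
from collections import deque


def path_distance(adjacency, source, target):
    """Return the number of lines on a shortest path, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


# Hypothetical undirected binary graph: each listed tie has the value 1.
ADJACENCY = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C", "E"],
    "E": ["B", "D"],
}

print(path_distance(ADJACENCY, "A", "D"))  # 3
```

Breadth-first search visits nodes in order of increasing distance, so the first time the target is reached the count is the path distance.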
BINARY GRAPHS • In binary graphs, path distance is normally used to indicate the optimal connections between a pair of nodes. This solution assumes that intermediaries are costly. If more intermediaries are necessary to connect a pair of actors, they may extract higher commissions for their services, distort the information content exchanged, and increase the time required to complete a transaction.
VALUED GRAPHS • A valued graph is defined as a graph whose lines carry numerical values indicating the intensities of the relationships between all dyads. • These numbers typically represent frequencies or durations of interactions among social actors. • Examples include volumes of communications, levels of friendship and trust, or dollar amounts of economic transactions.
VALUED GRAPHS • For organizations engaging in strategic alliances, a valued graph might indicate the number of distinct partnerships formed between each pair. • In valued graphs, path length alone cannot indicate the optimal connection, because the shortest path is harder to identify once lines carry different values.
VALUED GRAPHS • Previous researchers have proposed two solutions for measuring optimal connections in valued graphs. Peay (1980) states that path value, defined as the smallest value attached to any line in a path, indicates the optimal path between a pair of nodes.
Problems • Peay’s path value solution leaves two problems open: • How should the path value (optimal connection) be determined when multiple paths, and thus multiple path values, exist between two nodes? • How should the transaction costs of exchanges involving many go-betweens be taken into account?
Flament Solution • Flament (1963) uses path length, defined as the sum of the values of the lines included in a path, to represent the optimal connection between a pair of nodes.
The Problems with Flament’s Path Length Solution • There is no standard for whether larger or smaller path lengths should be viewed as optimal for connecting dyads. • If larger values indicate the optimal connection, then a high number can result when either (1) the lines in a path have high values, or (2) a path has many lines with low values that sum to a large total. The solution fails when the second situation occurs.
More Problems • If instead lower values represent the optimal connection, then a low number can result when either (1) the lines in a path have low values, or (2) a path has few lines that add up to a small value. The solution fails when the first situation occurs, because weak lines are then mistaken for an optimal connection.
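The ambiguity in Flament's path length can be seen in a small numeric sketch. The two lists of line values below are hypothetical, chosen only so that the sums coincide; the function names are likewise illustrative.

```python
def flament_length(line_values):
    """Flament's path length: the sum of the line values in a path."""
    return sum(line_values)


def peay_value(line_values):
    """Peay's path value: the smallest line value in a path."""
    return min(line_values)


# Hypothetical paths between the same dyad.
strong_short = [6]                # one line with value 6
weak_long = [1, 1, 1, 1, 1, 1]    # six lines with value 1 each

print(flament_length(strong_short), flament_length(weak_long))  # 6 6
print(peay_value(strong_short), peay_value(weak_long))          # 6 1
```

Both paths receive the same path length of 6, although one is a single strong tie and the other a chain of six weak ties through five intermediaries.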
OUR SOLUTION • Bring binary distance back to the equations. We argue that including binary distance is especially crucial for measuring path strength in a valued graph because it takes into account the costs (in time, energy, or decay of information) required for indirectly connected dyads to reach one another through varying numbers of intermediaries.
OUR SOLUTION • We now formally define two measures of path strength applicable to valued graphs. A valued graph G consists of three sets of information: • A set of nodes N = {n1, n2, … ng} • A set of lines between pairs of nodes L = {l1, l2, … lg} • A set of values attached to the lines V = {v1, v2, … vg}.
OUR SOLUTION • A path between nodes ni and nj consists of a sequence of distinct lines connecting the pair through one or more intermediaries, expressed as: • {li,i+1, li+1,i+2, … lj-2,j-1, lj-1,j}, • The dual subscripts indicate the origin and terminus nodes of each line.
OUR SOLUTION • The minimum value Mij of a path between nodes ni and nj is the smallest value of any line in that path, indicated as • Mij = min (vi,i+1, vi+1,i+2, … vj-2,j-1, vj-1,j). • Notice that Mij is actually Peay’s path value.
OUR SOLUTION • The distance Dij of that path is the total number of lines in it, with each line counted as one, which is indicated as • Dij = (li,i+1 + li+1,i+2 … + lj-2,j-1 + lj-1,j ). • Note that this sum is identical to distance in a corresponding binary graph, obtained by counting the number of lines in a path connecting nodes ni and nj.
APV • A measure of average path value (APV) for a path between nodes ni and nj is then the ratio of its path value to its distance, indicated by • APVij = Mij / Dij.
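For a single path, the ratio of path value to distance can be computed directly from the list of line values along the path. This is a minimal sketch; the function name and the line values 4 and 5 are hypothetical, while the results 3/3 = 1 and 2/1 = 2 match the figures quoted later in the slides.

```python
def average_path_value(line_values):
    """APV of one path: smallest line value divided by the number of lines."""
    if not line_values:
        raise ValueError("a path must contain at least one line")
    minimum_value = min(line_values)   # Mij, Peay's path value
    distance = len(line_values)        # Dij, the binary distance
    return minimum_value / distance


print(average_path_value([3, 4, 5]))  # 1.0
print(average_path_value([2]))        # 2.0
```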
APV • Note that a pair of nodes may be connected by multiple paths and may thus have multiple APVs. We suggest that the highest APV indicates the optimal connection between the pair of nodes because it permits the highest volume of transactions/messages/contracts/treaties/friendships after controlling for the binary distance between the two nodes.
IMPLEMENTATION ISSUES • Unfortunately, available social network software (UCINET) does not compute results according to our solution. Consider the following example,
Why Differ? • Why does UCINET choose a different result to represent the optimal connection? Its algorithm works like this: • Find the highest path value among the multiple paths between a pair of nodes and treat that path as the optimal path. • In our example, UCINET picks 3 for the path BEDC, treating it as the optimal path connecting the dyad BC.
Why Differ? • Then calculate the binary distance of the path it has just picked as optimal between the pair of nodes. • In our example, the distance is 3 for the path BEDC. • Divide the highest path value by that binary distance and report the result as the APV. In our example, it is 3/3=1.
But We Want • Find the path values for all the paths between a dyad. • Calculate the binary distances for all the paths. • Divide each path value by its binary distance, producing multiple APVs for a dyad. • Pick the highest APV to represent the optimal connection between the dyad, which is 2/1=2 in our example.
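The procedure just described can be sketched as follows. This is a reading of the steps on these slides, not UCINET's actual code. The graph `VALUES` is hypothetical: the example figure is not reproduced in the text, so the line values were chosen only to agree with the quoted numbers (path value 3 for BEDC, value 2 for the direct line BC). Breaking ties toward the shorter path is also an assumption.

```python
def simple_paths(values, source, target):
    """Yield every path from source to target that repeats no node."""
    neighbors = {}
    for a, b in values:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    stack = [[source]]
    while stack:
        path = stack.pop()
        for nxt in neighbors.get(path[-1], []):
            if nxt in path:
                continue
            if nxt == target:
                yield path + [nxt]
            else:
                stack.append(path + [nxt])


def path_value(values, path):
    """Smallest line value along the path (lines are undirected)."""
    return min(
        values[(a, b)] if (a, b) in values else values[(b, a)]
        for a, b in zip(path, path[1:])
    )


def apv_of_highest_value_path(values, source, target):
    """Pick the path with the highest path value first, then divide by its distance."""
    paths = list(simple_paths(values, source, target))
    if not paths:
        return None
    # Assumption: among equal path values, prefer the shorter path.
    chosen = max(paths, key=lambda p: (path_value(values, p), -len(p)))
    return path_value(values, chosen) / (len(chosen) - 1)


# Hypothetical valued graph consistent with the numbers quoted in the slides.
VALUES = {("A", "B"): 1, ("B", "C"): 2, ("B", "E"): 3, ("E", "D"): 4, ("D", "C"): 5}

print(apv_of_highest_value_path(VALUES, "B", "C"))  # 1.0
```

Enumerating all simple paths is exponential in the worst case, so this sketch is suitable only for small illustrative graphs.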
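The four steps above can be sketched by brute force: enumerate every path between the dyad, compute each path's APV, and keep the highest. The graph `VALUES` is hypothetical, built only to agree with the numbers quoted in the slides (BEDC has path value 3, the direct line BC has value 2), and the function names are illustrative. This is not the authors' implementation.

```python
def simple_paths(values, source, target):
    """Yield every path from source to target that repeats no node."""
    neighbors = {}
    for a, b in values:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    stack = [[source]]
    while stack:
        path = stack.pop()
        for nxt in neighbors.get(path[-1], []):
            if nxt in path:
                continue
            if nxt == target:
                yield path + [nxt]
            else:
                stack.append(path + [nxt])


def apv(values, path):
    """Path value (smallest line value) divided by binary distance."""
    line_values = [
        values[(a, b)] if (a, b) in values else values[(b, a)]
        for a, b in zip(path, path[1:])
    ]
    return min(line_values) / len(line_values)


def optimal_connection(values, source, target):
    """Return (highest APV, path achieving it), or None if no path exists."""
    scored = [(apv(values, p), p) for p in simple_paths(values, source, target)]
    if not scored:
        return None
    return max(scored, key=lambda item: item[0])


# Hypothetical valued graph consistent with the numbers quoted in the slides.
VALUES = {("A", "B"): 1, ("B", "C"): 2, ("B", "E"): 3, ("E", "D"): 4, ("D", "C"): 5}

print(optimal_connection(VALUES, "B", "C"))  # (2.0, ['B', 'C'])
```

On this graph the direct line BC (APV 2/1 = 2) beats the path BEDC (APV 3/3 = 1), which is the result the slides call for. Because the number of simple paths can grow exponentially, this approach only suits small graphs.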
Consequences • This difference between UCINET and our solution in computing the optimal connection produces only one discrepancy in our example with five nodes and 10 dyads. 5!/(3!·2!) = 10 is the maximum number of dyads for 5 actors.
Consequences • However, social scientists rarely deal with 5-by-5 matrices. Instead, many matrices contain 10s, 100s, or even 1000s of actors, forming very large symmetric matrices. • Suppose we have a matrix with 100 actors. It can have a maximum of 100!/(2!·98!) = 4,950 dyads. If UCINET and our solution disagree on 10% of the dyads, we would expect 495 discrepancies between the UCINET output and our expected output, which is far less tolerable.
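The dyad counts can be checked with the binomial coefficient "g choose 2". In this short sketch the function names are illustrative, and the 10% disagreement rate is the hypothetical figure from the slide, not a measured value.

```python
from math import comb


def max_dyads(actors):
    """Maximum number of dyads among the given number of actors."""
    return comb(actors, 2)


def expected_discrepancies(actors, disagreement_rate):
    """Number of dyads expected to differ at a given disagreement rate."""
    return round(max_dyads(actors) * disagreement_rate)


print(max_dyads(5))                       # 10
print(max_dyads(100))                     # 4950
print(expected_discrepancies(100, 0.10))  # 495
```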
POSSIBLE SOLUTION • Devise your own algorithm. • A shortest-path algorithm such as Dijkstra’s algorithm or the Floyd-Warshall algorithm is not sufficient by itself, but it provides a solid base for solving our problem. • Implement the algorithm in any language, such as C, C++, Java, or FORTRAN.
Solution • Yang and Hexmoor (2004) devised a suitable algorithm and implemented it in several Java programs. • A classroom demonstration of the software will follow, time permitting.
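One way such an algorithm could be built is sketched below. This is an illustrative dynamic-programming approach in the spirit of Floyd-Warshall, and it is not the algorithm of Yang and Hexmoor (2004), whose details are not given in these slides. For each number of lines k it keeps the best achievable minimum line value over routes with exactly k lines, then takes the largest ratio of that value to k. Routes that revisit a node never win, because dropping the repeated segment shortens the route without lowering its minimum value. The sketch assumes all line values are positive; the graph `VALUES` and all names are hypothetical.

```python
def highest_apv_all_pairs(nodes, values):
    """Return {(i, j): highest APV} for every connected ordered pair i != j."""
    # Symmetric value table; 0 marks "no line" (assumes positive line values).
    v = {i: {j: 0 for j in nodes} for i in nodes}
    for (a, b), w in values.items():
        v[a][b] = w
        v[b][a] = w

    # width[i][j]: best minimum line value over routes with exactly k lines.
    width = {i: dict(v[i]) for i in nodes}
    best = {}
    for k in range(1, len(nodes)):  # a path has at most g - 1 lines
        for i in nodes:
            for j in nodes:
                if i != j and width[i][j] > 0:
                    best[(i, j)] = max(best.get((i, j), 0.0), width[i][j] / k)
        # Extend every route by one more line.
        width = {
            i: {j: max(min(width[i][m], v[m][j]) for m in nodes) for j in nodes}
            for i in nodes
        }
    return best


# Hypothetical valued graph consistent with the numbers quoted in the slides.
NODES = ["A", "B", "C", "D", "E"]
VALUES = {("A", "B"): 1, ("B", "C"): 2, ("B", "E"): 3, ("E", "D"): 4, ("D", "C"): 5}

result = highest_apv_all_pairs(NODES, VALUES)
print(result[("B", "C")])  # 2.0
```

The running time grows with the fourth power of the number of nodes, which avoids enumerating every path but may still be slow for networks with thousands of actors.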