OPTIMAL CONNECTIONS: STRENGTH AND DISTANCE IN VALUED GRAPHS • Yang, Song and David Knoke • RESEARCH QUESTION: • How to identify optimal connections, that is, direct or indirect paths between dyads that permit the highest exchange volumes while taking into account the actors’ costs of interaction?
Binary • CONNECTIONS IN BINARY GRAPHS • A graph is depicted in two dimensions by a set of nodes representing actors and a set of lines representing the direct ties between pairs of actors (dyads). • We concentrate on undirected, symmetric graphs that reflect mutual interactions. Marriages between persons and contracts between corporations are two good cases in point: if A is married to B, B must be married to A as well.
Binary 2 • In binary graphs, the presence of a connection between a pair of nodes is indicated by a constant value of 1, and its absence by a value of 0. • In a graph, a path is a sequence of distinct nodes and lines that connects a specific pair of nodes. The length of a path is the number of lines in it. The path distance between two nodes is defined as the length of the shortest path between them.
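The definition of path distance can be illustrated with a breadth-first search. This is only a minimal sketch: the edge list `EDGES` and the function names are hypothetical, chosen for illustration and not taken from the slides.

```python
from collections import deque

# Hypothetical undirected binary graph (an assumption for illustration).
EDGES = [("A", "B"), ("A", "E"), ("E", "B"), ("E", "D"), ("D", "C"), ("C", "B")]


def adjacency(edges):
    """Build a symmetric adjacency map from an undirected edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def path_distance(edges, source, target):
    """Return the number of lines on a shortest path, or None if unreachable."""
    adj = adjacency(edges)
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


if __name__ == "__main__":
    print(path_distance(EDGES, "A", "B"))
    print(path_distance(EDGES, "A", "D"))
```

Because every line counts as 1, the first time the search reaches the target it has necessarily used the fewest lines possible.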
Binary 3 • In binary graphs, path distance is normally used to indicate the optimal connections between a pair of nodes. This solution assumes that intermediaries are costly. If more intermediaries are necessary to connect a pair of actors, they may extract higher commissions for their services, distort the information content exchanged, and increase the time required to complete a transaction.
A binary graph • An Illustration
EXAMPLE FOR THE DYAD AB
• PATH | LENGTH | OPTIMAL CONNECTION
• A-B | 1 | 1
• A-E-B | 2 | N/A
• A-E-D-C-B | 4 | N/A
CONNECTIONS IN VALUED GRAPHS • A valued graph is defined as a graph whose lines carry numerical values indicating the intensities of the relationships between dyads. These numbers typically represent frequencies or durations of interactions among social actors; for example, volumes of communications, levels of friendship and trust, or dollar amounts of economic transactions. For organizations engaging in strategic alliances, a valued graph might indicate the number of distinct partnerships formed between each pair.
Valued Graph • Illustration
Problems in Measuring Optimal Connections in Valued Graphs • In valued graphs, path length cannot be used to indicate the optimal connection, because the shortest path is no longer clearly the best one once lines carry different values. • Previous researchers proposed two solutions for measuring optimal connections in valued graphs. Peay (1980) states that path value, defined as the smallest value attached to any line in a path, indicates the optimal path between a pair of nodes.
Path Value • EXAMPLE FOR THE DYAD AB
• PATH | OPTIMAL CONNECTION
• A-B | 1
• A-E-B | 3
• A-E-D-C-B | 2
• This solution assumes that lower path values represent bottlenecks that impede the interactions between two nodes.
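Peay's path value can be sketched in a few lines. The line values in `VALUES` are hypothetical, chosen only so that the three paths reproduce the path values listed above (1, 3, and 2); the slides do not give the individual line values at this point, and the function name is likewise an assumption.

```python
# Hypothetical line values (an assumption), keyed by unordered node pair.
VALUES = {
    frozenset(("A", "B")): 1,
    frozenset(("A", "E")): 3,
    frozenset(("E", "B")): 3,
    frozenset(("E", "D")): 5,
    frozenset(("D", "C")): 5,
    frozenset(("C", "B")): 2,
}


def path_value(path, values=VALUES):
    """Peay's path value: the smallest line value along the path."""
    return min(values[frozenset(pair)] for pair in zip(path, path[1:]))


if __name__ == "__main__":
    for path in (["A", "B"], ["A", "E", "B"], ["A", "E", "D", "C", "B"]):
        print("-".join(path), path_value(path))
```

The weakest line acts as the bottleneck, so a single low value anywhere on the path determines the result regardless of how many lines the path contains.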
The Problems with Peay’s Path Value Solution • How should the path value (optimal connection) be determined when multiple paths, and hence multiple path values, exist between two nodes? • How should the transaction costs of exchanges involving many go-betweens be taken into account?
Flament’s Solution • Flament (1963) uses path length, defined as the sum of the values of the lines included in a path, to represent the optimal connection between a pair of nodes. • EXAMPLE FOR THE DYAD AB
• PATH | OPTIMAL CONNECTION
• A-B | 1
• A-E-B | 6
• A-E-D-C-B | 15
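Flament's path length is the same calculation with a sum in place of a minimum. As a sketch under stated assumptions: the line values in `VALUES` are hypothetical, chosen so that the sums reproduce the figures above (1, 6, and 15), and the function name is illustrative.

```python
# Hypothetical line values (an assumption), keyed by unordered node pair.
VALUES = {
    frozenset(("A", "B")): 1,
    frozenset(("A", "E")): 3,
    frozenset(("E", "B")): 3,
    frozenset(("E", "D")): 5,
    frozenset(("D", "C")): 5,
    frozenset(("C", "B")): 2,
}


def flament_path_length(path, values=VALUES):
    """Flament's path length: the sum of the line values along the path."""
    return sum(values[frozenset(pair)] for pair in zip(path, path[1:]))


if __name__ == "__main__":
    for path in (["A", "B"], ["A", "E", "B"], ["A", "E", "D", "C", "B"]):
        print("-".join(path), flament_path_length(path))
```

Note that the longest path receives the largest total simply because it has the most lines to add up, even though it contains a line of value 2.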
The Problems with Flament’s Path Length Solution • There is no standard for which result indicates the optimal connection: it is unclear whether larger or smaller path lengths should be viewed as optimal for connecting dyads. • If larger values indicate the optimal connection, a high number can result when either (1) the lines in a path have high values, or (2) a path has many lines with low values that sum to a large total. The solution fails in the second situation. • If instead lower values represent the optimal connection, a low number can result when either (1) the lines in a path have low values, or (2) a path has few lines that add up to a small value.
OUR SOLUTION • Bring binary distance back into the equation. We argue that including binary distance is especially crucial for measuring path strength in a valued graph because it takes into account the costs (in time, energy, or decay of information) required for indirectly connected dyads to reach one another through varying numbers of intermediaries. • We now formally define two measures of path strength applicable to valued graphs. A valued graph G consists of three sets of information:
Definitions • A set of nodes $N = \{n_1, n_2, \ldots, n_g\}$ • A set of lines between pairs of nodes $L = \{l_1, l_2, \ldots, l_g\}$ • A set of values attached to the lines $V = \{v_1, v_2, \ldots, v_g\}$ • A path between nodes $n_i$ and $n_j$ consists of a sequence of distinct lines connecting the pair, directly or through one or more intermediaries, expressed as: • $\{l_{i,i+1}, l_{i+1,i+2}, \ldots, l_{j-2,j-1}, l_{j-1,j}\}$
Definitions • The dual subscripts indicate the origin and terminus nodes of each line. • The minimum value $M_{ij}$ of a path between nodes $n_i$ and $n_j$ is the smallest value of any line in that path, indicated as • $M_{ij} = \min(v_{i,i+1}, v_{i+1,i+2}, \ldots, v_{j-2,j-1}, v_{j-1,j})$ • Notice that $M_{ij}$ is actually Peay’s path value.
Solution • The distance $D_{ij}$ of that path is the total number of lines in it, with each line counted as one, which is indicated as • $D_{ij} = l_{i,i+1} + l_{i+1,i+2} + \cdots + l_{j-2,j-1} + l_{j-1,j}$ • Note that this sum is identical to the distance in the corresponding binary graph, obtained by counting the number of lines in the path connecting nodes $n_i$ and $n_j$.
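As a worked instance of these two definitions, consider the path A-E-B from the examples for the dyad AB. Its two lines must each carry a value of 3, since that is the only pair of values consistent with the reported path value of 3 and summed path length of 6. The subscripts AE and EB are shorthand introduced here for the two lines of that path.

```latex
\begin{aligned}
M_{AB} &= \min(v_{AE},\, v_{EB}) = \min(3,\, 3) = 3, \\
D_{AB} &= l_{AE} + l_{EB} = 1 + 1 = 2.
\end{aligned}
```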
Solution • Illustrate
Why the Results Differ • How does UCINET arrive at a different result for the optimal connection? Its algorithm works like this: • Find the highest path value among the multiple paths between a pair of nodes, and treat that path as the optimal path. In our example, UCINET picks 3 for the path B-E-D-C, treating it as the optimal path connecting the dyad BC. • Calculate the binary distance of the optimal path just picked. In our example, it is 3 for the path B-E-D-C. • Divide the highest path value by its binary distance to obtain the APV. In our example, it is 3/3 = 1.
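The three steps described above can be sketched as follows. This mirrors the procedure as the slide describes it and is not UCINET's actual code. The graph in `VALUES` is an assumption: A-B, A-E, E-B, and B-C are consistent with the figures quoted in the slides, while the values for E-D and D-C (5 each) are assumed. The tie-breaking rule and all function names are also assumptions.

```python
# Line values keyed by unordered node pair; E-D and D-C are ASSUMED values.
VALUES = {
    frozenset(("A", "B")): 1,
    frozenset(("A", "E")): 3,
    frozenset(("E", "B")): 3,
    frozenset(("E", "D")): 5,  # assumed
    frozenset(("D", "C")): 5,  # assumed
    frozenset(("C", "B")): 2,
}


def simple_paths(values, source, target):
    """Yield every path without repeated nodes between two distinct nodes."""
    adj = {}
    for pair in values:
        u, v = tuple(pair)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    stack = [[source]]
    while stack:
        path = stack.pop()
        if path[-1] == target:
            yield path
            continue
        for nxt in sorted(adj.get(path[-1], ())):
            if nxt not in path:
                stack.append(path + [nxt])


def path_value(path, values):
    """Smallest line value along the path."""
    return min(values[frozenset(pair)] for pair in zip(path, path[1:]))


def highest_value_first_apv(values, source, target):
    """Pick the path with the highest path value, then divide by its distance.

    Assumes source != target and that at least one path exists.
    Ties on path value go to the path with fewer lines (an assumption).
    """
    best = max(
        simple_paths(values, source, target),
        key=lambda p: (path_value(p, values), -(len(p) - 1)),
    )
    return path_value(best, values) / (len(best) - 1)


if __name__ == "__main__":
    print(highest_value_first_apv(VALUES, "B", "C"))
```

For the dyad BC this selects B-E-D-C (path value 3, three lines) and returns 3/3, because the distance is only consulted after the path has already been chosen.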
What We Want • We want to: • Find the path values for all the paths between a dyad. • Calculate the binary distances for all the paths. • Divide each path value by its binary distance, producing multiple APVs for a dyad. • Pick the highest APV to represent the optimal connection between the dyad, which is 2/1 = 2 in our example.
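For small graphs, the four steps above can be sketched by brute-force enumeration of all paths. This is an illustrative sketch, not the authors' implementation. In `VALUES`, the values for E-D and D-C (5 each) are assumed, the remaining values are consistent with the figures quoted in the slides, and the function names are hypothetical.

```python
# Line values keyed by unordered node pair; E-D and D-C are ASSUMED values.
VALUES = {
    frozenset(("A", "B")): 1,
    frozenset(("A", "E")): 3,
    frozenset(("E", "B")): 3,
    frozenset(("E", "D")): 5,  # assumed
    frozenset(("D", "C")): 5,  # assumed
    frozenset(("C", "B")): 2,
}


def simple_paths(values, source, target):
    """Yield every path without repeated nodes between two distinct nodes."""
    adj = {}
    for pair in values:
        u, v = tuple(pair)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    stack = [[source]]
    while stack:
        path = stack.pop()
        if path[-1] == target:
            yield path
            continue
        for nxt in sorted(adj.get(path[-1], ())):
            if nxt not in path:
                stack.append(path + [nxt])


def path_apv(path, values):
    """Path value (minimum line value) divided by binary distance (line count)."""
    value = min(values[frozenset(pair)] for pair in zip(path, path[1:]))
    return value / (len(path) - 1)


def optimal_apv(values, source, target):
    """Highest APV over all paths; assumes source != target and a path exists."""
    return max(path_apv(p, values) for p in simple_paths(values, source, target))


if __name__ == "__main__":
    print(optimal_apv(VALUES, "B", "C"))
```

For the dyad BC the direct line wins with 2/1 = 2, ahead of B-E-D-C at 3/3 = 1. Enumerating every path grows very quickly with graph size, so this approach is only practical for small examples.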
How Big Is the Difference • This difference in how UCINET and our solution compute the optimal connection produces only one discrepancy in our example with five nodes and 10 dyads. $C_5^2 = 5!/(3!\,2!) = 10$ is the maximum number of dyads among 5 actors.
It Can Be Worse • However, social scientists rarely deal with a 5-by-5 matrix. Instead, many matrices contain tens, hundreds, or even thousands of actors, forming much larger symmetric matrices. • Suppose we have a matrix with 100 actors. It can have a maximum of $C_{100}^2 = 100!/(2!\,98!) = 4{,}950$ dyads. If UCINET and our solution disagree on 10% of them, we would expect 495 discrepancies between the UCINET output and our expected output, which is far less tolerable.
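The dyad counts above can be checked with a short calculation. The function names are hypothetical, and the 10% disagreement rate is the figure supposed in the text, not a measured one.

```python
from math import comb


def max_dyads(actors):
    """Maximum number of dyads (unordered pairs) among a set of actors."""
    return comb(actors, 2)


def expected_discrepancies(actors, percent_disagreement):
    """Dyads expected to differ if a given percentage of them disagree."""
    return max_dyads(actors) * percent_disagreement / 100


if __name__ == "__main__":
    print(max_dyads(5))
    print(max_dyads(100))
    print(expected_discrepancies(100, 10))
```

Since the number of dyads grows roughly with the square of the number of actors, a fixed disagreement rate translates into many more discrepancies as networks get larger.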
Real Solutions • Choose an appropriate algorithm, such as the Floyd-Warshall algorithm, which is used to compute shortest paths in valued graphs. • Shortest-path computations of this kind appear in web searches for the shortest route between two locations, as in “mapblast” or “yahoo map.” • Implement the algorithm in any language, such as C, C++, Java, or FORTRAN. • Keeping track of the binary distances for each and every path between a pair of nodes turns out to be a difficult task. Thus, we are still waiting for a successful implementation of an appropriate algorithm to solve our research problem.
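For reference, a minimal sketch of the standard Floyd-Warshall algorithm is shown here. It computes only the shortest distance between every pair of nodes. It does not record the path value and binary distance of every individual path, which is the bookkeeping the slide describes as difficult. The node list, edge list, and function name are assumptions for illustration.

```python
import math

# Hypothetical undirected graph; each line is given a weight of 1 so that
# the result equals the binary distance between every pair of nodes.
NODES = ["A", "B", "C", "D", "E"]
EDGES = [("A", "B"), ("A", "E"), ("E", "B"), ("E", "D"), ("D", "C"), ("C", "B")]
UNIT_WEIGHTS = {edge: 1 for edge in EDGES}


def floyd_warshall(nodes, weights):
    """All-pairs shortest distances for an undirected weighted graph."""
    dist = {u: {v: (0 if u == v else math.inf) for v in nodes} for u in nodes}
    for (u, v), w in weights.items():
        dist[u][v] = min(dist[u][v], w)
        dist[v][u] = min(dist[v][u], w)
    # Allow each node k in turn to serve as an intermediary.
    for k in nodes:
        for i in nodes:
            for j in nodes:
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
    return dist


if __name__ == "__main__":
    distances = floyd_warshall(NODES, UNIT_WEIGHTS)
    print(distances["A"]["D"])
```

The triple loop runs in time proportional to the cube of the number of nodes, which is manageable for matrices with hundreds of actors.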