These lecture notes cover deterministic and probabilistic parallel algorithms for constructing minimum spanning trees and spanning trees. Concepts include shared memory, circuit models, connected components, and Brent's Theorem. The role of uniform circuits in language complexity and the relationship between the SC and NC classes are also explored, with explanations and proof sketches for these parallel computation techniques.
Theory Moon Jung Chung CSE838 Lecture notes copy right: Moon Jung Chung
Parallel Minimum Spanning Tree (Deterministic)
Initially, each node is a super node.
Repeat until only one super node remains:
  For each super node, among the edges that connect it to another super node, select an edge of minimum weight.
  Merge the two super nodes joined by each selected edge into one super node.
How many phases? --> O(log n) phases, since the number of super nodes at least halves in each phase.
Each phase: O(log n) time in CRCW. Actually, with priority CRCW, O(1) time.
Complexity: O(log n) time with O(m) PEs with priority CRCW, where m is the number of edges.
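The phase structure above can be sketched sequentially. This is a minimal sketch, not the PRAM implementation from the lecture: the function name `boruvka_mst`, the `(weight, u, v)` edge format, and the union-find helper are assumptions made for illustration, and each "parallel" step is simply a loop. Ties are broken by comparing whole `(weight, u, v)` tuples so that simultaneous selections cannot form a cycle.

```python
def boruvka_mst(n, edges):
    """Sequential simulation of the phase-based algorithm.

    n: number of nodes (0..n-1).
    edges: list of (weight, u, v) tuples (hypothetical input format).
    Returns (list of tree edges, number of phases).
    """
    comp = list(range(n))  # comp[v] leads to the representative of v's super node

    def find(v):
        # Follow pointers to the representative, compressing the path.
        while comp[v] != v:
            comp[v] = comp[comp[v]]
            v = comp[v]
        return v

    tree, phases, num_super = [], 0, n
    while num_super > 1:
        # Step 1: every super node selects its minimum outgoing edge.
        best = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge lies inside one super node
            for r in (ru, rv):
                if r not in best or (w, u, v) < best[r]:
                    best[r] = (w, u, v)
        if not best:
            break  # graph is disconnected; no more merges possible
        # Step 2: merge super nodes along the selected edges.
        for w, u, v in set(best.values()):
            ru, rv = find(u), find(v)
            if ru != rv:
                comp[ru] = rv
                tree.append((w, u, v))
                num_super -= 1
        phases += 1
    return tree, phases
```

On a connected graph every super node merges with at least one neighbor per phase, which is where the O(log n) phase bound comes from.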
Parallel Minimum Spanning Tree (Deterministic: detailed)
Repeat until there is only one super node:
  For each edge (x, y), if x and y are in different components, write component(x) = y and component(y) = x.
  For each node, with priority CW, accept the minimum value of component.
  Merge the two super nodes into a single super node.
Complexity: O(log n) time with O(m) PEs with priority CRCW, where m is the number of edges.
How to avoid priority CRCW? ==> What if the tree only needs to be a spanning tree?
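The write-then-merge step can be illustrated with a small simulation. This is only a sketch under stated assumptions: it ignores edge weights and tracks component labels only, it models a priority concurrent write as "the smallest proposed value wins", it always hooks the larger label onto the smaller one (an added rule that rules out cycles), and it flattens the result with pointer jumping. The name `connected_components` and the edge-list format are hypothetical.

```python
def connected_components(n, edges):
    """Simulate hooking with a priority concurrent write.

    n: number of nodes (0..n-1); edges: list of (x, y) pairs.
    Returns (component label per node, number of rounds).
    """
    comp = list(range(n))  # every node starts as its own super node
    rounds = 0
    while True:
        # Simulated priority CW: all edges "write" at once, and each
        # memory cell keeps the minimum value written to it.
        proposal = {}
        for x, y in edges:
            cx, cy = comp[x], comp[y]
            if cx != cy:
                lo, hi = min(cx, cy), max(cx, cy)
                proposal[hi] = min(proposal.get(hi, lo), lo)
        if not proposal:
            return comp, rounds  # no edge joins two different components
        # Hook each losing label onto the smaller label it accepted.
        for label, target in proposal.items():
            comp[label] = target
        # Pointer jumping: shorten chains until every node points at a root.
        changed = True
        while changed:
            changed = False
            for v in range(n):
                if comp[comp[v]] != comp[v]:
                    comp[v] = comp[comp[v]]
                    changed = True
        rounds += 1
```

In this sketch each node ends up labeled with the smallest node index in its component.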
Parallel Spanning Tree (Probabilistic)
Naive idea: for each edge, if it connects two different super nodes, add the edge to the spanning tree and merge the two super nodes into a single node.
Problems: two edges connecting the same pair of super nodes may be selected at the same time. What about cycles?
To prevent these problems:
  For each super node, randomly select an edge that connects it to another super node.
  Verify whether two different super nodes selected the same edge.
  Verify that there are no cycles.
  If the selected edge is OK, include it in the spanning tree and merge the two super nodes.
How many phases? --> O(log n) phases on average.
Each phase: O(1) time on average.
Complexity: O(log n) time with O(m) CREW PEs.
Parallel Connected Components in EREW => use matrix multiplication: O(log^2 n) time using O(n^2) PEs.
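The matrix-multiplication approach to connected components can be sketched as repeated Boolean squaring of the reachability matrix. This is a sequential illustration under assumptions, not the EREW algorithm itself: the function name `components_by_squaring` and the adjacency-matrix input are hypothetical, and each squaring (one parallel matrix product on a PRAM) is written as nested loops. About log2(n) squarings suffice because each one doubles the path length covered.

```python
def components_by_squaring(adj):
    """Label connected components via repeated Boolean matrix squaring.

    adj: symmetric n x n matrix (lists of 0/1 or bools) of an undirected graph.
    Returns (labels, squarings), where labels[i] is the smallest node
    reachable from i.
    """
    n = len(adj)
    # reach[i][j] is True if j is reachable from i by a path of <= 1 edge.
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    squarings = 0
    length = 1  # current bound on path length covered by reach
    while length < n - 1:
        # Boolean matrix product reach * reach doubles the covered length.
        reach = [
            [any(reach[i][k] and reach[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
        length *= 2
        squarings += 1
    labels = [min(j for j in range(n) if reach[i][j]) for i in range(n)]
    return labels, squarings
```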
Parallel Models
(i) Shared memory (PRAM) -- deterministic. How about probabilistic? Example: minimum spanning tree.
(ii) Circuit: depth and size.
(iii) Alternating Turing machine.
Brent's Theorem: Any depth-d, size-n combinational circuit with bounded fan-in can be simulated by a p-processor CREW algorithm in O(n/p + d) time.
Proof: Store the inputs to the combinational circuit in the PRAM. Each gate evaluates its output once all its inputs are ready. If there are not enough PEs, evaluate the gates in order of depth (depth of a gate: length of the longest path from the primary inputs).
Complexity: Let n_i be the number of gates at depth i. Simulating the gates at depth i takes ceil(n_i/p) steps, so the total time is $\sum_i \lceil n_i/p \rceil \le \sum_i (n_i/p + 1) = n/p + d$.
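The level-by-level simulation in the proof can be sketched as follows. This is a minimal sketch with assumed names: `simulate_circuit`, the dictionary-based gate format, and the restriction to two-input AND/OR gates are all choices made for illustration. The inner loop over at most p gates stands for one parallel step of p processors.

```python
from collections import defaultdict

def simulate_circuit(inputs, gates, p):
    """Evaluate a circuit depth by depth with p simulated processors.

    inputs: dict mapping input names to bools.
    gates: dict mapping gate name -> (op, a, b), op in {"and", "or"},
           listed in topological order (hypothetical format).
    Returns (values of all wires, number of parallel steps used).
    """
    values = dict(inputs)
    depth = {name: 0 for name in inputs}
    levels = defaultdict(list)
    for name, (op, a, b) in gates.items():
        # Depth of a gate: longest path from the primary inputs.
        depth[name] = 1 + max(depth[a], depth[b])
        levels[depth[name]].append(name)

    steps = 0
    for d in sorted(levels):
        layer = levels[d]
        # ceil(n_i / p) steps for the n_i gates at this depth.
        for start in range(0, len(layer), p):
            for name in layer[start:start + p]:
                op, a, b = gates[name]
                if op == "and":
                    values[name] = values[a] and values[b]
                else:
                    values[name] = values[a] or values[b]
            steps += 1
    return values, steps
```

For a circuit with n gates and depth d, the step count returned here never exceeds n/p + d, matching the bound in the theorem.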
Parallel Models
Brent's Theorem for EREW: Any depth-d, size-n combinational circuit with bounded fan-in and bounded fan-out can be simulated by a p-processor EREW algorithm in O(n/p + d) time.
Proof: For exclusive reading, each output value is copied to all gates where it is used. With bounded fan-in and fan-out, this takes constant time. Reading them one by one also takes constant time.
Uniform Circuit
Let L be a language. Circuit complexity of L?
Definition 1: f(n) = number of gates of a circuit accepting the strings of length n in L.
Def. 1 may not be an acceptable one: L = {0^n | the n-th TM accepts the n-th input}. L is not even recursive (it is undecidable). But L has circuit complexity 1: for each length n there are only two candidate circuits, one answering yes and one answering no.
Uniform Circuit
Let L_n = {w | w is in L and |w| = n}.
There is a family of circuits {C_n}, where C_n accepts L_n, and C_n can be generated in polynomial time using O(log n) space. Each gate has bounded fan-in.
Example of non-uniform: Division circuit => O(log n) time, but generating it requires polynomial space!
NC^k = {L | there is a uniform circuit of poly size and (log n)^k depth}
NC = ∪_k NC^k
Note: SC^k = {L | there is a TM with poly time and (log n)^k space}
Relationship between SC and NC?
Alternating TM
The TM forks at each state. Subprocesses cannot communicate!
The TM has two types of states: universal and existential.
  Universal: all branches must accept.
  Existential: at least one branch must lead to an accepting state.
That is, each computation can be represented as a computation tree.
Depth of the computation tree: time complexity.
Note: Deterministic TM: the computation is a single path. Parallel random access machine: processes can communicate.
ASPACE(log n) = P
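The acceptance rule on a computation tree can be illustrated with a short recursive evaluator. This is a sketch under assumptions: the tuple encoding of tree nodes (`"forall"`, `"exists"`, `"accept"`, `"reject"`) and the function names `accepts` and `depth` are invented for the example, and the tree is given explicitly rather than generated by a machine.

```python
def accepts(node):
    """Evaluate an explicit alternating computation tree.

    node is ("accept",), ("reject",), ("forall", children) or
    ("exists", children); this encoding is hypothetical.
    """
    kind = node[0]
    if kind == "accept":
        return True
    if kind == "reject":
        return False
    children = node[1]
    if kind == "forall":
        # Universal state: every branch must accept.
        return all(accepts(c) for c in children)
    if kind == "exists":
        # Existential state: at least one branch must accept.
        return any(accepts(c) for c in children)
    raise ValueError(f"unknown node kind: {kind}")

def depth(node):
    """Depth of the computation tree, i.e. the alternating time used."""
    if node[0] in ("accept", "reject"):
        return 0
    return 1 + max(depth(c) for c in node[1])
```

The branches are evaluated independently of one another, which mirrors the point that subprocesses cannot communicate.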
Parallel Computation Thesis
Parallel computation thesis: parallel time is polynomially equivalent to sequential space.
Examples: the parallel time of a vector machine is equivalent to sequential space; ATIME(f(n)) and DSPACE(f(n)).
ATM(S(n), T(n)): the languages accepted by an ATM with space S(n) and time T(n).
Theorem: ATM(log n, (log n)^k) = NC^k
NC-algorithm and P-complete problems
Let f be a function. Input: the input of f. Output: the value of f.
NC^1 reducible from f to g: using oracle gates for g, we can construct NC^1 circuits computing f. An oracle gate is counted as depth log n and size n.
Division: Input: x and y. Output: x/y.
Reciprocal: Input: x. Output: x^-1.
Powering: Input: x. Output: x^i expressed in n^2 bits.
Example: Division < Reciprocal: using reciprocal, compute y^-1, then compute x * y^-1.
Reciprocal < Powering: trivial.
How to construct a log-depth powering circuit? ==> seems not easy.
NC-algorithm and P-complete problems
Special case of a function: language recognition. Let A, B be languages.
A is NC^1 reducible to B if, using an oracle for B, A can be solved in NC^1.
A log space reduction is an NC reduction.
If A <_NC B and B is in NC, then A is also in NC.
A is complete for P <=> A is in P and, for any B in P, B <_log A.
Theorem: Let A be a P-complete problem (with respect to log space reduction). If A is in NC, then P ⊆ NC.
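The theorem follows by chaining the facts on this slide. The derivation below is a short sketch for an arbitrary language B in P; the symbols $\le_{\log}$ and $\le_{NC}$ are notation assumed here for log space reduction and NC reduction.

```latex
\begin{align*}
B \in P &\implies B \le_{\log} A && \text{($A$ is P-complete)} \\
        &\implies B \le_{NC} A   && \text{(a log space reduction is an NC reduction)} \\
        &\implies B \in NC       && \text{($A \in NC$, and NC is closed under NC reductions)}
\end{align*}
```

Since B was arbitrary, every language in P is in NC.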
P-complete problems and problems that are hard to parallelize
Examples of P-complete problems:
  Monotone Circuit Value problem: Input: a monotone circuit (AND and OR gates, but no NOT gates) and values for the primary inputs. Question: Is the output of the circuit 0 with the given primary input values?
  Generating the lexicographically smallest depth-first search tree.
These P-complete problems may not be parallelizable!
Open questions: perfect matching, depth-first search (directed, undirected), integer GCD, modular exponentiation.
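For concreteness, the Monotone Circuit Value problem is easy to solve sequentially by evaluating gates in topological order, as in the sketch below; the difficulty is doing it in polylogarithmic parallel time. The function name `monotone_circuit_value` and the wire-index encoding of the circuit are assumptions made for this example.

```python
def monotone_circuit_value(input_values, gates):
    """Sequentially evaluate a monotone circuit.

    input_values: list of bools for wires 0..k-1 (the primary inputs).
    gates: list of (op, i, j) with op in {"and", "or"}; the t-th gate
           defines wire k+t from earlier wires i and j (hypothetical format).
    Returns the value on the last wire, taken as the circuit output.
    """
    wires = list(input_values)
    for op, i, j in gates:
        if op == "and":
            wires.append(wires[i] and wires[j])
        elif op == "or":
            wires.append(wires[i] or wires[j])
        else:
            # A monotone circuit has no NOT gates.
            raise ValueError(f"unsupported gate: {op}")
    return wires[-1]
```

Each gate is visited once, so the running time is linear in the circuit size, but the loop follows the gate dependencies one after another.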