The Complexity of Matrix Completion. Nick Harvey, David Karger, Sergey Yekhanin
What is matrix completion? • Given a matrix containing variables, substitute values for the variables to get full rank • Example: the matrix with rows (1 x) and (1 y) • Good! x=1, y=0 gives rows (1 1) and (1 0) • Bad: x=1, y=1 gives rows (1 1) and (1 1)
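The example on this slide can be checked mechanically. Below is a minimal brute-force sketch, assuming a prime field F_p and a matrix encoded as a list of rows in which variables are strings; the helper names `rank_mod_p` and `completions` are made up for illustration and are not from the talk.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank of an integer matrix over the prime field F_p (Gaussian elimination)."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse (Python 3.8+)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def completions(matrix, variables, p):
    """Yield every assignment that makes the square matrix full rank over F_p."""
    for values in product(range(p), repeat=len(variables)):
        env = dict(zip(variables, values))
        filled = [[env.get(e, e) for e in row] for row in matrix]
        if rank_mod_p(filled, p) == len(matrix):
            yield env

# The slide's example: rows (1 x) and (1 y), here taken over F_2.
example = [[1, "x"], [1, "y"]]
good = list(completions(example, ["x", "y"], 2))
```

Exhaustive search takes p^(#variables) steps, so this only illustrates the definition; it is not one of the efficient algorithms the talk refers to.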
Why should I care? Combinatorics • Many combinatorial problems relate to matrices of variables
Problem: Relation to Algebra
• Graph Matching: Tutte ’47, Edmonds ’67, Lovasz ’79
• Matroid Intersection: Tomizawa-Iri ’74, Murota ’00
• Counting paths in DAG: God (i.e., the BOOK), Gessel-Viennot ’85
Why should I care? Algorithms • Often yields highly efficient algorithms
Problem: Algorithms
• Graph Matching: RNC: KUW ’86, MVV ’87; Sequential O(n^2.38) time: MS ’04, H ’06
• Matroid Intersection: O(n r^1.38) time: H ’06
• Counting paths in DAG: Random Network Codes: Koetter-Medard ’03, Ho et al. ’03
Why should I care? Complexity • Depending on parameters, can be NP-complete, in RP, or in P • Key parameters: field size, # variables, # occurrences of each variable • Contains polynomial identity testing (PIT) as a special case (Valiant ’79) • Derandomizing PIT implies strong circuit lower bounds (Kabanets-Impagliazzo ’03)
Why care about field size? • Relevant to complexity: randomization works over large fields • Understanding smaller fields may provide insight into derandomization • Important for network coding efficiency (i.e., complexity of routers)
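Complexity Regions [Figure: complexity of matrix completion by Field Size (horizontal axis, from 2 up to n+1) and # Occurrences of a variable (vertical axis, 1 to 9). Regions are labeled P, RP, and NP Hard, and several are marked "?". Results cited in the figure: Lovasz ’79, Buss et al. ’99, Geelen ’99, H., Karger, Murota ’05.]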
Complexity Regions [Figure: Field Size (horizontal axis, from 2 up to n+1) vs. # Occurrences of a variable (vertical axis, 1 to 9), with regions labeled P, RP, and NP Hard; one NP Hard region is labeled "This Paper".]
Variant: Simultaneous Completion • We have a set of matrices A := {A1, …, Ad} • Each variable appears at most once per matrix • A variable can appear in several matrices Def: A simultaneous completion for A assigns values to variables while preserving the rank of all matrices • RP algorithm still works over a large field • Application to Network Coding uses Simultaneous Completion
Relationship to Single Matrix Completion • Hardness for Simultaneous Completion ⇒ Hardness for Single Matrix Completion w/ many occurrences of variables [Figure: example matrices over the variables A, B, C, D, E in two panels, Simultaneous Completion and Single Matrix Completion]
Simultaneous Completion Algorithm • Simple self-reducibility algorithm • Operates over field Fq, where d := # matrices < q
Input: d matrices
Compute rank of all matrices
Pick a variable x
for i ∈ {0,…,d}
• Set x := i
• If all matrices have unchanged rank: Recurse (# variables has decreased)
[Slide callout: Non-trivial! Murota ’93.]
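The self-reducibility loop above can be sketched as follows. This is a toy version under loud assumptions: q is a prime p, variables are encoded as strings, and the rank of a matrix that still contains variables is found by brute force over all assignments, which stands in for the efficient rank computation the slide cites (the stand-in is exponential, so only the control flow reflects the algorithm). The names `rank_mod_p`, `max_rank`, and `simultaneous_completion` are hypothetical.

```python
from itertools import product

def rank_mod_p(rows, p):
    """Rank of a fully numeric matrix over the prime field F_p."""
    m = [[v % p for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)
        m[rank] = [(v * inv) % p for v in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

def max_rank(matrix, env, p):
    """Brute-force stand-in: best rank reachable by assigning the free variables."""
    free = sorted({e for row in matrix for e in row
                   if isinstance(e, str) and e not in env})
    best = 0
    for values in product(range(p), repeat=len(free)):
        trial = {**env, **dict(zip(free, values))}
        filled = [[trial[e] if isinstance(e, str) else e for e in row]
                  for row in matrix]
        best = max(best, rank_mod_p(filled, p))
    return best

def simultaneous_completion(matrices, variables, p):
    """Fix variables one at a time, trying only the values 0..d (needs d < p)."""
    d = len(matrices)
    if d >= p:
        raise ValueError("the sketch requires d < p")
    env = {}
    target = [max_rank(m, env, p) for m in matrices]
    for x in variables:
        for i in range(d + 1):
            trial = {**env, x: i}
            if all(max_rank(m, trial, p) == t for m, t in zip(matrices, target)):
                env = trial  # ranks unchanged: keep x := i and move on
                break
        else:
            raise RuntimeError("no value in 0..d preserved all ranks")
    return env

# Two matrices over F_3 (d = 2 < 3); each variable occurs once per matrix.
mats = [[[1, "x"], [1, "y"]], [["x", 1], [1, "y"]]]
result = simultaneous_completion(mats, ["x", "y"], 3)
```

Trying d+1 values is enough because each variable occurs at most once per matrix, so a rank-certifying minor is linear in that variable and each matrix rules out at most one value.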
A Sharp Threshold • Simple self-reducibility algorithm • Operates over field Fq, where d := # matrices < q Thm: Simultaneous completion for d matrices over Fq is: • in P if q > d [HKM ’05] • NP-hard if q ≤ d [This paper]
A Sharp Threshold Thm: Simultaneous completion for d matrices over Fq is: • in P if q > d [HKM ’05] • NP-hard if q ≤ d [This paper] Cor: Single matrix completion with d occurrences of variables over Fq is NP-hard if q ≤ d
Approach • Reduction from Circuit-SAT • NAND gate with inputs A, B and output C: C = ¬(A ∧ B) ⟺ C = 1 - A∙B (if A, B, C ∈ {0, 1}) ⟺ det of the matrix with rows (1 A) and (B C) ≠ 0 (if A, B, C ∈ {0, 1})
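The NAND gadget is small enough to verify exhaustively. The sketch below assumes the gadget is the 2×2 matrix with rows (1 A) and (B C), whose determinant is C - A∙B, and checks it over F_2; `det2` and `nand` are illustrative helper names.

```python
from itertools import product

def det2(m):
    """Determinant of a 2x2 matrix given as a list of two rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def nand(a, b):
    """NAND on {0, 1}, written arithmetically as 1 - a*b."""
    return 1 - a * b

# For every 0/1 assignment, record whether the gadget is nonsingular over F_2.
nonsingular = {}
for a, b, c in product((0, 1), repeat=3):
    nonsingular[(a, b, c)] = det2([[1, a], [b, c]]) % 2 != 0

# The gadget is nonsingular exactly when C is the NAND of A and B.
gadget_ok = all(ns == (c == nand(a, b)) for (a, b, c), ns in nonsingular.items())
```

Each gate of a circuit would get its own such matrix, which is why the first reduction uses an unbounded number of matrices.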
What have we shown so far? • Simultaneous completion of an unbounded number of matrices over F2 is NP-hard • Can we use fewer? • Combine small matrices into huge matrix? • Problem: Variables appear too many times • Need to somehow make “copies” of a variable • Coming up next: • completing two matrices over F2 is NP-hard
A Curious Matrix Rn := [matrix not recoverable from the extracted slide] Thm: det Rn = [formula not recoverable]
Proof steps, by slide title (the matrix equations are not recoverable): Linearity of Determinant • Column Expansion (introduces a sign (-1)^(n+1)) • Schur Complement Identity • Applying Outer Product • Finishing up. QED
Replicating Variables Corollary: If x1, x2, …, xn ∈ {0,1} then det Rn ≠ 0 ⟺ xi = xj ∀ i,j Proof: det Rn = [formula not recoverable], which is the arithmetization of [formula not recoverable]. So either all variables are true, or all are false.
Replicating Variables Corollary: If x1, x2, …, xn ∈ {0,1} then det Rn ≠ 0 ⟺ xi = xj ∀ i,j Consequence: over F2, need only 2 matrices [Figure: matrices A and B assembled from NAND and Rn blocks]
What have we shown so far? Simultaneous completion of: • an unbounded number of matrices over F2 is NP-hard • two matrices over F2 is NP-hard Next: • q matrices over Fq is NP-hard
Handling Fields Fq • Previous gadgets only work if each x ∈ {0,1}. How can we ensure this over Fq? • Introduce q-2 auxiliary variables: x = x(1), x(2), …, x(q-1) • Sufficient to enforce that: x(i) ≠ x(j) ∀ i ≠ j and x(i) ∉ {0,1} ∀ i ≥ 2 • Gadget: det of the matrix with rows (1 1) and (x(i) x(j)) ≠ 0, etc.
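The counting argument behind these constraints: x(2), …, x(q-1) are q-2 distinct values outside {0,1}, so they use up every other element of Fq, and x(1) has to land in {0,1}. The sketch below checks this by exhaustive search for small prime q. It assumes the inequality gadget is the 2×2 matrix with rows (1 1) and (u v), whose determinant is v - u, and that the "etc." covers the same gadget with the constants 0 and 1 in place of u; the function names are illustrative.

```python
from itertools import product

def gadget_det(u, v, q):
    """det of rows (1 1) and (u v) over F_q; nonzero exactly when u != v."""
    return (v - u) % q

def satisfies(assign, q):
    """assign[k] holds x(k+1). Check all inequality gadgets."""
    n = len(assign)
    # x(i) != x(j) for every pair i != j
    for i in range(n):
        for j in range(i + 1, n):
            if gadget_det(assign[i], assign[j], q) == 0:
                return False
    # x(i) not in {0, 1} for i >= 2, via gadgets against the constants 0 and 1
    for v in assign[1:]:
        if gadget_det(0, v, q) == 0 or gadget_det(1, v, q) == 0:
            return False
    return True

def forced_values(q):
    """Values x = x(1) can take in some satisfying assignment over F_q."""
    return {a[0] for a in product(range(q), repeat=q - 1) if satisfies(a, q)}

values_q3 = forced_values(3)
values_q5 = forced_values(5)
```

Both 0 and 1 remain reachable for x, so the auxiliary variables restrict x to {0,1} without fixing its value.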
Handling Fields Fq x(i) ≠ x(j) ∀ i ≠ j and x(i) ∉ {0,1} ∀ i ≥ 2 [Figure: graph on the vertices 0, 1, x(1), x(2), x(3), x(4), …, x(q-1). Edge indicates endpoints non-equal]
Handling Fields Fq x(i) ≠ x(j) ∀ i ≠ j and x(i) ∉ {0,1} ∀ i ≥ 2 [Figure: graph on the vertices 0, 1, x(1), x(2), x(3), x(4), …, x(q-1)] • Pack these constraints into few matrices • Each variable used once per matrix • Amounts to edge-coloring • From χ′(Kn), conclude that q matrices suffice
What have we shown so far? • Simultaneous completion of: • an unbounded number of matrices over F2 is NP-hard • two matrices over F2 is NP-hard • q matrices over Fq is NP-hard
Main Results Thm: A simultaneous completion for d matrices over Fq is NP-hard if q ≤ d Cor: Completion of a single matrix, variables appearing d times, is NP-hard if q ≤ d Cor: Completion of a skew-symmetric matrix, variables appearing d times, is NP-hard if q ≤ d
Open Questions • Improved hardness results / algorithms for matrix completion? • Lower bounds / hardness for field size in network coding? • More combinatorial uses of matrix completion?