Testing Metric Properties Michal Parnas and Dana Ron
Property Testing (Informal Definition)
For a fixed property P and any object O, determine whether O has property P, or whether O is far from having property P (i.e., far from any other object having P).
The task should be performed by querying the object (in as few places as possible).
Property Testing - Background
• Initially defined by Rubinfeld and Sudan in the context of program testing (of algebraic functions).
• Goldreich, Goldwasser and Ron initiated the study of testing properties of (undirected) graphs.
• A growing body of work deals with properties of functions, graphs, strings, sets of points, ...
Many algorithms have complexity that is sub-linear in (or even independent of) the size of the object.
Motivation
• Computational: Design testing algorithms that are (much) more efficient than exact decision algorithms for the same properties.
• Combinatorial: Gain new understanding of the tested property.
Testing Metric Properties
P - metric property; M - n x n rational-valued matrix; ε - distance/approximation parameter.
M is said to be ε-far from property P if more than an ε-fraction of its n² entries must be modified for M to obtain P. Otherwise M is said to be ε-close.
A testing algorithm can query M on entries M[i,j]. If M has property P, it should accept; if M is ε-far from property P, it should reject with probability at least 2/3.
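As a rough illustration of the distance measure in this definition, the following Python sketch counts the fraction of entries on which two n x n matrices differ. The function name `fraction_differing` and the example matrices are assumptions made for illustration, not part of the talk. Since ε-farness refers to every matrix having P, a single comparison like this only gives an upper bound on the distance of M from the property.

```python
def fraction_differing(A, B):
    """Fraction of the n*n entries on which matrices A and B disagree."""
    n = len(A)
    differing = sum(1 for i in range(n) for j in range(n) if A[i][j] != B[i][j])
    return differing / (n * n)

# Illustrative 5 x 5 ultrametric (points are indexed 0..4 here).
ULTRA = [
    [0, 8, 10, 10, 10],
    [8, 0, 10, 10, 10],
    [10, 10, 0, 2, 6],
    [10, 10, 2, 0, 6],
    [10, 10, 6, 6, 0],
]

# Hypothetical input: the same matrix with one symmetric pair of entries changed.
NOISY = [row[:] for row in ULTRA]
NOISY[3][4] = NOISY[4][3] = 5

# NOISY differs from ULTRA on 2 of the 25 entries, so it is ε-close to
# being an ultrametric for every ε >= 2/25.
distance = fraction_differing(NOISY, ULTRA)
```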
Tree Metrics and Ultrametrics
An n x n matrix M is a tree metric (additive metric) if there exists a tree T with positive weights on its edges such that:
• There exists a mapping f from [n] into the nodes of T;
• For every i,j ∈ [n] = {1,…,n}, T(f(i),f(j)) = M[i,j], where T(u,v) denotes the distance between nodes u and v in T;
• All nodes to which no i ∈ [n] is mapped have degree greater than 2.
If T is rooted, f maps only to leaves of T, and all leaves are at the same distance from the root, then M is an ultrametric.
[Figure: example tree metric, with M[1,2]=8; M[1,3]=12; M[1,4]=10; M[1,5]=15; ...]
[Figure: example ultrametric on points 1,…,6, with M[1,2]=M[1,3]=M[2,3]=8; M[1,4]=M[1,5]=M[1,6]=12; M[4,5]=M[4,6]=6; M[5,6]=2; ...]
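For small matrices, both definitions can be checked exhaustively. The sketch below does not build a tree; it uses the standard three-point and four-point characterizations of ultrametrics and tree metrics, which are not stated on the slides. It assumes M is symmetric with a zero diagonal, and the function names and example matrices are illustrative only. These checks read all entries of M, which is exactly what a testing algorithm tries to avoid.

```python
from itertools import product

def is_ultrametric(M):
    """Three-point condition: M[i][j] <= max(M[i][k], M[k][j]) for all i, j, k."""
    n = len(M)
    return all(M[i][j] <= max(M[i][k], M[k][j])
               for i, j, k in product(range(n), repeat=3))

def is_tree_metric(M):
    """Four-point condition: of the three pairwise sums, the two largest are equal.

    Quadruples with repeated indices are included on purpose, since they
    enforce the triangle inequality as a special case.
    """
    n = len(M)
    for i, j, k, l in product(range(n), repeat=4):
        sums = sorted([M[i][j] + M[k][l], M[i][k] + M[j][l], M[i][l] + M[j][k]])
        if sums[1] != sums[2]:
            return False
    return True

# Illustrative ultrametric on 5 points (indexed 0..4).
ULTRA = [
    [0, 8, 10, 10, 10],
    [8, 0, 10, 10, 10],
    [10, 10, 0, 2, 6],
    [10, 10, 2, 0, 6],
    [10, 10, 6, 6, 0],
]

# Four points on a path with unit edges: a tree metric, but not an ultrametric.
PATH = [[abs(i - j) for j in range(4)] for i in range(4)]

# Four points on a cycle with unit edges: not a tree metric.
CYCLE = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
```

Every ultrametric is in particular a tree metric, so `ULTRA` passes both checks, while `PATH` passes only the second.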
Our Results
Our algorithms all work by taking a uniformly selected sample S ⊆ [n] and querying M[i,j] for i,j ∈ S. The size of the sample is always poly(1/ε) and independent of n. Specifically:
• Can test ultrametrics with |S| = O(log(1/ε)/ε³).
• Can test general tree metrics with |S| = O(log(1/ε)/ε³).
• Can extend the result for ultrametrics to approximate ultrametrics.
• Can test d-dimensional Euclidean metrics with |S| = O(d log d/ε).
Our Results (continued)
The testing algorithms can be used to solve relaxed versions of the corresponding search problems in time linear in n (and polynomial in 1/ε). That is, we can construct a tree that agrees with M on all but at most an ε-fraction of the entries. (Note that this running time is sub-linear in the size of the matrix M, which has n² entries.)
Constructing an Ultrametric Tree
Suppose M is an ultrametric. We can construct an ultrametric tree that agrees with M on a given subset {1,…,s} in the following manner:
• Initialization: Position points 1 and 2 at equal distance M[1,2]/2 from the root node.
• Iterations: For each point j = 3,…,s, add j to the current tree by adding a new branch that emanates from j's unique point of departure from the tree. This point is determined by the point in the tree closest to j.
Example: M[1,2]=8; M[1,3]=M[1,4]=M[1,5]=10; M[2,3]=M[2,4]=M[2,5]=10; M[3,4]=2; M[3,5]=6; M[4,5]=6.
[Figure: the ultrametric tree constructed for points 1,…,5.]
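The construction procedure can be sketched in Python as follows. This is one possible reading of the slides, not the authors' implementation: the `Node` class, the function names, and the choice to store for each node its height (distance down to the leaves) are assumptions. The sketch also assumes distinct points are at positive distance, and uses 0-based indices, so points 1,…,5 of the example become 0,…,4.

```python
class Node:
    def __init__(self, height, label=None):
        self.height = height   # distance from this node down to any leaf below it
        self.label = label     # point index for a leaf, None for an internal node
        self.parent = None
        self.children = []

def attach(parent, child):
    child.parent = parent
    parent.children.append(child)

def tree_distance(a, b):
    """Distance between two leaves: twice the height of their lowest common ancestor."""
    ancestors = set()
    node = a
    while node is not None:
        ancestors.add(id(node))
        node = node.parent
    node = b
    while id(node) not in ancestors:
        node = node.parent
    return 2 * node.height

def build_ultrametric_tree(M, points):
    """Return (root, leaves) for a tree agreeing with M on points, or None on failure."""
    p, q = points[0], points[1]
    root = Node(M[p][q] / 2)
    leaves = {}
    for x in (p, q):
        leaves[x] = Node(0, x)
        attach(root, leaves[x])
    for j in points[2:]:
        closest = min(leaves, key=lambda i: M[i][j])
        h = M[closest][j] / 2          # height of j's point of departure
        node = leaves[closest]
        while node.parent is not None and node.parent.height <= h:
            node = node.parent
        new_leaf = Node(0, j)
        if node.height == h:
            attach(node, new_leaf)     # departure point is an existing node
        else:
            branch = Node(h)           # departure point lies inside an edge (or above the root)
            parent = node.parent
            if parent is None:
                root = branch
            else:
                parent.children.remove(node)
                attach(parent, branch)
            attach(branch, node)
            attach(branch, new_leaf)
        if any(tree_distance(leaves[i], new_leaf) != M[i][j] for i in leaves):
            return None                # j cannot be added consistently
        leaves[j] = new_leaf
    return root, leaves

# The example matrix from the slide, re-indexed to 0..4.
M = [
    [0, 8, 10, 10, 10],
    [8, 0, 10, 10, 10],
    [10, 10, 0, 2, 6],
    [10, 10, 2, 0, 6],
    [10, 10, 6, 6, 0],
]
root, leaves = build_ultrametric_tree(M, [0, 1, 2, 3, 4])
```

For rational (non-integer) entries, `fractions.Fraction` values would avoid floating-point comparisons in the final agreement check.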
Consistency of points with tree
For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U (if such a tree exists, it is unique).
Def: Say that j ∈ [n] \ U is consistent with T_U if adding j to T_U as described in the construction procedure results in a tree that agrees with M on U+j. Denote the set of points consistent with T_U by G_U.
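Consistency can also be checked directly on the matrix, without materializing T_U. The sketch below relies on a reformulation that is an assumption of this example rather than something stated on the slides: if M restricted to U is an ultrametric and c is the point of U closest to j, then the tree obtained by adding j puts j at distance max(M[j,c], M[c,u]) from every u ∈ U. The function name and the example data are illustrative.

```python
def is_consistent(M, U, j):
    """Check whether point j can be added to the scaffold on U in agreement with M.

    Assumes M restricted to U is already an ultrametric (so T_U exists).
    """
    closest = min(U, key=lambda u: M[j][u])
    return all(M[j][u] == max(M[j][closest], M[closest][u]) for u in U)

# Illustrative ultrametric on 5 points (indexed 0..4).
M = [
    [0, 8, 10, 10, 10],
    [8, 0, 10, 10, 10],
    [10, 10, 0, 2, 6],
    [10, 10, 2, 0, 6],
    [10, 10, 6, 6, 0],
]
U = [0, 1, 2]

# Hypothetical corrupted input: the distance between points 0 and 4 is changed.
BAD = [row[:] for row in M]
BAD[0][4] = BAD[4][0] = 9
```

With these inputs, points 3 and 4 are consistent with the scaffold on `U` under `M`, while point 4 is inconsistent under `BAD`.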
The “Scaffold Partition”
For U ⊆ [n], let T_U denote the tree with leaf-set U that agrees with M on U. We refer to this tree as the scaffold.
Def: Let P_U be the following partition of G_U, induced by T_U: points i and j are in the same class iff they have the same point of departure from T_U.
[Figure: the scaffold partition, with classes C1, C2, C3, C4.]
Violating Pairs
If M is an ultrametric, then for every subset U and every two points i,j that belong to different classes in P_U, the value of M[i,j] is exactly determined by the corresponding (different) departure points in T_U.
Def: Say that i,j ∈ G_U that belong to different classes in P_U are a violating pair w.r.t. T_U if the distance between them according to the scaffold T_U differs from M[i,j].
[Figure: scaffold with classes C1, C2, C3, C4 and points i, j in different classes.] If M is an ultrametric, must have M[i,j] = 8.
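The scaffold partition and the violating-pair check can be sketched on the matrix alone. The encoding used here is an assumption of this example: for a consistent point j, its point of departure is identified by the pair (d, A), where d is the distance from j to its closest point in U and A is the set of points of U at distance exactly d from j. The distance implied by the scaffold for two points in different classes is then the largest of the two values d and the distance between the sets A. All names are illustrative.

```python
def departure_key(M, U, j):
    """Identify j's point of departure from the scaffold on U as (d, A)."""
    d = min(M[j][u] for u in U)
    return d, frozenset(u for u in U if M[j][u] == d)

def scaffold_partition(M, U, points):
    """Group the given (consistent) points by their point of departure."""
    classes = {}
    for j in points:
        classes.setdefault(departure_key(M, U, j), []).append(j)
    return classes

def scaffold_distance(M, U, i, j):
    """Distance between i and j that the scaffold implies (different classes)."""
    d_i, below_i = departure_key(M, U, i)
    d_j, below_j = departure_key(M, U, j)
    a, b = next(iter(below_i)), next(iter(below_j))
    return max(d_i, d_j, M[a][b])

def is_violating(M, U, i, j):
    if departure_key(M, U, i) == departure_key(M, U, j):
        return False   # same class: the scaffold says nothing about this pair
    return M[i][j] != scaffold_distance(M, U, i, j)

# Illustrative ultrametric on 5 points (indexed 0..4) and a scaffold on U = {0, 2}.
M = [
    [0, 8, 10, 10, 10],
    [8, 0, 10, 10, 10],
    [10, 10, 0, 2, 6],
    [10, 10, 2, 0, 6],
    [10, 10, 6, 6, 0],
]
U = [0, 2]
classes = scaffold_partition(M, U, [1, 3, 4])

# Hypothetical corrupted input: the entry for the pair (3, 4) is changed.
NOISY = [row[:] for row in M]
NOISY[3][4] = NOISY[4][3] = 5
```

Here points 1, 3 and 4 fall into three different classes. Under `M` no pair is violating, while under `NOISY` the pair (3, 4) is, because the scaffold implies a distance of 6.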
Two types of “witnesses”
Suppose we have a scaffold tree T_U that agrees with M on U. (If we cannot construct such a tree, clearly M is not an ultrametric.) It follows that:
• If we obtain a point j that is inconsistent with T_U, then we have a witness that M is not an ultrametric.
• If we obtain a pair of points i,j that are violating w.r.t. T_U, then we have a witness that M is not an ultrametric.
Testing Algorithm for Ultrametrics
1. Uniformly select s = O(log(1/ε)/ε³) points from [n]. Denote this set by U.
2. Construct a tree T_U that agrees with M on U. If this fails, reject.
3. Uniformly select m = O(1/ε) pairs of points from [n].
4. If any of these 2m points is inconsistent with T_U, or any of the m pairs is violating w.r.t. T_U, then reject.
5. If no step causes rejection, then accept.
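The five steps can be sketched as a Python skeleton. Everything beyond the step structure is an assumption: the constants hidden in the O(·) notation are not given on the slides and are set to 1 here through the hypothetical parameters `c_s` and `c_m`, the sample is drawn without replacement, and the scaffold construction and the two witness checks are passed in as caller-supplied functions (`build_scaffold`, `is_inconsistent`, `is_violating`) so that the skeleton stays independent of any particular tree representation.

```python
import math
import random

def sample_sizes(eps, c_s=1.0, c_m=1.0):
    """Sample sizes s = O(log(1/eps)/eps^3) and m = O(1/eps), with assumed constants."""
    s = math.ceil(c_s * math.log(1 / eps) / eps ** 3)
    m = math.ceil(c_m / eps)
    return s, m

def ultrametric_tester(n, eps, build_scaffold, is_inconsistent, is_violating, rng=random):
    """Return True to accept, False to reject.

    build_scaffold(U) returns a scaffold object, or None if no tree agrees with M on U.
    is_inconsistent(scaffold, j) and is_violating(scaffold, i, j) are the witness
    checks; they must tolerate points that already belong to U.
    """
    s, m = sample_sizes(eps)
    s = max(2, min(n, s))                      # a scaffold needs at least two points
    U = rng.sample(range(n), s)                # step 1
    scaffold = build_scaffold(U)               # step 2
    if scaffold is None:
        return False
    for _ in range(m):                         # step 3
        i, j = rng.randrange(n), rng.randrange(n)
        if is_inconsistent(scaffold, i) or is_inconsistent(scaffold, j):
            return False                       # step 4: inconsistent point
        if is_violating(scaffold, i, j):
            return False                       # step 4: violating pair
    return True                                # step 5
```

Note that the number of queried points depends only on ε, never on n, which is what makes the number of queries independent of the matrix size.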
Analysis of Algorithm
• If M is an ultrametric: the algorithm always accepts. (No inconsistent points and no violating pairs.)
• From now on assume M is ε-far from ultrametric. Will show that the algorithm rejects w.h.p. Specifically: either cannot construct T_U that agrees with M; or there are many inconsistent points w.r.t. T_U; or there are many violating pairs w.r.t. T_U.
Special Case (for M ε-far from ultrametric)
Suppose T_U agrees with M, and all but at most (ε/3)n² pairs of points in G_U belong to different classes in P_U (are separated). (In particular, this is the case if all classes are of size O(εn).)
Claim: Either have > (ε/3)n inconsistent points w.r.t. T_U, or have > (ε/3)n² violating pairs w.r.t. T_U.
Subject to the claim, if M is ε-far from ultrametric, then it is rejected w.h.p., as required.
Proof of Claim for special case
Assume, contrary to the claim, that there are ≤ (ε/3)n inconsistent points and ≤ (ε/3)n² violating pairs. Will show that there exists an ultrametric tree T that agrees with M on all but at most εn² entries, in contradiction to the assumption on M.
Tree T builds on the scaffold T_U: for every class C in P_U, create a star-shaped sub-tree with leaf set C that is rooted at the point of departure of C from T_U. Inconsistent points are added arbitrarily.
By the premise of the special case and the (counter) assumptions:
number of disagreements ≤ (ε/3)n · n [inconsistent points] + (ε/3)n² [violating pairs] + (ε/3)n² [unseparated pairs] = εn².
[Figure: scaffold tree with classes C1, C2, C3, C4.]
General Case
By the special case: we gain from separating points into different classes.
Def: Say that a point k ∉ U is an effective separator w.r.t. T_U if adding k to U causes at least (εn/12)² pairs of points to be separated into different classes.
[Figure: classes C1, C2, C3, C4, with C1 split into C1,1 and C1,2 by point k.]
General Case (continued)
In the analysis, view the sample U as being selected in phases. In each phase, if there are many effective separators, then one is selected w.h.p. After a sufficient number of phases, either we have the special case (few non-separated pairs), or U is such that there are few effective separators w.r.t. T_U.
In the latter case can show that for every class C in P_U there exists a tree T_C such that for almost all pairs i,j ∈ C, M[i,j] = T_C(i,j). (The tree is star-shaped/broom-shaped.)
General Case (continued)
Claim: Either have > (ε/4)n inconsistent points w.r.t. T_U, or have > (ε/4)n² violating pairs w.r.t. T_U.
Subject to the claim, if M is ε-far from ultrametric, then it is rejected w.h.p., as required.
The proof of the claim is similar to that in the special case: assume few inconsistent points and few violating pairs, and show that there is a tree close to M (contradicting M being ε-far from ultrametric).
[Figure: scaffold tree with classes C1, C2, C3, C4.]
Solving Relaxed Version of Search Problem
The analysis implies that the testing algorithm can be used to solve a relaxed version of the corresponding search problem. That is, if M is an ultrametric then, w.h.p., we can construct a tree that agrees with M on all but at most an ε-fraction of the entries, in time linear in n and polynomial in 1/ε:
• Construct the scaffold T_U on a uniformly selected sample U;
• Partition all points in [n] \ U into the classes of P_U according to their distances to points in U;
• For each class C, construct a star/broom-shaped tree T_C.
Testing Approximate Ultrametrics
Def: For a given approximation parameter δ, we say that a matrix M is a δ-approximate ultrametric if there exists an ultrametric M' such that for every i,j ∈ [n], |M[i,j] - M'[i,j]| ≤ δ.
We describe an algorithm such that, for every δ and ε, if M is a δ-approximate ultrametric then the algorithm accepts M, and if M is ε-far from being a cδ-approximate ultrametric then the algorithm rejects M w.h.p. (c is a fixed constant).
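The entry-wise condition in this definition is easy to state in code. The sketch below only checks the condition against one given candidate M'; deciding whether some suitable ultrametric M' exists is the actual problem and is not addressed here. The function name and the example data are assumptions for illustration.

```python
def within_delta(M, M_prime, delta):
    """True if |M[i][j] - M_prime[i][j]| <= delta for every entry."""
    n = len(M)
    return all(abs(M[i][j] - M_prime[i][j]) <= delta
               for i in range(n) for j in range(n))

# Illustrative ultrametric M' on 5 points (indexed 0..4).
ULTRA = [
    [0, 8, 10, 10, 10],
    [8, 0, 10, 10, 10],
    [10, 10, 0, 2, 6],
    [10, 10, 2, 0, 6],
    [10, 10, 6, 6, 0],
]

# Hypothetical measured matrix M: one pair of entries is off by 1.
NOISY = [row[:] for row in ULTRA]
NOISY[3][4] = NOISY[4][3] = 5
```

With `ULTRA` as the candidate, `NOISY` is certified as a 1-approximate ultrametric, but this candidate does not certify it for δ = 0.5.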
Conclusions and Further Research
• Presented an algorithm for testing whether a matrix is an ultrametric or far from being an ultrametric. The analysis implies a fast solution for the relaxed search problem.
• Mentioned similar results for approximate ultrametrics, general tree metrics and Euclidean metrics.
• We suspect that the results can be improved in terms of the dependence on 1/ε.
• We conjecture that the result for general tree metrics can be extended to an approximate variant.
• Testing other natural metric properties?