Graph Matching Simulation based approach Shang Zechao 1010161920
Introduction • What is graph matching? • When does one graph match another?
Introduction (cont.) • Graphs: G = (V, E) and GQ = (VQ, EQ) • Can be easily extended with labels • Exact matching: isomorphism • Find a bijection f between V and VQ • (u, v) in E iff (f(u), f(v)) in EQ
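The isomorphism condition above can be checked directly by trying every bijection. The following is a minimal brute-force sketch, assuming directed, unlabeled graphs given as a node list plus a set of edge pairs; the function name `is_isomorphic` and this representation are illustrative choices, not taken from the slides. It runs in factorial time and is only meant to make the definition concrete.

```python
from itertools import permutations

def is_isomorphic(V, E, VQ, EQ):
    """Brute-force isomorphism test for directed graphs (hypothetical helper).

    V, VQ: lists of nodes. E, EQ: iterables of (source, target) pairs.
    """
    E, EQ = set(E), set(EQ)
    if len(V) != len(VQ) or len(E) != len(EQ):
        return False
    for perm in permutations(VQ):
        f = dict(zip(V, perm))  # candidate bijection f: V -> VQ
        # f is injective and |E| == |EQ|, so mapping every edge of E into EQ
        # already gives the full "iff" condition.
        if all((f[u], f[v]) in EQ for (u, v) in E):
            return True
    return False

cycle = {("a", "b"), ("b", "c"), ("c", "a")}
print(is_isomorphic(["a", "b", "c"], cycle, [1, 2, 3], {(2, 3), (3, 1), (1, 2)}))
```

The loop over all permutations is what makes exact matching expensive on anything but tiny graphs.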
Introduction (cont.) • Graph isomorphism • GI class • Sub-graph isomorphism • NP-Complete • Too hard!
Simulation based approach [Henzinger95] • Find a relation S ⊆ V × VQ • (u, u’) in S if • u and u’ have the same label • for every child v’ of u’, there exists a node v such that • v is a child of u • (v, v’) in S
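The definition above can be turned into a simple fixpoint computation: start from all label-compatible pairs and repeatedly drop pairs that violate the child condition. This is a naive sketch, not the efficient algorithm of [Henzinger95]; the function name `simulation` and the dictionary-based graph representation (node mapped to its set of children) are assumptions made for illustration.

```python
def simulation(G, G_labels, Q, Q_labels):
    """Return the largest relation S satisfying the simulation conditions.

    G, Q: dict mapping each node to the set of its children (assumed format).
    G_labels, Q_labels: dict mapping each node to its label.
    """
    # Start with every pair of nodes that carry the same label.
    S = {(u, q) for u in G for q in Q if G_labels[u] == Q_labels[q]}
    changed = True
    while changed:
        changed = False
        for (u, q) in list(S):
            # Every child q2 of q needs some child v of u with (v, q2) in S.
            if not all(any((v, q2) in S for v in G[u]) for q2 in Q[q]):
                S.discard((u, q))
                changed = True
    return S

# Hypothetical data: pattern A -> B; data node a2 has label A but no children.
Q = {"qA": {"qB"}, "qB": set()}
Q_labels = {"qA": "A", "qB": "B"}
G = {"a1": {"b1"}, "a2": set(), "b1": set()}
G_labels = {"a1": "A", "a2": "A", "b1": "B"}
print(simulation(G, G_labels, Q, Q_labels))
```

Because pairs are only ever removed, the loop terminates after at most |V| · |VQ| removals, which is why the result can be computed in polynomial time.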
Simulation based approach • The major difference between graph simulation and graph isomorphism • Isomorphism requires a bijection (one-to-one function) • Graph simulation is based on a relation (many-to-many) • Simulation can be computed in polynomial time
An Example [Fan10] • Drug dealer network • B: Boss • S: Secretary • AM: Assistant manager • FW: Field worker
An Example (cont.) • In the real world • S and AM may be the same person • One AM maps to multiple field workers
Bounded Simulation [Fan10] • Each edge in the pattern graph has a label • Either a positive integer K • Or * (unbounded) • The label bounds the length of the path connecting the two matched nodes
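To make the edge labels concrete, the sketch below computes hop distances with a breadth-first search and checks them against a bound of either a positive integer or "*". The helper names `hop_distances` and `within_bound`, and the node names in the example, are hypothetical; the graph is assumed to be a dict mapping each node to its set of children.

```python
from collections import deque

def hop_distances(G, source):
    """Length of the shortest non-empty path from source to each reachable node."""
    dist = {}
    queue = deque((child, 1) for child in G[source])
    while queue:
        node, d = queue.popleft()
        if node in dist:
            continue  # already reached by a path that is no longer
        dist[node] = d
        queue.extend((child, d + 1) for child in G[node])
    return dist

def within_bound(dist, target, bound):
    """bound is a positive integer K, or "*" for any finite path length."""
    if target not in dist:
        return False
    return bound == "*" or dist[target] <= bound

# Hypothetical chain: am -> x -> y -> fw, so fw is 3 hops away from am.
G = {"am": {"x"}, "x": {"y"}, "y": {"fw"}, "fw": set()}
dist = hop_distances(G, "am")
print(within_bound(dist, "fw", 3), within_bound(dist, "fw", 2), within_bound(dist, "fw", "*"))
```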
The Example (cont.) • AM should be able to reach FW within 3 hops.
Matching Algorithm • Similar to the EfficientSimilarity algorithm in [Henzinger95] • Pre-compute the distance matrix between all pairs of nodes in G • Complexity: O(|V||E| + |Ep||V|^2 + |Vp||V|)
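The two steps above (pre-compute distances, then refine a candidate relation) can be sketched as follows. This is a naive fixpoint that does not achieve the stated complexity and is not the algorithm from [Fan10]; it uses plain label equality, and it does not check that every pattern node ends up with a match. All function names and the pattern format (pattern node mapped to a dict of child and bound) are assumptions.

```python
from collections import deque

def all_pairs_distances(G):
    """dist[s][t] = length of the shortest non-empty path from s to t (BFS per node)."""
    dist = {}
    for s in G:
        seen = {}
        queue = deque((child, 1) for child in G[s])
        while queue:
            node, d = queue.popleft()
            if node in seen:
                continue
            seen[node] = d
            queue.extend((child, d + 1) for child in G[node])
        dist[s] = seen
    return dist

def bounded_simulation(G, G_labels, P, P_labels):
    """P maps each pattern node to {child: bound}, bound being an int or "*"."""
    dist = all_pairs_distances(G)
    S = {(u, p) for u in G for p in P if G_labels[u] == P_labels[p]}

    def satisfied(u, p):
        # Each pattern edge (p, p2) needs a matched node v reachable within the bound.
        for p2, bound in P[p].items():
            if not any((v, p2) in S and (bound == "*" or d <= bound)
                       for v, d in dist[u].items()):
                return False
        return True

    changed = True
    while changed:
        changed = False
        for pair in list(S):
            if not satisfied(*pair):
                S.discard(pair)
                changed = True
    return S

# Hypothetical data: AM must reach FW within 3 hops.
G = {"am": {"x"}, "x": {"y"}, "y": {"fw"}, "fw": set()}
G_labels = {"am": "AM", "x": "X", "y": "X", "fw": "FW"}
P_labels = {"AM": "AM", "FW": "FW"}
print(bounded_simulation(G, G_labels, {"AM": {"FW": 3}, "FW": {}}, P_labels))
```

Pre-computing the distance matrix once keeps the refinement loop to dictionary lookups, at the cost of storing up to |V|^2 distances.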
Strong Simulation [Ma12] • Recall the conditions under which two nodes match in simulation: • They have the same label • Their children can be matched by simulation • Two issues • Parent information is not captured • The size of the match is not limited
An Example [Ma12] • Bio can match to Bio1, Bio2, Bio3, Bio4 • Actually only Bio4 makes sense
Strong Simulation • Two nodes match if: • They have the same label • Their children can be matched by simulation • Their parents can be matched by simulation • The matched sub-graph must lie within a ball whose radius is the diameter of the pattern graph
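The child and parent conditions above can be sketched by extending the naive fixpoint for plain simulation with a symmetric check on parents. This sketch covers only those two conditions; the restriction on the size of the matched sub-graph is deliberately left out. The function name `dual_simulation`, the graph format (node mapped to its set of children, with every node present as a key), and the example data are assumptions, not the example from [Ma12].

```python
def dual_simulation(G, G_labels, Q, Q_labels):
    """Largest relation preserving both child and parent relationships."""
    def parents(graph):
        # Invert the child map; assumes every node appears as a key.
        par = {n: set() for n in graph}
        for n, children in graph.items():
            for c in children:
                par[c].add(n)
        return par

    G_par, Q_par = parents(G), parents(Q)
    S = {(u, q) for u in G for q in Q if G_labels[u] == Q_labels[q]}
    changed = True
    while changed:
        changed = False
        for (u, q) in list(S):
            child_ok = all(any((v, q2) in S for v in G[u]) for q2 in Q[q])
            parent_ok = all(any((v, q0) in S for v in G_par[u]) for q0 in Q_par[q])
            if not (child_ok and parent_ok):
                S.discard((u, q))
                changed = True
    return S

# Hypothetical data: pattern HR -> Bio; bio2 has the right label but no HR parent.
Q = {"HR": {"Bio"}, "Bio": set()}
Q_labels = {"HR": "HR", "Bio": "Bio"}
G = {"hr1": {"bio1"}, "bio1": set(), "bio2": set()}
G_labels = {"hr1": "HR", "bio1": "Bio", "bio2": "Bio"}
print(dual_simulation(G, G_labels, Q, Q_labels))
```

Under plain simulation, bio2 would match Bio because Bio has no children in the pattern; the parent check is what excludes it here.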
An Example (cont.) • Bio only matches to Bio4 in strong simulation
But • Bounded cycle problem is intractable • NP-hard • Bisimilar problem is intractable • coNP-hard
References • [Henzinger95] M. R. Henzinger, T. A. Henzinger, and P. W. Kopke. 1995. Computing simulations on finite and infinite graphs. In Proceedings of the 36th Annual Symposium on Foundations of Computer Science (FOCS '95). IEEE Computer Society, Washington, DC, USA, 453-. • [Fan10] Wenfei Fan, Jianzhong Li, Shuai Ma, Nan Tang, Yinghui Wu, and Yunpeng Wu. 2010. Graph pattern matching: from intractable to polynomial time. Proc. VLDB Endow. 3, 1-2 (September 2010), 264-275. • [Ma12] Shuai Ma, Yang Cao, Wenfei Fan, Jinpeng Huai, and Tianyu Wo. 2012. Capturing Topology in Graph Pattern Matching. PVLDB. To appear.