A Similarity Skyline Approach for Handling Graph Queries - A Preliminary Report
Katia Abbaci†, Allel Hadjali†, Ludovic Liétard‡, Daniel Rocacher†
†IRISA/ENSSAT, University of Rennes1, {Katia.Abbaci, Allel.Hadjali, Daniel.Rocacher}@enssat.fr
‡IRISA/IUT, University of Rennes1, Ludovic.Lietard@univ-rennes1.fr
Outline
• Introduction
• Background:
  • Skyline Query
  • Graph Query
  • Graph Similarity Measures
• Graph Similarity Skyline
• Refinement of Graph Similarity Skyline
• Summary and Outlook
GDM 2011
Introduction (1/3)
Context:
• Graphs: modeling of structured and complex data
• Application domains: medicine, Web, chemistry, imaging, XML documents, bioinformatics, ...
Introduction (2/3)
Main problem:
• Searching for graphs similar to a graph query
• Existing approaches rely on a single similarity measure
• Several methods exist for measuring the similarity between two graphs:
  • Each method is limited to a class of applications
  • No method fits all
Introduction (3/3)
Motivations:
• A model for different classes of applications
• A model incorporating multiple features
Contributions:
• A Graph Similarity Skyline to answer a graph query: optimality in the Pareto sense
• A method for refining the skyline, based on a diversity criterion among graphs
Skyline Query
• Identification of interesting objects in a multi-dimensional dataset
• p = (p1, …, pm), q = (q1, …, qm): multidimensional objects
• p Pareto-dominates q, denoted p ≺ q, iff:
  • on each dimension i, 1 ≤ i ≤ m, pi ≤ qi
  • on at least one dimension j, pj < qj
Sample Skyline Query
• Find a hotel that is cheap and as close as possible to downtown
[Tab. 1 – Sample of hotels]
Skyline = {H2, H4, H6}
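The dominance test and the skyline extraction can be sketched in a few lines of Python. This is only an illustration: the hotels below (names, prices, distances) are hypothetical and are not the values of Tab. 1, and the quadratic scan is the simplest possible strategy rather than an optimized skyline algorithm.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]

def dominates(p: Point, q: Point) -> bool:
    """True if p Pareto-dominates q (smaller is better on every dimension)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(objects: Dict[str, Point]) -> List[str]:
    """Names of the objects that are not dominated by any other object."""
    return sorted(
        name
        for name, p in objects.items()
        if not any(dominates(q, p) for other, q in objects.items() if other != name)
    )

# Hypothetical hotels: (price, distance to downtown in km).
# These values are assumptions for illustration, NOT the slide's Tab. 1.
hotels = {
    "A": (50.0, 4.0),
    "B": (80.0, 1.0),
    "C": (90.0, 3.0),  # dominated by B (more expensive and farther)
    "D": (60.0, 2.5),
    "E": (60.0, 5.0),  # dominated by A and by D
}

result = skyline(hotels)
print(result)
```

Hotels C and E drop out because another hotel is at least as good on both criteria and strictly better on one; the remaining hotels are mutually incomparable.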
Graph Query
• Two categories of graph queries, for a query q and a graph database (GDB) D = {g1, …, gn}:
  • Graph containment search:
    • Subgraph containment search: retrieve all graphs gi of D such that q ⊆ gi
    • Supergraph containment search: retrieve all graphs gi of D such that q ⊇ gi
  • Graph similarity search: retrieve graphs structurally similar to the query graph
Graph Similarity Measures
• Several methods for measuring graph similarity:
  • Edit distance (DistEd)
  • Maximum common subgraph based distance (DistMcs)
  • Graph union based distance (DistGu)
Graph Similarity Measures
[Tab. 2 – Similarity Measures]
Edit Distance: example
• Transformation of g into g’:
  • deletion of the edge (d, e),
  • re-labeling of the edge (a, d) from 1 to 4,
  • re-labeling of the node d with e,
  • insertion of the edge (a, f) with the label 1.
• Use of the uniform distance
[Fig. 3 – Example of labeled graphs g and g’]
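As a rough illustration of the uniform distance, the sketch below prices the edit script listed above. It assumes that "uniform" means every edit operation has the same unit cost; the operation names and the `uniform_cost` helper are made up for this example. It only computes the cost of this particular script and does not search for a cheapest one, which is what an actual edit distance requires.

```python
from typing import List, Tuple

# An edit operation is (kind, target). The kinds mirror the slide's wording;
# the string encoding is an assumption made for this sketch.
EditOp = Tuple[str, str]

def uniform_cost(script: List[EditOp], unit: float = 1.0) -> float:
    """Cost of an edit script when every operation costs `unit` (assumed)."""
    return unit * len(script)

# The four operations that transform g into g' in the example.
script = [
    ("delete_edge", "(d, e)"),
    ("relabel_edge", "(a, d): 1 -> 4"),
    ("relabel_node", "d -> e"),
    ("insert_edge", "(a, f) with label 1"),
]

cost = uniform_cost(script)
print(cost)
```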
Distances based on Mcs and Gu: example
• Identification of the size of the maximum common subgraph Mcs(g, g’)
• Computation of the Mcs-based distance
• Computation of the Gu-based distance
[Fig. 4 – Example of labeled graphs g and g’]
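The formulas of this slide did not survive, so the sketch below falls back on the definitions commonly used in the literature for these two distances (attributed to Bunke and Shearer for the Mcs-based one, and to Wallis et al. for the graph-union one). That these are exactly the definitions of Tab. 2 is an assumption, and the sizes in the usage lines are hypothetical, not read from Fig. 4.

```python
def dist_mcs(size_g: int, size_h: int, size_mcs: int) -> float:
    """Mcs-based distance (assumed form): 1 - |mcs| / max(|g|, |h|)."""
    largest = max(size_g, size_h)
    if largest == 0:
        raise ValueError("both graphs are empty")
    return 1.0 - size_mcs / largest

def dist_gu(size_g: int, size_h: int, size_mcs: int) -> float:
    """Graph-union distance (assumed form): 1 - |mcs| / (|g| + |h| - |mcs|)."""
    union = size_g + size_h - size_mcs
    if union == 0:
        raise ValueError("both graphs are empty")
    return 1.0 - size_mcs / union

# Hypothetical sizes: two graphs of size 6 sharing a common subgraph of size 4.
d_mcs = dist_mcs(6, 6, 4)
d_gu = dist_gu(6, 6, 4)
print(round(d_mcs, 4), d_gu)
```

Both values lie in [0, 1]; identical graphs give 0 under either definition.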
Graph Similarity Skyline (1/2)
• Graph compound similarity (GCS) between two graphs: a vector of local distance measures, GCS(g, g’) = (Dist1(g, g’), …, Distd(g, g’))
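Graph Similarity Skyline (2/2)
• q: a query, D = {g1, …, gn} a GDB
• For i = 1 to n, do:
  • Compare gi with q by computing GCS(gi, q)
• Extract the Graph Similarity Skyline (GSS): the graphs of D that are not similarity-dominated by any other graph of D
• Similarity-dominance relation: g similarity-dominates g’ with respect to q iff:
  • ∀ i ∈ {1, ..., d}, Disti(g, q) ≤ Disti(g’, q),
  • ∃ k ∈ {1, ..., d}, Distk(g, q) < Distk(g’, q).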
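The two steps above (build one distance vector per database graph, then keep the non-dominated graphs) might look as follows. This is a sketch under stated assumptions: graphs are reduced to sets of labeled edges, and the two measures `missing_edges` and `extra_edges` are simple stand-ins invented for the example, not DistEd, DistMcs or DistGu. The toy graphs h1 to h4 are likewise hypothetical.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

Edge = Tuple[str, str, int]   # (source, target, edge label)
Graph = FrozenSet[Edge]       # toy representation: a set of labeled edges
Measure = Callable[[Graph, Graph], float]
Vector = Tuple[float, ...]

def missing_edges(g: Graph, q: Graph) -> float:
    """Stand-in measure: number of query edges absent from g."""
    return float(len(q - g))

def extra_edges(g: Graph, q: Graph) -> float:
    """Stand-in measure: number of edges of g absent from the query."""
    return float(len(g - q))

def gcs(g: Graph, q: Graph, measures: Sequence[Measure]) -> Vector:
    """Compound similarity: one local distance per measure."""
    return tuple(m(g, q) for m in measures)

def similarity_dominates(u: Vector, v: Vector) -> bool:
    """u is no worse than v everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))

def graph_similarity_skyline(
    db: Dict[str, Graph], q: Graph, measures: Sequence[Measure]
) -> List[str]:
    """Names of the graphs whose GCS vector no other graph dominates."""
    vectors = {name: gcs(g, q, measures) for name, g in db.items()}
    return [
        name
        for name, v in vectors.items()
        if not any(
            similarity_dominates(w, v) for other, w in vectors.items() if other != name
        )
    ]

# Hypothetical query and database.
q = frozenset({("a", "b", 1), ("b", "c", 2), ("c", "d", 3)})
db = {
    "h1": frozenset({("a", "b", 1), ("b", "c", 2)}),                   # (1, 0)
    "h2": q | {("d", "e", 1), ("e", "f", 1), ("f", "a", 1)},           # (0, 3)
    "h3": frozenset({("a", "b", 1)}),                                  # (2, 0)
    "h4": frozenset({("a", "b", 1), ("b", "c", 2), ("x", "y", 9)}),    # (1, 1)
}

gss = graph_similarity_skyline(db, q, [missing_edges, extra_edges])
print(gss)
```

Here h3 and h4 are dominated by h1, while h1 and h2 are incomparable: each is better on one measure, which is exactly the information a single aggregated score would hide.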
Illustrative Example (1/2)
[Fig. 6 – Graph database D = {g1, …, g7} and graph query q]
[Tab. 3 – Information about |Mcs(gi, q)|]
Illustrative Example (2/2)
• Computation of GCS(gi, q), for i = 1 to 7
[Tab. 4 – Distance Measures]
GSS(D, q) = {g1, g4, g5, g7}
Refinement of Graph Similarity Skyline (1/3)
• The skyline can be large
• The user needs k dissimilar answers
• Solution: a diversity criterion
• Extract a subset S of size k with maximal diversity
• Provide the user with a global picture of the whole set GSS
Refinement of Graph Similarity Skyline (2/3)
• The diversity of a subset S of size k is defined per dimension: for each i, 1 ≤ i ≤ d, one value gives the diversity of S in the ith dimension
Refinement of Graph Similarity Skyline (3/3)
• Refinement algorithm:
  • Enumerate the subsets Sj of GSS of size k
  • For i = 1 to d, rank-order all Sj in decreasing order of their diversity in the ith dimension
  • The rank of Sj w.r.t. the ith dimension runs from the best diversity value to the worst diversity value
  • Evaluate Sj from its ranks over the d dimensions
  • Extract the best-evaluated subset
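A possible reading of the refinement steps is sketched below. Several choices are assumptions, because the slide's formulas are not preserved: the diversity of a subset in one dimension is taken to be its smallest pairwise distance in that dimension, ties receive consecutive ranks, a subset is evaluated by the sum of its ranks, and the subset with the smallest sum is extracted. The graph names and pairwise distances are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Sequence, Tuple

# Pairwise distances between skyline graphs in one dimension,
# keyed by the unordered pair of graph names.
Pairwise = Dict[FrozenSet[str], float]

def diversity(subset: Sequence[str], pairwise: Pairwise) -> float:
    """ASSUMED definition: smallest pairwise distance inside the subset."""
    return min(pairwise[frozenset(pair)] for pair in combinations(subset, 2))

def refine(skyline: Sequence[str], k: int, dims: Sequence[Pairwise]) -> Tuple[str, ...]:
    """Pick the size-k subset with the best (lowest) sum of per-dimension ranks."""
    if k < 2:
        raise ValueError("k must be at least 2 to measure pairwise diversity")
    subsets = list(combinations(skyline, k))
    score = {s: 0 for s in subsets}
    for pairwise in dims:
        # Rank 1 goes to the most diverse subset in this dimension.
        # Ties keep enumeration order (ordinal ranking, an assumption).
        ranked = sorted(subsets, key=lambda s: diversity(s, pairwise), reverse=True)
        for rank, s in enumerate(ranked, start=1):
            score[s] += rank
    # ASSUMED aggregation: sum of ranks, smaller is better.
    return min(subsets, key=lambda s: score[s])

def pairs(xy: float, xz: float, yz: float) -> Pairwise:
    """Helper building the pairwise table for three hypothetical graphs."""
    return {
        frozenset({"x", "y"}): xy,
        frozenset({"x", "z"}): xz,
        frozenset({"y", "z"}): yz,
    }

# Hypothetical distances in d = 2 dimensions.
dims = [pairs(0.2, 0.9, 0.5), pairs(0.1, 0.8, 0.6)]
best = refine(["x", "y", "z"], 2, dims)
print(best)
```

Enumerating every subset of size k is exponential in the worst case, so this direct form is only practical for small skylines.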
Illustrative Example
• Return the 2 best graphs
[Fig. 8 – The skyline GSS: g1, g4, g5, g7]
Summary and Outlook
• Skyline approach for searching graphs by similarity
• Extraction of all DB graphs not dominated by any other graph
• Preservation of information about the similarity on different features
• Selection of the subset of graphs with maximal diversity from the skyline
• Implementation: a step toward demonstrating the effectiveness of the approach on a real database
• Investigation of other similarity measures
Thank you
Questions?