350 likes | 484 Views
Database Research Group. Interactive SQL Query Suggestion. Making Databases User-Friendly. Ju Fan , Guoliang Li, and Lizhu Zhou Database Research Group, Tsinghua University ICDE 2011 – Apr. 13, Hanover. Outline. Motivation Overview of SQL Query Suggestion Template Suggestion
E N D
Database Research Group Interactive SQL Query Suggestion Making Databases User-Friendly Ju Fan, Guoliang Li, and Lizhu Zhou Database Research Group, Tsinghua University ICDE 2011 – Apr. 13, Hanover
Outline • Motivation • Overview of SQL Query Suggestion • Template Suggestion • SQL Generation • Experiments • Conclusion SQL Suggestion, ICDE 2011
Outline • Motivation • Overview of SQL Query Suggestion • Template Suggestion • SQL Generation • Experiments • Conclusion SQL Suggestion, ICDE 2011
SQL: Powerful Yet Difficult • SQL is powerful but difficult for inexperienced users to pose queries • Be skillful in SQL syntax to express query intent • Have a thorough understanding of the schema SQL Suggestion, ICDE 2011
SQL Assistant Tools • Target Users • The novice users who struggle with the basic SQL syntax or the structure of the schema. • Limitations • Only support metadata and SQL syntax • Require users to manually join multi-tables SQL Suggestion, ICDE 2011
Keyword Search over RDB • Keyword Search over Relational DB • Data: A database with multiple tables • Query: Keywords • Answer: Joined tuples containing the keywords • Limitations • Cannot precisely capture users’ query intent • May involve irrelevant results • Cannot support aggregate functions, range queries, etc. SQL Suggestion, ICDE 2011
SQL Suggestion from Keywords SQL Suggestion, ICDE 2011
Features of SQL Suggestion • Objective: Assist users to formulate SQL queries using keywords • Main Features • Query intent prediction • Answer grouping • Aggregation queries • Range queries SQL Suggestion, ICDE 2011
Comparison of Query Paradigms Easier Keyword Search SQL Suggestion Usability SQL Expressiveness More Powerful SQL Suggestion, ICDE 2011
Outline • Motivation • Overview of SQL Query Suggestion • Template Suggestion • SQL Generation • Experiments • Conclusion SQL Suggestion, ICDE 2011
Problem Definition Query: Keywords User Data: A Database with Multiple Tables Answer: SQL Queries … SQL Suggestion, ICDE 2011
A Two-Step Framework One of Relevant Templates Step 1 Template Suggestion “count paper ir” User One of Generated SQL Queries Step 2 SQL Generation SELECTCOUNT (P.id) FROM Paper P, Author A, Write W WHERE A.nameCONTAINS “ir” AND A.id = W.aidAND P.id = W.pid SQL Suggestion, ICDE 2011
Outline • Motivation • Overview of SQL Query Suggestion • Template Suggestion • SQL Generation • Experiments • Conclusion SQL Suggestion, ICDE 2011
Template Suggestion One of Relevant Templates Step 1 Template Suggestion “count paper ir” User One of Generated SQL Queries b Step 2 SQL Generation SELECTCOUNT (P.id) FROM Paper P, Author A, Write W WHERE A.nameCONTAINS “ir” AND A.id = W.aidAND P.id = W.pid SQL Suggestion, ICDE 2011
Queryable Template • The skeleton of SQL queries that models the joined entities and their attributes. • A template is an undirected graph SQL Suggestion, ICDE 2011
Template Generation • Atom Entities • E.g., Paper • Expansion Rules • E.g., P – W P – W – A • Combinatory Explosion • A ranking model for avoiding exploring all templates SQL Suggestion, ICDE 2011
Template Ranking Model = ∑k∈Q P(Q,T) P(T) ∑R∈T P(k|R) P(R|T) Query KeywordsQ Keyword2 … Keywordn Keyword1 P(k|R):Relevance of Rtok(TF-IDF) Entities in template T Paper Write Author P(T) Query Ability of T P(R|T):Importance ofRtoT:(PageRank) SQL Suggestion, ICDE 2011
Top-k Suggestion Algorithm wR = ∑k∈Q P(Q,T) P(T) ∑R∈T P(k|R) P(R|T) • Fagin Algorithm [5] • Lists of templates for all entities ordered by P(R|T) • Indexing • Inverted Index: • Entity-to-Template • Forward Index: • Template-to-Entity wp* wA* wW* P(P|T) P(A|T) P(W|T) SQL Suggestion, ICDE 2011
Outline • Motivation • Overview of SQL Query Suggestion • Template Suggestion • SQL Generation • Experiments • Conclusion SQL Suggestion, ICDE 2011
SQL Generation One of Relevant Templates Step 1 Template Suggestion “count paper ir” User One of Generated SQL Queries Step 2 SQL Generation SELECTCOUNT (P.id) FROM Paper P, Author A, Write W WHERE A.nameCONTAINS “ir” AND A.id = W.aidAND P.id = W.pid SQL Suggestion, ICDE 2011
Match Keywords to Attributes database count author A Matching Keyword-to-Attribute Mapping σ SELECTION Projection Φ Φ ,σ Aggregation Φ π Φ Φ Φ σ id id name title booktitle year Paper Author SQL Suggestion, ICDE 2011
SQL Generation Model ∑m∈M ρ(k,A) I(A) S(M)= database count author A Matching σ π ρ(k,A) the degree of a mapping Φ I(A): the importance of mapped attributes (Entropy) id id name title booktitle year Paper Author SQL Suggestion, ICDE 2011
Best SQL Query Generation ∑m∈M ρ(k,A) I(A) S(M)= • Optimization Problem • Weighted Set Covering Problem (NP-hard) • A greedy approximation algorithm • Extensions • Find Top-k matchings MAX. SQL Suggestion, ICDE 2011
Outline • Motivation • Overview of SQL Query Suggestion • Queryable Template Suggestion • SQL Generation from Templates • Experiments • Conclusion SQL Suggestion, ICDE 2011
Experiment Setup • Data sets • DBLP: More than one million publication records • DBLIFE: Activity information of people in DB comm. • Query sets, E.g., • count author mining (DBLP) • database jim gray (DBLIFE) • Baseline method: DISCOVER-II • User-Study for effectiveness evaluation SQL Suggestion, ICDE 2011
Template Suggestion Precision-Recall Curves on the DBLife data set SQL Suggestion, ICDE 2011
SQL Generation Query: database Jim Gray Precisions on the DBLife data set SQL Suggestion, ICDE 2011
Record Retrieval Query: count author mining • Advantages of SQL Suggestion • Support aggregation functions • Support meta-data matching Precisions on the DBLife data set SQL Suggestion, ICDE 2011
Efficiency Efficiency Comparison (DBLife) SQL Suggestion, ICDE 2011
Scalability Scalability (DBLP) SQL Suggestion, ICDE 2011
Outline • Motivation • Overview of SQL Query Suggestion • Queryable Template Suggestion • SQL Generation from Templates • Experiments • Conclusion SQL Suggestion, ICDE 2011
Conclusion • An effective and user-friendly keyword-based method • Assist users to formulate SQL queries • Suggest templates relevant to keyword queries • Generate SQL queries from templates • Extensive experiments SQL Suggestion, ICDE 2011
Future Work • This study opens many new interesting and challenging problems • Cardinality estimation of suggested SQL queries • Personalized SQL suggestion SQL Suggestion, ICDE 2011
Thanks Demo: http://tastier.cs.tsinghua.edu.cn/sqlsugg My Homepage: http://dbgroup.cs.tsinghua.edu/fanju SQL Suggestion, ICDE 2011
Comparison with Existing Work • CN-Based Methods • Better template ranking • SQL Generation • Aggregation functions, range queries, etc. SQL Suggestion, ICDE 2011