Shubha S. Suvarna (ss12an)
Efficient Type-Ahead Search on Relational Data: a Tastier Approach
Guoliang Li, Shengyue Ji, Chen Li, and Jianhua Feng. SIGMOD, 2009.
Contents
• Introduction
• Terminology
• System Architecture
• Indexing Structure
• Improving Efficiency
Introduction
• TASTIER stands for Type-Ahead Search Techniques in Large Data Sets.
• It is a joint research project between Tsinghua University and the University of California, Irvine.
• Aim: develop efficient type-ahead search on large datasets.
• One of its works: Efficient Type-Ahead Search on Relational Data: a Tastier Approach.
Terminologies: Database Graph
• The database relations are represented as a graph G=(V,E), where V is the set of vertices and E is the set of edges.
• The tuples in the relations constitute V.
• Primary key to foreign key relationships constitute E.
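The database graph described above can be sketched in Python. This is only an illustration: the tuples, names, and the `build_graph` helper are hypothetical, and for brevity each Author-Publication row is collapsed into a single edge between an author and a publication rather than kept as a vertex of its own.

```python
from collections import defaultdict

# Hypothetical tuples (not taken from the slides' figure).
authors = {"a1": "ann", "a2": "bob", "a3": "cid"}
publications = {"p1": "type-ahead search", "p2": "graph partition", "p3": "keyword search"}
author_publication = [("a1", "p1"), ("a3", "p1"), ("a1", "p3"), ("a3", "p3"), ("a2", "p2")]
citations = [("p1", "p2")]  # p1 cites p2


def build_graph(authors, publications, author_publication, citations):
    """Return an undirected adjacency map: vertex id -> set of neighbour ids."""
    graph = defaultdict(set)
    # Every tuple becomes a vertex, even if it ends up with no edges.
    for vertex in list(authors) + list(publications):
        graph[vertex]
    # Every key reference becomes an edge (assumption: link rows are edges).
    for left, right in author_publication + citations:
        graph[left].add(right)
        graph[right].add(left)
    return dict(graph)


graph = build_graph(authors, publications, author_publication, citations)
```

An adjacency map keeps neighbour lookups cheap, which is what the later search steps rely on.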
Terminologies (Cont.)
[Figure: example database with the relations Publication, Author, Author-Publication, and Citations]
Terminologies (Cont.): Steiner Tree
• For a given graph G=(V,E), a Steiner tree is a smallest connected subgraph G' of G that covers all the vertices V' matched by the query entered by the user.
• E.g., the Steiner trees for {a1, a3} are {a1, a3, p1} and {a1, a3, p3}.
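For exactly two query vertices, a smallest connecting tree is simply a shortest path, so the example above can be reproduced with a breadth-first search. The sketch below is not the paper's algorithm: the adjacency map is a hypothetical graph chosen so that a1 and a3 share the publications p1 and p3, and `shortest_connections` is an illustrative helper. Larger vertex sets need a real Steiner tree method.

```python
from collections import deque

# Hypothetical graph: authors a1 and a3 co-wrote p1 and p3.
graph = {
    "a1": {"p1", "p3"},
    "a3": {"p1", "p3"},
    "a2": {"p2"},
    "p1": {"a1", "a3"},
    "p2": {"a2"},
    "p3": {"a1", "a3"},
}


def shortest_connections(graph, source, target):
    """Return every shortest path from source to target as a list of vertices."""
    dist = {source: 0}
    parents = {source: []}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in sorted(graph[u]):
            if v not in dist:
                dist[v] = dist[u] + 1
                parents[v] = [u]
                queue.append(v)
            elif dist[v] == dist[u] + 1:
                parents[v].append(u)  # another equally short way to reach v
    if target not in dist:
        return []
    # Walk the parent pointers back from target to rebuild each path.
    paths, stack = [], [[target]]
    while stack:
        path = stack.pop()
        if path[-1] == source:
            paths.append(path[::-1])
        else:
            stack.extend(path + [p] for p in parents[path[-1]])
    return sorted(paths)


trees = shortest_connections(graph, "a1", "a3")
```

Each returned path is the vertex set of one candidate tree; here the two paths pass through p1 and p3 respectively.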
Indexing Structure: Trie
• A tree in which each keyword corresponds to a unique path from the root to a leaf node.
• Each node is labeled with a character from the keyword.
• Each leaf node has a unique keyword ID (assigned in alphabetical order) and holds the inverted list for its keyword.
• Each node is associated with a keyword range [L,W] based on the keywords in its subtree.
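A minimal Python sketch of such a trie follows. The keywords, vertex ids, and the names `TrieNode`, `build_trie`, and `prefix_range` are assumptions made for illustration; the range is stored as a (lowest, highest) keyword-ID pair, which is one plausible reading of the slide's [L,W].

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # character -> child node
        self.keyword_id = None  # set only on the node that ends a keyword
        self.inverted = []      # ids of the vertices containing the keyword
        self.range = None       # (lowest, highest) keyword id in this subtree


def build_trie(keyword_to_vertices):
    """Build a trie, numbering keywords 1..n in alphabetical order."""
    root = TrieNode()
    for kid, word in enumerate(sorted(keyword_to_vertices), start=1):
        node, path = root, [root]
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
            path.append(node)
        # Note: a keyword that is a prefix of another ends on an inner node.
        node.keyword_id = kid
        node.inverted = sorted(keyword_to_vertices[word])
        # Widen the range of every node on the path to include this id.
        for n in path:
            low, high = n.range if n.range else (kid, kid)
            n.range = (min(low, kid), max(high, kid))
    return root


def prefix_range(root, prefix):
    """Return the keyword-id range under prefix, or None if nothing matches."""
    node = root
    for ch in prefix:
        node = node.children.get(ch)
        if node is None:
            return None
    return node.range


# Hypothetical keywords and the vertices that contain them.
trie = build_trie({"graph": ["p1"], "gray": ["a2"], "search": ["p1", "p3"], "sigmod": ["p3"]})
```

Because IDs follow alphabetical order, all completions of a prefix form one contiguous ID range, so a partially typed keyword can be represented by a single pair of numbers.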
Multiple-Keyword Indexing
• A δ-step forward index is used.
• In a multiple-keyword search, the user has entered at least one complete keyword and may be in the process of typing the next one.
• Iteratively, the keywords at distance i from the current vertex are determined in order to find search suggestions.
• δ is assumed to be fixed, and i varies over 0 ≤ i ≤ δ.
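One way to picture the δ-step forward index is as a bounded breadth-first search from every vertex. The sketch below is an assumption-laden illustration, not the paper's implementation: the graph, the keyword sets, and the `forward_index` helper are hypothetical, and keywords are kept as strings instead of keyword IDs.

```python
from collections import deque


def forward_index(graph, keywords, delta):
    """For each vertex v, return a list whose entry i (0 <= i <= delta)
    is the set of keywords in vertices at shortest distance i from v."""
    index = {}
    for start in graph:
        steps = [set() for _ in range(delta + 1)]
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            steps[dist[u]].update(keywords.get(u, ()))
            if dist[u] == delta:
                continue  # do not expand beyond delta steps
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        index[start] = steps
    return index


# Hypothetical chain a1 - p1 - a3 with one keyword per vertex.
graph = {"a1": {"p1"}, "p1": {"a1", "a3"}, "a3": {"p1"}}
keywords = {"a1": {"li"}, "p1": {"search"}, "a3": {"feng"}}
index = forward_index(graph, keywords, delta=2)
```

With such an index, checking whether a vertex can reach a completion of the partially typed keyword within δ steps becomes a lookup rather than a graph traversal at query time.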
Improving Efficiency
1) Graph Partition: partition the database graph into (overlapping) subgraphs. To answer a query:
• Step 1: Identify the subgraphs in which the keyword occurs.
• Step 2: Find suggestions within those subgraphs.
2) Query Prediction: provide keyword suggestions to the user based on the probability that they complete the query.
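The two partition steps can be sketched as follows. Everything here is hypothetical: the partitions, the keywords, and the `candidate_subgraphs` helper are invented for illustration, and a plain scan stands in for whatever index the system actually uses to locate subgraphs.

```python
def candidate_subgraphs(partitions, vertex_keywords, prefix):
    """Step 1: keep only the subgraphs holding a keyword that starts with prefix.
    Step 2 (simplified): report the matching vertices inside each of them."""
    hits = {}
    for subgraph_id, vertices in partitions.items():
        matched = {
            v for v in vertices
            if any(k.startswith(prefix) for k in vertex_keywords.get(v, ()))
        }
        if matched:
            hits[subgraph_id] = matched
    return hits


# Hypothetical overlapping partitions: p1 belongs to both s1 and s2.
partitions = {"s1": {"a1", "p1"}, "s2": {"p1", "a3"}, "s3": {"a2", "p2"}}
vertex_keywords = {
    "a1": {"li"}, "a2": {"ji"}, "a3": {"feng"},
    "p1": {"search"}, "p2": {"graph"},
}
hits = candidate_subgraphs(partitions, vertex_keywords, "sea")
```

Only the subgraphs returned here need to be searched for suggestions; the rest of the database graph is skipped.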
References
• http://tastier.ics.uci.edu/
• http://www.ics.uci.edu/~chenli/pub/sigmod2009-tastier.pdf
• http://www.ics.uci.edu/~chenli/pub/sigmod2009-tastier.pptx