Google Query Language -- a DSL for Advanced Google Searching. Xiaoqing Wu. Advisor: Dr. Barrett R. Bryant. Department of Computer and Information Science. 03/04/2005.
Background • PhD research: Compiler Development Environment (CDE) • Automatic generation of compilers, interpreters, and integrated development environments • Several domain-specific languages (DSLs) have been developed on top of CDE • GQL: an application based on CDE • Analogy: Internet -- database • Google -- database management system (DBMS) • GQL -- Structured Query Language (SQL)
Google: more than keyword searching • Language preference • File format, date, occurrences, domain • Image, forum, shopping search
Query customization in Google • Filling forms • Writing meta-tokens directly • allintext: Xiaoqing Wu filetype:pdf
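As a rough illustration of the meta-token approach, the query on this slide can be assembled programmatically. This is a minimal sketch, not part of GQL: the helper names `build_query` and `search_url` are hypothetical, and the only operators used are Google's public `allintext:`, `filetype:`, and `site:` tokens.

```python
from urllib.parse import urlencode

def build_query(keywords, filetype=None, site=None, scope=None):
    """Assemble a Google query string from meta-tokens (hypothetical helper)."""
    parts = []
    if scope:                       # e.g. "allintext", "allintitle", "allinurl"
        parts.append(f"{scope}:")
    parts.extend(keywords)
    if filetype:
        parts.append(f"filetype:{filetype}")
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)

def search_url(query):
    """URL-encode the query for the standard Google search endpoint."""
    return "https://www.google.com/search?" + urlencode({"q": query})

query = build_query(["Xiaoqing", "Wu"], filetype="pdf", scope="allintext")
print(query)              # allintext: Xiaoqing Wu filetype:pdf
print(search_url(query))
```

Even in this small form, nothing stops a caller from passing a misspelled scope or an unsupported file type, which is the gap the following slides attribute to raw meta-tokens.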
Why GQL (I)? • Forms are not flexible • Fixed • Can’t be saved and reused • Filling multiple forms is time-consuming • Mouse operation is slower than keyboard operation
Why GQL (II)? • Meta-tokens are not designed for end-users • Not user-friendly • No syntax provided • No type-checking • Ambiguous, e.g. keyword1 keyword3 OR keyword4 "keyword2"
GQL: A well-formed DSL • User-friendly grammar • Natural, SQL-like syntax rules, easy to follow • No ambiguity • IDE support • Automatic syntax and type checking • Program-based queries • Queries can be saved and reused • Search within old queries • Flexible: numerous forms!
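The kind of type checking mentioned on this slide could look roughly like the sketch below. The slides do not list the legal values for occurrences or file types, so both value sets here are assumptions, as is the function name `check_query`.

```python
# Hypothetical value sets: the slides do not enumerate the legal values.
OCC_VALUES = {"anywhere", "title", "text", "url", "links"}
TYPE_VALUES = {"pdf", "ps", "doc", "xls", "ppt", "rtf"}

def check_query(occurrence, filetypes):
    """Return a list of type errors for a parsed query (empty if well-typed)."""
    errors = []
    if occurrence is not None and occurrence not in OCC_VALUES:
        errors.append(f"unknown occurrence value: {occurrence!r}")
    for ft in filetypes:
        if ft not in TYPE_VALUES:
            errors.append(f"unknown file type: {ft!r}")
    return errors

print(check_query("title", ["pdf"]))          # []
print(check_query("header", ["pdf", "exe"]))  # two errors reported
```

Reporting all errors at once, rather than stopping at the first, matches how an IDE would typically surface diagnostics before a query is executed.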
No more forms! search {key}* from file where {constraint}*
GQL Syntax Grammar
(A trailing | denotes an empty alternative.)
[1] query ::= (SEARCH | IMAGE) o_keylist occurrence constraints withinstmt
[2] o_keylist ::= keylist |
[3] keylist ::= key | keylist COMMA key
[4] key ::= word | noword | orwordlist | exactword
[5] word ::= STRING
[6] noword ::= NOT word
[7] orwordlist ::= orword OR orword | orwordlist OR orword
[8] orword ::= word | exactword
[9] exactword ::= QSTRING
[10] occurrence ::= FROM OCCVALUE |
[11] constraints ::= WHERE constraintlist |
[12] constraintlist ::= constraint | constraintlist constraint
[13] constraint ::= domain | filetype
[14] domain ::= indomain | outdomain
[15] indomain ::= DOMAIN EQ url
[16] outdomain ::= DOMAIN NE url
[17] url ::= QSTRING
[18] filetype ::= acceptfiletype | rejectfiletype
[19] acceptfiletype ::= TYPE EQ TYPEVALUE
[20] rejectfiletype ::= TYPE NE TYPEVALUE
[21] withinstmt ::= WITHIN QSTRING |
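To make the grammar concrete, here is a minimal recursive-descent sketch that parses it and emits a Google meta-token query. It is not the CDE-generated compiler. Several things are assumptions, since the slides give only token names: the lexemes (`search`, `image`, `not`, `or`, `from`, `where`, `domain`, `type`, `within`, `=`, `!=`, `,`), the occurrence values in `OCC_PREFIX`, and the mapping onto Google's `site:`, `filetype:`, and `allin...:` operators. The WITHIN string is returned untranslated because its semantics are not specified here.

```python
import re

TOKEN = re.compile(r'\s*(?:"([^"]*)"|(!=|=|,)|([^\s,="!]+))')
KEYWORDS = {"search", "image", "not", "or", "from", "where", "domain", "type", "within"}
# Assumed OCCVALUE spellings and their Google operators.
OCC_PREFIX = {"title": "allintitle:", "text": "allintext:", "url": "allinurl:"}

def tokenize(src):
    """Split source text into (kind, lexeme) pairs."""
    tokens, pos, src = [], 0, src.strip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.group(1) is not None:
            tokens.append(("QSTRING", m.group(1)))
        elif m.group(2):
            tokens.append((m.group(2), m.group(2)))
        else:
            w = m.group(3)
            tokens.append((w.lower(), w) if w.lower() in KEYWORDS else ("STRING", w))
        pos = m.end()
    return tokens

def translate(src):
    """Parse one query; return (mode, google_query, within)."""
    toks = tokenize(src) + [("EOF", "")]
    i = 0

    def peek():
        return toks[i][0]

    def take(kind):
        nonlocal i
        if toks[i][0] != kind:
            raise SyntaxError(f"expected {kind}, found {toks[i][1]!r}")
        i += 1
        return toks[i - 1][1]

    def orword():                                    # rules [5], [8], [9]
        if peek() == "QSTRING":
            return '"' + take("QSTRING") + '"'
        return take("STRING")

    def key():                                       # rules [4], [6], [7]
        if peek() == "not":
            take("not")
            return "-" + take("STRING")
        words = [orword()]
        while peek() == "or":
            take("or")
            words.append(orword())
        return " OR ".join(words)

    mode = (take("image") if peek() == "image" else take("search")).lower()  # rule [1]
    parts = []
    if peek() in ("STRING", "QSTRING", "not"):       # rules [2], [3]
        parts.append(key())
        while peek() == ",":
            take(",")
            parts.append(key())
    if peek() == "from":                             # rule [10]
        take("from")
        occ = take("STRING").lower()
        if occ not in OCC_PREFIX:
            raise ValueError(f"unknown occurrence value: {occ!r}")
        parts.insert(0, OCC_PREFIX[occ])
    if peek() == "where":                            # rules [11]-[20]
        take("where")
        while True:                                  # at least one constraint
            field = (take("domain") if peek() == "domain" else take("type")).lower()
            op = take("!=") if peek() == "!=" else take("=")
            sign = "-" if op == "!=" else ""
            if field == "domain":
                parts.append(f"{sign}site:{take('QSTRING')}")
            else:
                parts.append(f"{sign}filetype:{take('STRING')}")
            if peek() not in ("domain", "type"):
                break
    within = None
    if peek() == "within":                           # rule [21]
        take("within")
        within = take("QSTRING")
    take("EOF")
    return mode, " ".join(parts), within

print(translate('search Xiaoqing, Wu, "compiler" or interpreter, not java '
                'from text where type = pdf domain != "example.com"'))
```

The left-recursive list rules ([3], [7], [12]) are implemented as loops, which accepts the same language while staying suitable for a hand-written top-down parser.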
Current status • Basic GQL compiler • IDE supporting multiple document management • Program storage • Editing • Compiling, type-checking and execution • Functionality including all features of Google web & image search • Search within old queries
Future work • Extending the grammar to implement all the functionality provided by Google • Adding stricter type-checking for source programs written in GQL • Search result integration
Conclusion • To provide more flexibility in online search, a SQL-like query language was developed for the Google query domain. • GQL programs replace the query forms provided by Google, analogous to how SQL relates to the query forms in a DBMS such as MS Access. • The idea could be generalized to other domains, especially in online searching, e.g. airfare searching.