Facilitating Semantic Web Search with Embedded Grammar Tags (EGTs)
Gautham K. Dorai, Yaser Yacoob
Department of Computer Science, University of Maryland – College Park
The Future – A Forecast
• Query: "What is the value of Nasdaq today?"
• Response: "The value is 10746.2!"
[Diagram labels: Speech Grammar based Search Engine, WWW, ???]
Roadmap
• Forecast
• Problem Statement and Our Solution
• Related Work
• Demonstration
• Summary
• Future Work
Problem Statement (1)
• Web content is represented for human consumption
• Software agents do not have interpretive tools for semantic information recovery
• Hence agents cannot understand web content
Problem Statement (2)
• Web content does not support queries by natural language interaction
• E.g., the query "What is the weather at College Park?" leads only to links on related subject content, not to the answer itself
Our Solution (1)
• We embed natural language queries in the web content
• Embedded Grammar Tags (EGTs) represent queries in a general (parseable) format
• The relevant response is discovered by matching the user's query against the EGTs
EGT – The Big Picture
[Diagram labels: HTML Page, EGT Annotation, Internet, Query, NLP, EGT Search, Web Search Engines]
Our Solution (2)
• EGTs use the general BNF grammar format to represent queries
• E.g.: <ROBOTGRAM-IN> * [is] the temperature [is] at College Park </ROBOTGRAM-IN>
• Captures queries such as:
  - What is the temperature at College Park?
  - Tell me what the temperature is at College Park?
Our Solution (3)
• EGT structure:
  - *  can be replaced by any word or set of words
  - () mandatory words for an EGT match
  - [] optional words
• Web content is annotated with EGTs, e.g.:
  <td valign="bottom"><span class="redtemps"> <ROBOTGRAM-IN> * [is] the weather [is] at College Park [is] </ROBOTGRAM-IN> mostly sunny
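The three operators above can be illustrated with a small matcher. This is only a sketch, not the authors' implementation: it assumes an EGT can be translated into a regular expression over lower-cased, punctuation-free words, and the names `compile_egt`, `normalise`, and `egt_matches` are hypothetical. Named slots such as `<city>` are not handled.

```python
import re

# One grammar token: an optional group [...], a mandatory group (...), or a bare word / "*".
TOKEN = re.compile(r"\[[^\]]*\]|\([^)]*\)|\S+")
# "*" stands for any run of words, possibly empty. Every word carries one trailing space.
ANY_WORDS = r"(?:\S+ )*?"


def _words(text):
    """Turn space-separated grammar words into a regex fragment."""
    return "".join(ANY_WORDS if w == "*" else re.escape(w) + " " for w in text.split())


def compile_egt(egt):
    """Translate an EGT body into a compiled regular expression (assumed semantics)."""
    parts = []
    for tok in TOKEN.findall(egt.lower()):
        if tok.startswith("["):    # optional words: may be skipped
            parts.append("(?:" + _words(tok[1:-1]) + ")?")
        elif tok.startswith("("):  # mandatory: exactly one alternative must appear
            alts = "|".join(_words(alt) for alt in tok[1:-1].split("|"))
            parts.append("(?:" + alts + ")")
        else:                      # literal word or "*"
            parts.append(_words(tok))
    return re.compile("".join(parts))


def normalise(query):
    """Lower-case the query, drop punctuation, and end every word with one space."""
    return "".join(w + " " for w in re.findall(r"[a-z0-9]+", query.lower()))


def egt_matches(egt, query):
    """Return True if the whole query is covered by the EGT."""
    return compile_egt(egt).fullmatch(normalise(query)) is not None


TEMPERATURE = "* [is] the temperature [is] at College Park"
print(egt_matches(TEMPERATURE, "What is the temperature at College Park ?"))          # True
print(egt_matches(TEMPERATURE, "Tell me what the temperature is at College Park ?"))  # True
print(egt_matches(TEMPERATURE, "What is the humidity at College Park ?"))             # False
```

Requiring the whole query to match (rather than a substring) keeps unrelated questions that merely mention "College Park" from triggering the tag. A real matcher might relax this or score partial matches.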
Our Solution (4)
• EGTs, more examples:
  <span class="FSnPSmDk"> Wind: </span> <ROBOTGRAM-IN> * [is the] [wind] (speed|velocity) [of the] [wind] [is] at <city> [is] </ROBOTGRAM-IN> 3mph
  <td align="RIGHT" bgcolor="#DDDDDD"><font face="arial, helvetica, sans-serif" size="2" class="mkcharttxt"> <ROBOTGRAM-IN> * [is] [Nasdaq *] [the] (value|quote|price) [of Nasdaq] * </ROBOTGRAM-IN> 2222.42 </font></td>
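In the annotated fragments on these slides, the answer text ("mostly sunny", "3mph", "2222.42") sits directly after the closing `</ROBOTGRAM-IN>` tag. A rough sketch of how an agent might pull out each grammar together with its answer is given below. It assumes the answer is always the text between the closing tag and the next HTML tag, which the slides suggest but do not state; `extract_egts` and `PAGE` are hypothetical names.

```python
import re

# Grammar body between the ROBOTGRAM-IN tags, then the text up to the next "<".
# Assumption: that trailing text is the response associated with the grammar.
EGT_BLOCK = re.compile(
    r"<ROBOTGRAM-IN>(.*?)</ROBOTGRAM-IN>([^<]*)",
    re.DOTALL | re.IGNORECASE,
)


def extract_egts(html):
    """Return (grammar, response) pairs found in an EGT-annotated page."""
    return [
        (" ".join(grammar.split()), " ".join(response.split()))  # collapse whitespace
        for grammar, response in EGT_BLOCK.findall(html)
    ]


PAGE = """
<td valign="bottom"><span class="redtemps">
<ROBOTGRAM-IN> * [is] the weather [is] at College Park [is]</ROBOTGRAM-IN> mostly sunny
</span></td>
<td align="RIGHT"><font size="2">
<ROBOTGRAM-IN> * [is] [Nasdaq *] [the] (value|quote|price) [of Nasdaq] * </ROBOTGRAM-IN> 2222.42
</font></td>
"""

for grammar, response in extract_egts(PAGE):
    print(grammar, "->", response)
```

A regular expression is enough here because the grammar tags are not nested. A page with malformed or nested markup would need a proper HTML parser.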
Related Work (1)
• Natural Language Processing (NLP) – attempts to uncover meaning in HTML content
• DAML, RDF, SHOE, XML – add metadata to describe the web content
  - Facilitate more efficient content search
Related Work (2)
• E.g.:
  <v:Email> gautham@engineer.com </v:Email>
  <v:Name> Gautham Dorai </v:Name>
  - Special tags (<v:Email>, <v:Name>) describe the content
• Fine-grained natural language queries on the content require an expandable, universally available tag database
Why EGTs? (1)
• RDF triples can also be used, e.g.:
  College Park - [weatherat] mostly sunny
  Nasdaq - [quoteof] 2222.42
• But EGTs are naturally expandable and more amenable to change
• EGTs reduce search engine complexity
Why EGTs? (2)
• EGTs describe content in an unconstrained format
• EGTs are already present in speech recognition technology
• Ease of transition from visualphone browsers
Demonstration
• We annotate a given home page with EGTs
• The user can query the content in natural language
• The search engine parses the web page for an EGT match and responds
EGT Annotator (Preliminary)
• Create a template page that is EGT-ready, i.e., EGTs are transparent to the user
• The template is for home pages at the CS Dept.
• The user can simply copy information from the HTML page onto the annotator
Summary
• EGTs enable software agents to respond to natural language queries
• An EGT search engine can be implemented on top of conventional content search engines
• Responses are constructed from the information extracted from an EGT match
Future Work
• Expandable Universal Grammar
  - Universally available query grammar packages
• EGT Recognition Metrics
  - Statistical analysis to search for EGT matches
• EGT Crawler
  - Crawler that parses through EGT-annotated web pages