Ontology-Based Free-Form Query Processing for the Semantic Web • Mark Vickers • Brigham Young University • MS Thesis Defense
Presentation Overview • Web Queries • Explanation of AskOntos • Demo • Evaluation • Future Work and Conclusion
Web Queries: Challenges Example: Searching for a car • Cannot specify constraints • Documents returned (usually too many) • Takes time to read through documents • Determine relevance • Find information (price, year, etc.)
Web Queries: Opportunities • Semantic web • Proposed ontology-based framework for making information machine-readable • Uses markup languages to identify information • “[A] search program can look for only those pages that refer to a precise concept…” -Tim Berners-Lee • How should semantic web be searched?
Solution: AskOntos – a Query System for the Semantic Web • Allows free-form queries over semantically annotated pages • Processes queries using information extraction • Returns tables of extracted values
Extraction Ontologies • Object sets (lexical and non-lexical) • Relationship sets • Participation constraints • Primary object set • Aggregation • Generalization/Specialization
Extraction Ontologies: Data Frame • Internal Representation: float • Value Phrase: Value Expression: \s*[$]\s*(\d{1,3})*(\.\d{2})? Left Context: $ • Key Word Phrase: Key Word Expression: ([Pp]rice)|([Cc]ost)| … • Operation Phrase: Operator: > Expression: (more\s*than)|(more\s*costly)|…
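A data frame like the one above can be sketched in a few lines of Python. This is only an illustration: the `DataFrame` class, its field names, and `recognize` are hypothetical, and only the regular-expression alternatives actually shown on the slide are used (the elided "…" alternatives are left out).

```python
import re
from dataclasses import dataclass


@dataclass
class DataFrame:
    """Hypothetical container mirroring the fields listed on the slide."""
    internal_type: type
    value_expr: str
    keyword_expr: str
    operators: dict  # operator symbol -> regex for the phrase that signals it


# Expressions copied from the slide; further alternatives are elided there.
price = DataFrame(
    internal_type=float,
    value_expr=r"\s*[$]\s*(\d{1,3})*(\.\d{2})?",
    keyword_expr=r"([Pp]rice)|([Cc]ost)",
    operators={">": r"(more\s*than)|(more\s*costly)"},
)


def recognize(frame, text):
    """Report which parts of the data frame fire on the given text."""
    value = re.search(frame.value_expr, text)
    return {
        "value": value.group(0).strip() if value else None,
        "keyword": bool(re.search(frame.keyword_expr, text)),
        "operators": [op for op, rx in frame.operators.items()
                      if re.search(rx, text)],
    }


result = recognize(price, "cost more than $500")
print(result)
```

On the phrase "cost more than $500" all three parts fire: the key word "cost", the operator phrase "more than", and the value "$500".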
Step 1. Parse Query “Find me the price and mileage of all red Nissans – I want a 1996 or newer” • Matched words: price, mileage, red, Nissan, 1996 • “or newer” matches the >= operator
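The parsing step can be approximated as follows. This is a minimal sketch, not the AskOntos implementation: the recognizers in `VALUE_EXPRS`, `KEYWORD_EXPRS`, and `OPERATOR_EXPRS` are made-up stand-ins for the much richer data frames of a car-ads extraction ontology.

```python
import re

# Hypothetical, heavily simplified recognizers for a car-ads ontology.
VALUE_EXPRS = {
    "Color": r"\b(red|blue|green|black|white)\b",
    "Make": r"\b(Nissan|Toyota|Honda|Ford)",
    "Year": r"\b(19|20)\d{2}\b",
}
KEYWORD_EXPRS = {
    "Price": r"\b([Pp]rice|[Cc]ost)\b",
    "Mileage": r"\b([Mm]ileage|[Mm]iles)\b",
}
# Operator phrases belong to the object set whose values they constrain.
OPERATOR_EXPRS = {"Year": {">=": r"\bor\s+newer\b"}}


def parse_query(query):
    """Mark the object sets, values, and operators that a query mentions."""
    requested = [name for name, rx in KEYWORD_EXPRS.items()
                 if re.search(rx, query)]
    values = {}
    for name, rx in VALUE_EXPRS.items():
        match = re.search(rx, query)
        if match:
            values[name] = match.group(0)
    operators = {}
    for name, ops in OPERATOR_EXPRS.items():
        for op, rx in ops.items():
            if re.search(rx, query):
                operators[name] = op
    return requested, values, operators


query = "Find me the price and mileage of all red Nissans – I want a 1996 or newer"
requested, values, operators = parse_query(query)
print(requested, values, operators)
```

Key-word matches (price, mileage) name what the user wants returned, value matches (red, Nissan, 1996) become constraints, and the operator phrase refines the Year constraint from equality to >=.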
Step 2. Find Related Ontology “Find me the price and mileage of all red Nissans – I want a 1996 or newer” Similarity value: 2 Similarity value: 5
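One plausible way to rank ontologies is to count how many of each ontology's object sets the query touches. The slide does not spell out how the similarity value is computed, so the counting rule, the `ONTOLOGIES` table, and both function names below are assumptions made for illustration.

```python
import re

# Hypothetical recognizers: each ontology maps object-set names to one regex.
ONTOLOGIES = {
    "car": {
        "Price": r"\b[Pp]rice\b",
        "Mileage": r"\b[Mm]ileage\b",
        "Color": r"\b(red|blue|black|white)\b",
        "Make": r"\b(Nissan|Toyota|Honda)",
        "Year": r"\b(19|20)\d{2}\b",
    },
    "house": {
        "Price": r"\b[Pp]rice\b",
        "Bedrooms": r"\b\d+\s*[Bb]edrooms?\b",
        "YearBuilt": r"\b(19|20)\d{2}\b",
    },
}


def similarity(recognizers, query):
    """Count the object sets of one ontology that match somewhere in the query."""
    return sum(1 for rx in recognizers.values() if re.search(rx, query))


def best_ontology(query):
    """Return the highest-scoring ontology name together with all scores."""
    scores = {name: similarity(rxs, query) for name, rxs in ONTOLOGIES.items()}
    return max(scores, key=scores.get), scores


query = "Find me the price and mileage of all red Nissans – I want a 1996 or newer"
best, scores = best_ontology(query)
print(best, scores)
```

Under this rule the car ontology wins because more of its object sets are mentioned; a real system might weight value matches and key-word matches differently.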
Step 3. Formulate XQuery Expression • Conjunctive and aggregate queries run over selected ontology’s extracted values • Value-phrase-matching words determine conditions • Conditions: • Color = “red” • Make = “Nissan” • Year >= 1996 (the >= operator comes from “or newer”)
Step 3. Formulate XQuery Expression • For • Let • Where • Return
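Turning the recognized conditions into the where and return clauses is mostly string assembly. The sketch below follows the clause shape of the example expression on the Query Translation Metrics slide, where each condition also admits records that lack the value; the function names are hypothetical, and the for and let clauses are omitted because the slides elide their details.

```python
def where_clause(conditions):
    """Join (name, operator, value) conditions conjunctively.

    Each condition also lets through records with no value for that name,
    mirroring the `or empty(...)` pattern in the slides' example expression.
    """
    parts = [f'(${name} {op} "{value}" or empty(${name}))'
             for name, op, value in conditions]
    return "where " + "\n  and ".join(parts)


def return_clause(names):
    """Emit one child element per requested or constrained object set."""
    fields = "\n".join(f"  <{n}>{{${n}}}</{n}>" for n in names)
    return 'return <Record ID="{$id}">\n' + fields + "\n</Record>"


conditions = [("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">=", "1996")]
names = ["Price", "Mileage", "Color", "Make", "Year"]
print(where_clause(conditions))
print(return_clause(names))
```

Keeping the conditions as plain tuples makes it easy to compare a generated query against a hand-written one later.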
Step 4. Run XQuery Expression Over Ontology’s Extracted Data • Uses Qexo 1.7, GNU’s XQuery engine for Java • Orders results according to number of values
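The ordering step can be illustrated independently of the XQuery engine. The sketch assumes that results arrive as dictionaries, that empty values are `None` or empty strings, and that records with more values come first; none of this is stated on the slide, and `order_by_value_count` is a hypothetical name.

```python
def order_by_value_count(records):
    """Sort records so that those with the most non-empty values come first.

    sorted() is stable, so records with equal counts keep their input order.
    """
    return sorted(records,
                  key=lambda rec: sum(1 for v in rec.values() if v),
                  reverse=True)


records = [
    {"Price": "$4,500", "Color": None, "Make": "Nissan", "Year": None},
    {"Price": "$6,200", "Color": "red", "Make": "Nissan", "Year": "1998"},
    {"Price": None, "Color": "red", "Make": "Nissan", "Year": "1997"},
]
ordered = order_by_value_count(records)
print([rec["Price"] for rec in ordered])
```

Because conditions also accept records with missing values, ranking by completeness pushes the best-supported answers to the top of the result table.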
Evaluation of AskOntos • Success Measure: ability to translate free-form queries into formal queries • Extraction ontologies: car ads, house ads, countries, movies, and diamond ads • 3 rounds of testing • 50 queries each (gathered from other CS students) • 1st round discarded due to queries • Minor improvements to the system between rounds
Query Translation Metrics “Find me the price and mileage of all red Nissans – I want a 1996 or newer.” for $doc in document("file:///.../Car.OWL")/rdf:RDF for $Record in $doc/owl:Thing … where ($Color="red" or empty($Color)) and ($Make="Nissan" or empty($Make)) and ($Year="1996" or empty($Year)) return <Record ID="{$id}"> <Price>{$Price}</Price> <Color>{$Color}</Color> <Make>{$Make}</Make> <Year>{$Year}</Year> </Record> • Human conversion: Return-Clause Names: {Price, Mileage, Color, Make, Year}; Conditions: {(Color,=,“red”), (Make,=,“Nissan”), (Year,>=,“1996”)} • Automated conversion: Return-Clause Names: {Price, Color, Make, Year}; Conditions: {(Color,=,“red”), (Make,=,“Nissan”), (Year,=,“1996”)}
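Comparing the two conversions can be done with set overlap. The sketch below assumes recall and precision as the scoring scheme, which the slide does not state explicitly; the sets are the ones listed above, and `recall_precision` is a hypothetical helper.

```python
def recall_precision(human, automated):
    """Score an automated conversion against the human one using set overlap."""
    correct = len(human & automated)
    recall = correct / len(human) if human else 1.0
    precision = correct / len(automated) if automated else 1.0
    return recall, precision


human_names = {"Price", "Mileage", "Color", "Make", "Year"}
auto_names = {"Price", "Color", "Make", "Year"}

human_conds = {("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">=", "1996")}
auto_conds = {("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", "=", "1996")}

names_score = recall_precision(human_names, auto_names)
conds_score = recall_precision(human_conds, auto_conds)
print(names_score, conds_score)
```

In this example the automated conversion misses Mileage (4 of 5 names recovered, none spurious) and gets the Year operator wrong (2 of 3 conditions correct, 1 of 3 produced conditions incorrect).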
Result Analysis Common reasons for errors: 1. Word not in lexicon: “5 Bedrooms, 3 Bath, study, game room, 2 car garage, and < $250,000”
Result Analysis 2. Mistakes in regular expressions “Which countries use the euro?”
Result Analysis 3. Not enough context: “What are the models from 2005”
Conclusion/Contributions • AskOntos • Is a free-form query system for the semantic web • Applies information extraction for query processing • Answers questions with extracted data values • Contributions • Web queries that use semantic annotations • Web queries returning answers from extracted data • Processing free-form queries using ontologies
Future Work • Disjunction and negation • Fuzzy queries • Spellchecker
Simple Multiple-Record Documents Genealogy Domain – from Troy Walker’s thesis Highest-Fanout Separator VSM Separator
Scaling to the Web • Ontologies crawl and harvest web pages • Ontologies extract values from pages • Ontologies indexed • Queries extracted by relevant ontologies • Rely on Google-like technology