Explore fundamental questions about knowledge, facts, and reasoning. Learn how to distill knowledge from digital web data, extract information from web pages, and turn raw symbols into knowledge. Discover how to find specific data, such as the price and mileage of red Nissans from 1990 or newer models. Delve into ontology, epistemology, and logic to enhance your understanding of information retrieval and data extraction techniques.
WoK: A Web of Knowledge David W. Embley Brigham Young University Provo, Utah, USA
A Web of Pages A Web of Facts • Birthdate of my great grandpa Orson • Price and mileage of red Nissans, 1990 or newer • Location and size of chromosome 17 • US states with property crime rates above 1%
Toward a Web of Knowledge • Fundamental questions • What is knowledge? • What are facts? • How does one know? • Philosophy • Ontology • Epistemology • Logic and reasoning
Ontology • Existence asks “What exists?” • Concepts, relationships, and constraints
Epistemology • The nature of knowledge asks: “What is knowledge?” and “How is knowledge acquired?” • Populated conceptual model
Logic and reasoning • Principles of valid inference – asks: “What is known?” and “What can be inferred?” • For us, it answers: what can be inferred (in a formal sense) from conceptualized data. Find price and mileage of red Nissans, 1990 or newer
Making this Work: How? • Distill knowledge from the wealth of digital web data • Annotate web pages [Figure: annotations on web pages, each yielding facts]
Turning Raw Symbols into Knowledge • Symbols: $ 11,500 117K Nissan CD AC • Data: price(11,500) mileage(117K) make(Nissan) • Conceptualized data: • Car(C123) has Price($11,500) • Car(C123) has Mileage(117,000) • Car(C123) has Make(Nissan) • Car(C123) has Feature(AC) • Knowledge • “Correct” facts • Provenance
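The progression from symbols to data to conceptualized data can be sketched in Python. This is only an illustration, not the author's system: the lookup sets, the `Fact` class, the function names, and the source URL are all hypothetical, and nothing here checks whether a fact is "correct"; it only records provenance alongside each fact.

```python
import re
from dataclasses import dataclass

# Hypothetical lookup lists; a real extraction ontology carries richer recognizers.
KNOWN_MAKES = {"Nissan", "Toyota", "Honda"}
KNOWN_FEATURES = {"AC", "CD"}

def symbols_to_data(symbols):
    """Symbols -> data: label each raw symbol with the kind of value it is."""
    data = {"Feature": []}
    for s in symbols:
        if re.fullmatch(r"\$\s*\d{1,3}(,\d{3})*", s):
            data["Price"] = int(re.sub(r"[^\d]", "", s))  # "$ 11,500" -> 11500
        elif re.fullmatch(r"\d+K", s):
            data["Mileage"] = int(s[:-1]) * 1000          # "117K" -> 117000
        elif s in KNOWN_MAKES:
            data["Make"] = s
        elif s in KNOWN_FEATURES:
            data["Feature"].append(s)
    return data

@dataclass
class Fact:
    subject: str   # non-lexical object, e.g. the car C123
    relation: str  # e.g. "Price"
    value: object
    source: str    # provenance: where the fact was extracted from

def conceptualize(car_id, data, source):
    """Data -> conceptualized data: attach every value to one car object."""
    facts = []
    for name, value in data.items():
        values = value if isinstance(value, list) else [value]
        facts.extend(Fact(car_id, name, v, source) for v in values)
    return facts

symbols = ["$ 11,500", "117K", "Nissan", "CD", "AC"]
data = symbols_to_data(symbols)
facts = conceptualize("C123", data, "https://example.com/ad/123")  # made-up URL
for f in facts:
    print(f"Car({f.subject}) has {f.relation}({f.value})  [from {f.source}]")
```

Keeping the source on every fact is what lets a user trace an answer back to the page it came from.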
Actualization (with Extraction Ontologies) Find me the price and mileage of all red Nissans – I want a 1990 or newer.
Explanation: How it Works • Extraction Ontologies • Semantic Annotation • Free-Form Query Interpretation
Extraction Ontologies • Object sets (lexical, non-lexical, primary object set) • Relationship sets • Participation constraints • Aggregation • Generalization/Specialization
Extraction Ontologies • Data Frame • Internal Representation: float • Values • External Rep.: \s*[$]\s*(\d{1,3})*(\.\d{2})? • Left Context: $ • Key Word Phrase • Key Words: ([Pp]rice)|([Cc]ost)| … • Operators • Operator: > • Key Words: (more\s*than)|(more\s*costly)|…
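A data frame of this kind can be approximated with ordinary regular expressions. The sketch below is a minimal illustration under stated assumptions: the value pattern is an adapted variant of the slide's external representation (it adds comma grouping so that "$ 11,500" matches in full), only the two keyword alternatives shown on the slide are included, and the function names are hypothetical.

```python
import re

# Assumed variant of the slide's external representation, with comma grouping added.
PRICE_VALUE = re.compile(r"\$\s*(\d+(?:,\d{3})*(?:\.\d{2})?)")
# Keyword phrases from the slide; the slide's list continues beyond these two.
PRICE_KEYWORDS = re.compile(r"([Pp]rice)|([Cc]ost)")
# Keyword phrases for the ">" operator, again only those shown on the slide.
GREATER_THAN = re.compile(r"(more\s*than)|(more\s*costly)")

def recognize_price(text):
    """Return the first price in text as a float (the internal representation)."""
    m = PRICE_VALUE.search(text)
    if m is None:
        return None
    return float(m.group(1).replace(",", ""))

def mentions_price(text):
    """True if a price keyword appears, even when no value is given."""
    return PRICE_KEYWORDS.search(text) is not None

def recognize_operator(text):
    """Return '>' if a greater-than keyword phrase appears, else None."""
    return ">" if GREATER_THAN.search(text) else None

print(recognize_price("Asking price: $ 11,500 firm"))
print(recognize_operator("anything more than $5,000"))
```

Separating the external representation (the text pattern) from the internal one (a float) is what makes comparisons such as "more than" computable once a value has been recognized.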
Free-Form Query Interpretation • Parse Free-Form Query (wrt data extraction ontology) • Select Ontology • Formulate Query Expression • Run Query Over Semantically Annotated Data
Parse Free-Form Query “Find me the price and mileage of all red Nissans – I want a 1996 or newer” • Recognized: price, mileage, red, Nissan, 1996 • “or newer”: >= Operator
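Parsing a free-form query against an extraction ontology amounts to running the ontology's recognizers over the query text. A rough sketch follows; the three recognizer tables are hypothetical stand-ins for a car-ads ontology, not the actual one, and `parse_query` is an illustrative name.

```python
import re

# Hypothetical recognizers standing in for a car-ads extraction ontology.
KEYWORDS = {   # a keyword match means the object set is mentioned
    "Price": r"\b(price|cost)\b",
    "Mileage": r"\b(mileage|miles)\b",
}
VALUES = {     # a value match yields a constant for a condition
    "Color": r"\b(red|blue|green|black|white)\b",
    "Make": r"\b(Nissan|Toyota|Honda|Ford)s?\b",
    "Year": r"\b((?:19|20)\d{2})\b",
}
OPERATORS = {  # operator keyword phrases
    ">=": r"\bor\s+newer\b",
    "<=": r"\bor\s+older\b",
}

def parse_query(query):
    """Mark which parts of the query the ontology recognizes."""
    mentioned = [name for name, pat in KEYWORDS.items()
                 if re.search(pat, query, re.I)]
    values = {}
    for name, pat in VALUES.items():
        m = re.search(pat, query, re.I)
        if m:
            values[name] = m.group(1)
    operators = [op for op, pat in OPERATORS.items()
                 if re.search(pat, query, re.I)]
    return {"mentioned": mentioned, "values": values, "operators": operators}

query = "Find me the price and mileage of all red Nissans - I want a 1996 or newer"
parsed = parse_query(query)
print(parsed)
```

Words the ontology does not recognize ("Find me the", "I want a") are simply ignored, which is why the query can be written free-form.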
Select Ontology “Find me the price and mileage of all red Nissans – I want a 1996 or newer”
Formulate Query Expression • Conjunctive queries and aggregate queries • Mentioned object sets are all of interest. • Values and operator keywords determine conditions. • Color = “red” • Make = “Nissan” • Year >= 1996 (>= Operator, from “or newer”)
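The three conditions above form a conjunctive query: a car qualifies only if every condition holds, and the answer returns the mentioned object sets (Price and Mileage). The following sketch runs such a query over a small in-memory list; the records are made-up sample data, and `run_query` is an illustrative helper, not the deck's query engine.

```python
import operator

OPS = {"=": operator.eq, ">=": operator.ge, "<=": operator.le,
       ">": operator.gt, "<": operator.lt}

# Made-up annotated facts, one record per car.
CARS = [
    {"Id": "C1", "Color": "red",  "Make": "Nissan", "Year": 1998, "Price": 5500, "Mileage": 98000},
    {"Id": "C2", "Color": "blue", "Make": "Nissan", "Year": 2001, "Price": 7200, "Mileage": 61000},
    {"Id": "C3", "Color": "red",  "Make": "Nissan", "Year": 1993, "Price": 2100, "Mileage": 154000},
    {"Id": "C4", "Color": "red",  "Make": "Toyota", "Year": 1999, "Price": 6400, "Mileage": 87000},
]

def run_query(records, wanted, conditions):
    """Return the wanted attributes of every record meeting ALL conditions."""
    results = []
    for rec in records:
        if all(OPS[op](rec[attr], value) for attr, op, value in conditions):
            results.append({name: rec[name] for name in wanted})
    return results

conditions = [("Color", "=", "red"), ("Make", "=", "Nissan"), ("Year", ">=", 1996)]
answer = run_query(CARS, ["Price", "Mileage"], conditions)
print(answer)
```

Only C1 satisfies all three conditions: C2 is the wrong color, C3 is too old, and C4 is the wrong make.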
Formulate Query Expression • For • Let • Where • Return
Conclusion & Current & Future Work Key challenge: simplicity • A simple way to annotate web pages • Simple but accurate query specification • A simple way to create extraction ontologies www.deg.byu.edu