Semantic Understanding

Semantic Understanding: An Approach Based on Information Extraction Ontologies. David W. Embley, Brigham Young University. Funded in part by the National Science Foundation. Presentation outline: Grand Challenge; Meaning, Knowledge, Information, Data; Fun and Games with Data

Presentation Transcript


  1. Semantic Understanding An Approach Based on Information Extraction Ontologies David W. Embley Brigham Young University Funded in part by the National Science Foundation

  2. Presentation Outline • Grand Challenge • Meaning, Knowledge, Information, Data • Fun and Games with Data • Information Extraction Ontologies • Applications • Limitations and Pragmatics • Summary and Challenges

  3. Grand Challenge Semantic Understanding Can we quantify & specify the nature of this grand challenge?

  4. Grand Challenge Semantic Understanding “If ever there were a technology that could generate trillions of dollars in savings worldwide …, it would be the technology that makes business information systems interoperable.” (Jeffrey T. Pollock, VP of Technology Strategy, Modulant Solutions)

  5. Grand Challenge Semantic Understanding “The Semantic Web: … content that is meaningful to computers [and that] will unleash a revolution of new possibilities … Properly designed, the Semantic Web can assist the evolution of human knowledge …” (Tim Berners-Lee, …, Weaving the Web)

  6. Grand Challenge Semantic Understanding “20th Century: Data Processing. 21st Century: Data Exchange. The issue now is mutual understanding.” (Stefano Spaccapietra, Editor in Chief, Journal on Data Semantics)

  7. Grand Challenge Semantic Understanding “The Grand Challenge [of semantic understanding] has become mission critical. Current solutions … won’t scale. Businesses need economic growth dependent on the web working and scaling (cost: $1 trillion/year).” (Michael Brodie, Chief Scientist, Verizon Communications)

  8. What is Semantic Understanding? Semantics: “The meaning or the interpretation of a word, sentence, or other language form.” Understanding: “To grasp or comprehend [what’s] intended or expressed.” - Dictionary.com

  9. Can We Achieve Semantic Understanding? “A computer doesn’t truly ‘understand’ anything.” … But computers can manipulate terms “in ways that are useful and meaningful to the human user.” - Tim Berners-Lee Key Point: it only has to be good enough. And that’s our challenge and our opportunity!

  10. Presentation Outline • Grand Challenge • Meaning, Knowledge, Information, Data • Fun and Games with Data • Information Extraction Ontologies • Applications • Limitations and Pragmatics • Summary and Challenges

  11. Meaning • Knowledge • Information • Data Information Value Chain Translating data into meaning

  12. Foundational Definitions • Meaning: knowledge that is relevant or activates • Knowledge: information with a degree of certainty or community agreement • Information: data in a conceptual framework • Data: attribute-value pairs - Adapted from [Meadow92]

  13. Foundational Definitions • Meaning: knowledge that is relevant or activates • Knowledge: information with a degree of certainty or community agreement (ontology) • Information: data in a conceptual framework • Data: attribute-value pairs - Adapted from [Meadow92]

  16. Data • Attribute-Value Pairs • Fundamental for information • Thus, fundamental for knowledge & meaning

  17. Data • Attribute-Value Pairs • Fundamental for information • Thus, fundamental for knowledge & meaning • Data Frame • Extensive knowledge about a data item • Everyday data: currency, dates, time, weights & measures • Textual appearance, units, context, operators, I/O conversion • Abstract data type with an extended framework
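A data frame as described on this slide can be pictured as an abstract data type that also knows how its values look in text, which words signal them, and how to convert them. The following is a minimal Python sketch under stated assumptions, not the author's implementation: the class `PriceDataFrame`, its field names, the price pattern, and the keyword list are all hypothetical and chosen only for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class PriceDataFrame:
    """Hypothetical data frame for a Price data item (illustration only)."""
    value_pattern: str = r"\$\d{1,3}(?:,\d{3})*"   # textual appearance (assumed)
    context_keywords: tuple = ("price", "asking")   # context cues (assumed)
    unit: str = "USD"                               # unit of the stored value

    def recognize(self, text):
        """Return every substring that looks like a price."""
        return re.findall(self.value_pattern, text)

    def has_context(self, text):
        """True if any context keyword appears in the text."""
        return any(re.search(rf"\b{re.escape(k)}\b", text, re.IGNORECASE)
                   for k in self.context_keywords)

    def to_internal(self, raw):
        """Input conversion: strip everything except digits."""
        return int(re.sub(r"\D", "", raw))

    def to_external(self, value):
        """Output conversion: format with a dollar sign and thousands separators."""
        return f"${value:,}"

    def less_than(self, a, b):
        """Operator: compare two textual prices by their internal values."""
        return self.to_internal(a) < self.to_internal(b)


frame = PriceDataFrame()
ad = "Price $33,000 Phone (916) 972-9117"
raw = frame.recognize(ad)
value = frame.to_internal(raw[0])
```

The point of the sketch is only that recognition, context, conversion, and operators travel together with the type, which is what distinguishes a data frame from a bare attribute-value pair.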

  18. Presentation Outline • Grand Challenge • Meaning, Knowledge, Information, Data • Fun and Games with Data • Information Extraction Ontologies • Applications • Limitations and Pragmatics • Summary and Challenges

  19. ? Olympus C-750 Ultra Zoom Sensor Resolution: 4.2 megapixels Optical Zoom: 10 x Digital Zoom: 4 x Installed Memory: 16 MB Lens Aperture: F/8-2.8/3.7 Focal Length min: 6.3 mm Focal Length max: 63.0 mm

  22. ? Olympus C-750 Ultra Zoom Sensor Resolution 4.2 megapixels Optical Zoom 10 x Digital Zoom 4 x Installed Memory 16 MB Lens Aperture F/8-2.8/3.7 Focal Length min 6.3 mm Focal Length max 63.0 mm

  23. Digital Camera Olympus C-750 Ultra Zoom Sensor Resolution: 4.2 megapixels Optical Zoom: 10 x Digital Zoom: 4 x Installed Memory: 16 MB Lens Aperture: F/8-2.8/3.7 Focal Length min: 6.3 mm Focal Length max: 63.0 mm

  24. ? Year 2002 Make Ford Model Thunderbird Mileage 5,500 miles Features Red ABS 6 CD changer keyless entry Price $33,000 Phone (916) 972-9117

  28. Car Advertisement Year 2002 Make Ford Model Thunderbird Mileage 5,500 miles Features Red ABS 6 CD changer keyless entry Price $33,000 Phone (916) 972-9117

  29. ?
      Flight #   Class  From  Time/Date          To   Time/Date          Stops
      Delta 16   Coach  JFK   6:05 pm 02 01 04   CDG  7:35 am 03 01 04   0
      Delta 119  Coach  CDG   10:20 am 09 01 04  JFK  1:00 pm 09 01 04   0

  31. Airline Itinerary
      Flight #   Class  From  Time/Date          To   Time/Date          Stops
      Delta 16   Coach  JFK   6:05 pm 02 01 04   CDG  7:35 am 03 01 04   0
      Delta 119  Coach  CDG   10:20 am 09 01 04  JFK  1:00 pm 09 01 04   0

  32. ? Monday, October 13th
      Group A      W  L  T  GF  GA  Pts.
      USA          3  0  0  11   1  9
      Sweden       2  1  0   5   3  6
      North Korea  1  2  0   3   4  3
      Nigeria      0  3  0   0  11  0
      Group B      W  L  T  GF  GA  Pts.
      Brazil       2  0  1   8   2  7
      …

  34. World Cup Soccer Monday, October 13th
      Group A      W  L  T  GF  GA  Pts.
      USA          3  0  0  11   1  9
      Sweden       2  1  0   5   3  6
      North Korea  1  2  0   3   4  3
      Nigeria      0  3  0   0  11  0
      Group B      W  L  T  GF  GA  Pts.
      Brazil       2  0  1   8   2  7
      …

  35. ? Calories 250 cal Distance 2.50 miles Time 23.35 minutes Incline 1.5 degrees Speed 5.2 mph Heart Rate 125 bpm

  38. Treadmill Workout Calories 250 cal Distance 2.50 miles Time 23.35 minutes Incline 1.5 degrees Speed 5.2 mph Heart Rate 125 bpm

  39. ? Place Bonnie Lake County Duchesne State Utah Type Lake Elevation 10,000 feet USGS Quad Mirror Lake Latitude 40.711ºN Longitude 110.876ºW

  42. Maps Place Bonnie Lake County Duchesne State Utah Type Lake Elevation 10,100 feet USGS Quad Mirror Lake Latitude 40.711ºN Longitude 110.876ºW

  43. Presentation Outline • Grand Challenge • Meaning, Knowledge, Information, Data • Fun and Games with Data • Information Extraction Ontologies • Applications • Limitations and Pragmatics • Summary and Challenges

  44. Information Extraction Ontologies Source Target Information Extraction Information Exchange

  45. What is an Extraction Ontology? • Augmented Conceptual-Model Instance • Object & relationship sets • Constraints • Data frame value recognizers • Robust Wrapper (Ontology-Based Wrapper) • Extracts information • Works even when site changes or when new sites come on-line

  46. CarAds Extraction Ontology
      <ObjectSet x="329" y="51" lexical="true" name="Mileage" id="osmx50">
        <DataFrame>
          <InternalRepresentation>
            <DataType typeName="String"/>
          </InternalRepresentation>
          <ValuePhraseList>
            <ValuePhrase hint="Mileage Pattern 1">
              <ValueExpression color="ffffff">
                <ExpressionText>[1-9]\d{0,2}[kK]</ExpressionText>
              </ValueExpression>
              <LeftContextExpression color="ffffff">
              …
          <KeywordPhraseList>
            <KeywordPhrase hint="New phrase 1">
              <KeywordExpression color="ffffff">
                <ExpressionText>\bmiles\b</ExpressionText>
                …
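The two expressions shown for the Mileage object set, the value expression `[1-9]\d{0,2}[kK]` and the keyword expression `\bmiles\b`, can be tried directly with a regular-expression library. The Python sketch below is only an illustration: the rule that a value counts when the keyword appears within a fixed character window is an assumption (the slide elides the context expressions), and the function names, the `window` parameter, and the sample ad text are hypothetical.

```python
import re

# Expressions copied from the Mileage object set on the slide.
VALUE_EXPRESSION = re.compile(r"[1-9]\d{0,2}[kK]")
KEYWORD_EXPRESSION = re.compile(r"\bmiles\b")


def find_mileage(text, window=20):
    """Return value matches that have the keyword nearby.

    The proximity rule is an assumption made for this sketch; it is not
    taken from the slide.
    """
    hits = []
    for match in VALUE_EXPRESSION.finditer(text):
        nearby = text[max(0, match.start() - window): match.end() + window]
        if KEYWORD_EXPRESSION.search(nearby):
            hits.append(match.group(0))
    return hits


def to_miles(raw):
    """Convert a matched value such as '55K' to a number of miles."""
    return int(raw[:-1]) * 1000


# Hypothetical ad text, not taken from the presentation.
ad = "'02 Thunderbird, red, 55K miles, keyless entry, $33,000"
found = find_mileage(ad)
```

Note that this single value pattern only covers the "55K" style. A value written as "5,500 miles", as in the car advertisement earlier in the talk, would not match it, which is presumably why the hint reads "Mileage Pattern 1" and further patterns would be listed alongside it.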

  47. Extraction Ontologies: An Example of Semantic Understanding • “Intelligent” Symbol Manipulation • Gives the “Illusion of Understanding” • Obtains Meaningful and Useful Results

  48. Presentation Outline • Grand Challenge • Meaning, Knowledge, Information, Data • Fun and Games with Data • Information Extraction Ontologies • Applications • Limitations and Pragmatics • Summary and Challenges

  49. A Variety of Applications • Information Extraction • Semantic Web Page Annotation • Free-Form Semantic Web Queries • Task Ontologies for Free-Form Service Requests • High-Precision Classification • Schema Mapping for Ontology Alignment • Record Linkage • Accessing the Hidden Web • Ontology Discovery and Generation • Challenging Applications (e.g. BioInformatics)

  50. Application #1: Information Extraction
