ODE: Ontology-Assisted Data Extraction

Weifeng Su, Jiying Wang, Frederick H. Lochovsky. Summarized by Joseph Park.

Overview
• “Web databases…compose what is referred to as the deep Web”
• The goal of data extraction:
  (1) Query result section identification: decides which section of a dynamically generated query result page contains the data that need to be extracted.
  (2) Record segmentation: segments the query result section into records and extracts them.
  (3) Data value alignment: aligns the data values from multiple records that belong to the same attribute so that they can be arranged into a table.
  (4) Label assignment: assigns a suitable, meaningful label (i.e., an attribute name) to each column in the aligned table.
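Steps (2) to (4) above can be illustrated with a toy sketch. This is not the ODE algorithm: the `|`-delimited input, the `ONTOLOGY` dictionary, and all function names are hypothetical, and alignment here is purely positional, which is exactly the kind of shortcut that breaks on optional attributes.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy ontology: attribute label -> known example values.
ONTOLOGY: Dict[str, Set[str]] = {
    "Title": {"dune", "emma"},
    "Format": {"hardcover", "paperback"},
}


def segment_records(section: str) -> List[List[str]]:
    """Step 2 (toy): one record per non-empty line, values split on '|'."""
    return [
        [value.strip() for value in line.split("|")]
        for line in section.splitlines()
        if line.strip()
    ]


def align_values(records: List[List[str]]) -> List[List[Optional[str]]]:
    """Step 3 (toy): pad records to equal width so position i forms column i."""
    width = max((len(record) for record in records), default=0)
    return [record + [None] * (width - len(record)) for record in records]


def assign_labels(table: List[List[Optional[str]]]) -> List[Optional[str]]:
    """Step 4 (toy): label a column with the attribute matching the most cells."""
    labels: List[Optional[str]] = []
    for column in zip(*table):
        best_label, best_hits = None, 0
        for label, known_values in ONTOLOGY.items():
            hits = sum(
                1 for value in column
                if value is not None and value.lower() in known_values
            )
            if hits > best_hits:
                best_label, best_hits = label, hits
        labels.append(best_label)  # None when no ontology value matched
    return labels


if __name__ == "__main__":
    section = "Dune | Paperback | $9.99\nEmma | Hardcover\n"
    table = align_values(segment_records(section))
    print(assign_labels(table))
    for row in table:
        print(row)
```

A column with no matching ontology values is left unlabeled (`None`) in this sketch.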

Problems
• Automatically extract data from query results
• Limitations of other systems:
  • Incapable of processing either zero or few query results.
  • Vulnerable to optional and disjunctive attributes.
  • Incapable of processing nested data structures.
  • No label assignment.

Approach
• ODE – Ontology-assisted data extraction
• PADE wrapper
• Query result annotation
• Attribute matching
• Ontology construction

Approach continued
• Query result section identification
• Record segmentation
• Data value alignment and label assignment
• MaxEnt model is used
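The slides do not say how the MaxEnt model is set up. For reference, the standard conditional maximum entropy form is given below. The notation is generic and not taken from the paper: $x$ is an observation (for example, a data value with its context), $y$ a candidate label, $f_i$ feature functions, and $\lambda_i$ their learned weights.

```latex
p(y \mid x) = \frac{1}{Z(x)} \exp\!\Big( \sum_{i} \lambda_i \, f_i(x, y) \Big),
\qquad
Z(x) = \sum_{y'} \exp\!\Big( \sum_{i} \lambda_i \, f_i(x, y') \Big)
```

The predicted label is the $y$ that maximizes $p(y \mid x)$.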

Experimental Results
• Extraction performed using DeLa

Conclusion
• Can only label attributes that appear in query result pages
• References a few DEG papers
  • DKE99, Tisp, TANGO
• Could take advantage of MaxEnt for pre-labeling data
• Need to look into DeLa for data extraction
