Extracting RDF Data from Unstructured Sources Based on an RDF Target Schema
Tim Chartrand
Research supported by NSF
Motivation
• Semantic Web – Global machine-understandable knowledge base
• WWW – Lots of information/data designed for human consumption
• DEG contribution – Extract data from the human-readable web
• Proposed solution – Extract WWW data and structure it in the Semantic Web format (RDF)
Overview of Proposed Research
[Diagram components: User, HTML Page, RDF Schema, Extraction Ontology, Extraction Engine, RDF Data, Relational Data]
RDF – What is it?
• Resource Description Framework
• Language of the Semantic Web
• Set of subject-predicate-object triples
• [tim.html, creator, tim], [tim.html, type, thesis]
•
    <RDF>
      <Description about="tim.html">
        <Creator>Tim</Creator>
        <Type>Thesis</Type>
      </Description>
    </RDF>
[Graph: tim.html --creator--> Tim; tim.html --type--> Thesis]
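As a rough illustration of the triple model, the two statements on this slide can be held as plain tuples and queried by subject and predicate. This is only a sketch: the `Triple` type and the `objects` helper are hypothetical names introduced here, not part of any RDF library or of the original work.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One subject-predicate-object statement (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


# The two statements from the slide.
triples = [
    Triple("tim.html", "creator", "Tim"),
    Triple("tim.html", "type", "Thesis"),
]


def objects(graph, subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in graph
            if t.subject == subject and t.predicate == predicate]


print(objects(triples, "tim.html", "creator"))
```

The RDF/XML form and the graph drawing are two other serializations of this same set of triples.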
RDF Schema Basics
Core Concepts
• rdfs:Class – The usual concept of a class.
    • Ex. Class Person
• rdfs:subClassOf – Specifies the generalization of a class.
    • Ex. Class Teacher is subClassOf Person
• rdf:Property – Can apply to a class and has a value.
    • Ex. Class Person has property Name
• rdfs:domain – Classes to which a property can apply.
    • Ex. Property Name has domain Person
• rdfs:range – Possible values of a property.
    • Ex. Property Name has range Literal
• rdfs:subPropertyOf – Specifies the generalization of a property.
    • Ex. Property Nickname is subPropertyOf Name
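The examples above can be sketched as a tiny in-memory schema. The dictionaries and the `ancestors` and `property_applies` helpers below are assumptions made for illustration. They follow the slide's simplified reading of domain as "the classes a property can apply to", and they assume each class or property has at most one parent and that the hierarchy has no cycles.

```python
# Hypothetical in-memory schema built from the slide's examples.
sub_class_of = {"Teacher": "Person"}      # Teacher subClassOf Person
sub_property_of = {"Nickname": "Name"}    # Nickname subPropertyOf Name
domain = {"Name": "Person"}               # Name has domain Person
range_ = {"Name": "Literal"}              # Name has range Literal


def ancestors(name, parent_of):
    """Walk a single-parent generalization chain upward from `name`."""
    chain = []
    while name in parent_of:
        name = parent_of[name]
        chain.append(name)
    return chain


def property_applies(prop, cls):
    """True if `prop` (or a property it specializes) has a domain that is
    `cls` or one of its superclasses."""
    classes = [cls] + ancestors(cls, sub_class_of)
    props = [prop] + ancestors(prop, sub_property_of)
    return any(domain.get(p) in classes for p in props)


print(property_applies("Nickname", "Teacher"))
```

Here Nickname applies to Teacher because Nickname specializes Name, Name has domain Person, and Teacher is a subclass of Person.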
Example RDF Schema
Schema excerpt:
    <rdfs:Class rdf:ID="Person" …> … </rdfs:Class>
    <rdfs:Class rdf:ID="Funeral" …> … </rdfs:Class>
    <rdf:Property rdf:ID="PFuneral" …>
      <rdfs:domain rdf:resource="#Person"/>
      <rdfs:range rdf:resource="#Funeral"/>
    </rdf:Property>
    <rdf:Property rdf:ID="Name" …>
      <rdfs:domain rdf:resource="#Person"/>
      <rdfs:range rdf:resource="&rdfs;Literal"/>
    </rdf:Property>
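One way to read such a schema programmatically is with Python's standard `xml.etree.ElementTree`. The sketch below is not the original tool. The `SCHEMA` string is an assumed, complete stand-in for the excerpt: the elided parts are left out, and the `&rdfs;` entity is written out as the standard RDFS namespace URI so the text parses without a DTD. `read_schema` and `local_name` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Assumed stand-in for the slide's excerpt (elided parts omitted).
SCHEMA = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}">
  <rdfs:Class rdf:ID="Person"/>
  <rdfs:Class rdf:ID="Funeral"/>
  <rdf:Property rdf:ID="PFuneral">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="#Funeral"/>
  </rdf:Property>
  <rdf:Property rdf:ID="Name">
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:range rdf:resource="{RDFS}Literal"/>
  </rdf:Property>
</rdf:RDF>
""".strip()


def local_name(uri):
    """Keep only the fragment after '#', e.g. '#Person' -> 'Person'."""
    return uri.rsplit("#", 1)[-1]


def read_schema(xml_text):
    """Return (class names, {property: (domain, range)}) from RDF/XML text."""
    root = ET.fromstring(xml_text)
    classes = [c.get(f"{{{RDF}}}ID") for c in root.iter(f"{{{RDFS}}}Class")]
    properties = {}
    for prop in root.iter(f"{{{RDF}}}Property"):
        dom = prop.find(f"{{{RDFS}}}domain").get(f"{{{RDF}}}resource")
        rng = prop.find(f"{{{RDFS}}}range").get(f"{{{RDF}}}resource")
        properties[prop.get(f"{{{RDF}}}ID")] = (local_name(dom), local_name(rng))
    return classes, properties


classes, properties = read_schema(SCHEMA)
print(classes, properties)
```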
[Figure: RDF Schema Graph]
Extraction Ontology
• Ontology Structure
    • Classes map to object sets
    • Properties map to binary relationship sets between classes
    • Literal properties map to relationship sets between classes and lexical data frames
    • Primary Object & Constraints – best guess based on heuristics
• Data Frames
    • Need a data frame library
    • Match properties with data frame library
    • Specialize the property data frames
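The structural mapping can be sketched roughly as follows. Everything here is an assumption for illustration: the dictionary layout, the two regular-expression "data frames", and the name-suffix matching rule are hypothetical and are not taken from the original system. Primary-object and constraint heuristics are left out because the slides do not specify them.

```python
import re

# Hypothetical data frame library: one recognizer per generic value type.
DATA_FRAME_LIBRARY = {
    "Name": re.compile(r"\b[A-Z][a-z]+(?: [A-Z]\.)? [A-Z][a-z]+\b"),
    "Time": re.compile(r"\b\d{1,2}:\d{2} ?[ap]\.m\."),
}


def match_data_frame(property_name):
    """Pick the library entry whose key ends the property name, if any.
    (An assumed matching rule, e.g. 'FuneralTime' -> 'Time'.)"""
    for key, pattern in DATA_FRAME_LIBRARY.items():
        if property_name.endswith(key):
            return key, pattern
    return None


def build_ontology(classes, properties):
    """Map schema classes and properties onto extraction-ontology parts."""
    ontology = {
        "object_sets": list(classes),   # classes -> object sets
        "relationship_sets": [],        # class-to-class properties
        "lexical": {},                  # literal properties + data frames
    }
    for name, (dom, rng) in properties.items():
        if rng == "Literal":
            ontology["lexical"][name] = {
                "object_set": dom,
                "data_frame": match_data_frame(name),
            }
        else:
            ontology["relationship_sets"].append((dom, name, rng))
    return ontology


ontology = build_ontology(
    ["Person", "Funeral"],
    {
        "PFuneral": ("Person", "Funeral"),
        "Name": ("Person", "Literal"),
        "FuneralTime": ("Funeral", "Literal"),
    },
)
print(ontology["relationship_sets"])
```

Specializing a matched data frame, for instance by adding context keywords for a particular property, would be a further step on top of this structure.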
User Modification
• Cardinality Constraints
    • Allow the user to edit any of the generated constraints
    • Keep track of changes – affects database schema
• Data Frames
    • Provide a data frame editor
    • Allow user to modify the specialized data frames
    • Usually only add key words
[Figure: Input Web Page]
[Figure: Relational Data]
Extracted RDF Data
RDF excerpt:
    <obit:Person rdf:ID="1001" obit:Name="Lemar K. Adamson" … >
      <obit:Funeral rdf:resource="#5001" />
      …
    </obit:Person>
    <obit:Funeral rdf:ID="5001"
                  obit:FuneralAddress="1540 E. Linden"
                  obit:FuneralDate=""
                  obit:FuneralTime="10:00 a.m.">
    </obit:Funeral>
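A minimal sketch of how relational rows could be turned into RDF/XML of this shape, using only the standard library. The `obit` namespace URI, the row dictionaries, the `funeral_id` column name, and the `row_to_element` helper are all assumptions; the slides do not show the actual database schema or conversion code.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OBIT = "http://example.org/obit#"  # hypothetical namespace URI

ET.register_namespace("rdf", RDF)
ET.register_namespace("obit", OBIT)


def row_to_element(parent, class_name, row, literal_cols, foreign_keys):
    """Emit one row as a typed node: literal columns become attributes,
    foreign keys become child elements pointing at the target's rdf:ID."""
    node = ET.SubElement(parent, f"{{{OBIT}}}{class_name}",
                         {f"{{{RDF}}}ID": str(row["id"])})
    for col in literal_cols:
        node.set(f"{{{OBIT}}}{col}", row[col])
    for col, prop in foreign_keys.items():
        ET.SubElement(node, f"{{{OBIT}}}{prop}",
                      {f"{{{RDF}}}resource": f"#{row[col]}"})
    return node


# Assumed rows, using the values shown on the slide.
person = {"id": 1001, "Name": "Lemar K. Adamson", "funeral_id": 5001}
funeral = {"id": 5001, "FuneralAddress": "1540 E. Linden",
           "FuneralTime": "10:00 a.m."}

root = ET.Element(f"{{{RDF}}}RDF")
row_to_element(root, "Person", person, ["Name"], {"funeral_id": "Funeral"})
row_to_element(root, "Funeral", funeral, ["FuneralAddress", "FuneralTime"], {})

rdf_xml = ET.tostring(root, encoding="unicode")
print(rdf_xml)
```

Because each step is a fixed mapping from table to class, column to property, and foreign key to resource reference, no user input is needed at this stage.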
[Figure: RDF Data Graph]
Conclusions
• Converting RDF Schemas to Data Extraction Ontologies can be done with some user interaction.
• The nature and amount of user interaction necessary for good data extraction is a good topic for research.
• Converting relational data to RDF data can be done automatically.