Geoinformatics Metadata
Metadata • Generally speaking, metadata are data and information that describe and model other data and information • For example, a database schema is the metadata for the data stored in the database • Metadata also include data that represent properties and relationships among individual objects (instances) of any type (e.g., tectonic, sedimentary, geochemical) • This kind of metadata is typical of an ontology, which is a Semantic Web technology
Vocabulary • Metadata are used to specify vocabularies for exchanging data among people in research groups or between machines • The vocabulary enriches the data so that software can interact with and manipulate them • Metadata tell software (algorithms, processors) what to do with the data and how to use them, and are of many kinds • Syntactic (how code statements are put together) • Structural (how data are structured; e.g., relational, XML, OO, graph) • Referent (set of allowable relations or properties connecting objects to instances, e.g., subclass, part-of, intersection, disjoint) • Domain specific • These metadata take forms that include database schemas, XML documents, UML diagrams, and domain-specific entity hierarchies
Subsumptive/Partitive Hierarchy • Metadata allow representation of the format and organization of data (e.g., taxonomy, partonomy), for example: • Foliation isA PlanarStructure describes the subsumption relationship between foliation and planar structure • Subsumption is the word for the is-a relation • If B is a kind of A, then we say A subsumes B, and B is subsumed by A • Mineral partOf Rock describes the meronymic (part-whole) relation between minerals and rocks • Metadata can also capture other relationships between various types of data, e.g., AxialPlanarFoliation parallel FoldAxialPlane
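The is-a and part-of relations on this slide can be sketched in a few lines of Python. This is only an illustration, not something from the original slides: the `IS_A` and `PART_OF` dictionaries, the `ancestors` and `subsumes` helpers, and the top-level class `Structure` are all hypothetical names assumed for the example.

```python
# Hypothetical toy taxonomy: each class maps to its direct parent (is-a).
IS_A = {
    "AxialPlanarFoliation": "Foliation",
    "Foliation": "PlanarStructure",
    "PlanarStructure": "Structure",  # assumed top-level class, not in the slides
}

# Hypothetical partonomy: each part maps to the whole it belongs to (part-of).
PART_OF = {"Mineral": "Rock"}


def ancestors(cls, relation):
    """Yield every class reachable from cls by following relation upward."""
    while cls in relation:
        cls = relation[cls]
        yield cls


def subsumes(a, b):
    """Return True if a subsumes b, i.e. b is (transitively) a kind of a.

    Every class is treated as subsuming itself.
    """
    return a == b or a in ancestors(b, IS_A)
```

With these definitions, `subsumes("PlanarStructure", "AxialPlanarFoliation")` holds because the is-a relation is followed transitively, while `subsumes("Foliation", "PlanarStructure")` does not. The two relations are kept in separate dictionaries so that part-of links are never mistaken for is-a links.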
Entailment • Metadata also allow us to formally specify and represent our domain knowledge by describing the information domain (i.e., a field such as geochemistry), thereby helping us to infer implicit statements from explicit statements through inference rules and entailment • Suppose we assert in our ontology that PlanarStructure has strike, LinearStructure has trend, LinearStructure disjointWith PlanarStructure, and Lineation isA LinearStructure; this knowledge can then be used to make inferences about the underlying data • For example, if a structure, such as Foliation, has strike, we can infer that it is a planar structure; if it has trend, we infer that it is a linear structure
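A minimal Python sketch of this kind of entailment follows. It rests on one assumption that the slide implies but does not state: that strike belongs only to planar structures and trend only to linear structures (a domain restriction on each property). The names `PROPERTY_DOMAIN`, `DISJOINT`, `infer_classes`, and `is_consistent` are hypothetical.

```python
# Assumed axioms: the class that each property implies for its bearer.
PROPERTY_DOMAIN = {
    "strike": "PlanarStructure",
    "trend": "LinearStructure",
}

# Asserted disjointness: nothing can belong to both classes of a pair.
DISJOINT = {frozenset({"PlanarStructure", "LinearStructure"})}


def infer_classes(properties):
    """Infer class memberships implied by the properties an object has."""
    return {PROPERTY_DOMAIN[p] for p in properties if p in PROPERTY_DOMAIN}


def is_consistent(classes):
    """Return False if the inferred classes contain a disjoint pair."""
    return not any(pair <= classes for pair in DISJOINT)
```

An object described only by `["strike"]` is inferred to be a `PlanarStructure`. An object described by both `["strike", "trend"]` would be inferred to belong to two disjoint classes, so `is_consistent` flags the description as contradictory; this is how a disjointness assertion can help catch errors in the underlying data.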
Applications of metadata • Metadata are used as a tool to describe and model domain information and knowledge, and can support several useful functionalities such as navigation, browsing, and retrieval of maps, images, and information about a specific geologic feature or phenomenon such as a rock or mineral sample • Metadata will enable knowledge-based decision support and management systems • The decision support system, when implemented, can be used by the decision makers in geoscience communities, and the knowledge management system will be used by geologists in these communities, trying to figure out the relationship between cross-disciplinary geological facts and phenomena (e.g., mineral reserve and petrology; geochemistry and water quality)
Types of metadata • Metadata can describe content-independent information, such as rock sample number or the date the sample was taken • The URI (Uniform Resource Identifier) associated with a geological resource is another example of this kind of metadata • Content-based metadata, on the other hand, describe the structural information of documents or artifacts, and domain-specific terminology and vocabulary, which capture both intra- and inter-domain relationships among data (i.e., within one field or between different fields, for example within the Geochemistry field, or between the Geochemistry and Petrology fields) • While structural metadata describe the format and organization of the underlying data, domain-specific metadata capture information about the domain (e.g., stratigraphy, geochemistry), and are the most relevant and useful as far as scientific semantics is concerned
Metadata are commonly developed in isolation, and require intermediary software for interchange, interoperability, and integration • The Semantic Web can help in developing systems that efficiently link and integrate distributed data across a community • Decisions on where to explore for a specific mineral or drill wells for water or oil depend on the accuracy of the data, and on how these data (e.g., aquifer and rock type or contaminants) are related to each other • Currently, these data are scattered across publications and unrelated databases and worksheets
Ontology • Structured vocabularies define the metadata for specific fields (domains). The more domain-specific the metadata, the more useful they become for modeling the domain knowledge • Therefore, the terms in the vocabularies should capture consensual domain terms and the interrelationships among these terms • Among the different types of vocabularies, ontologies are at the top of the hierarchy in providing the most useful and complete metadata, and hence semantics • An ontology is a formal specification and model of a domain's knowledge (e.g., knowledge of Geochemistry). It defines the shared vocabulary and the interrelationships that exist among the real individual objects within a specific field or domain of discourse, such as plate tectonics
Metadata Frameworks • Metadata frameworks are specifications that allow creating, manipulating, and querying metadata descriptions, and include those that are XML-, RDF-, and OWL-based (among others) • Each of these frameworks consists of a data model, semantics (applying RDF, RDFS, OWL), a serialization format (e.g., XML, N3), and a query language (e.g., SPARQL) • The XML-based metadata framework is used to capture both content (separate from presentation) and metadata, but not semantics • In XML, the schema travels with the data in the form of tag names • This allows the self-describing content to include both data and metadata
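The self-describing nature of XML can be illustrated with Python's standard `xml.etree.ElementTree` module. The document below is invented for the example: the element names `sample`, `rockType`, and `collected`, and all of the values, are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML record: the tag names act as metadata, the text as data.
DOC = """
<sample id="S-001">
  <rockType>granite</rockType>
  <collected>2009-06-15</collected>
</sample>
"""

root = ET.fromstring(DOC.strip())

# Recover field names (metadata) and values (data) from the same document.
record = {child.tag: child.text for child in root}
sample_id = root.get("id")
```

Here `record` pairs each tag name with its content, so a program can discover the fields without a separate schema file. Nothing in the document says what `rockType` means, however, which is the sense in which XML captures metadata but not semantics.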
RDF • The RDF-based metadata representation is commonly serialized in XML (RDF/XML), and is designed to describe metadata for resources on the Web • RDF uses a subject-predicate-object triple format, and a set of triples forms a graph • The subject and object are resources (the object may also be a literal value); on the Web, anyone can say anything about anything • The RDF triple Sample analysis Chemistry means that a specific sample (a resource) has an analysis (predicate) given by the chemistry resource (which can be a list of trace element data)
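The triple model can be mimicked with plain Python tuples. This is a rough sketch that does not use any RDF library; the identifiers such as `ex:Sample42` are hypothetical placeholders for the full URIs that real RDF would use, and `match` is an assumed helper name.

```python
# Hypothetical triples; each is (subject, predicate, object).
TRIPLES = {
    ("ex:Sample42", "ex:analysis", "ex:Chemistry42"),
    ("ex:Sample42", "rdf:type", "ex:RockSample"),
    ("ex:Chemistry42", "rdf:type", "ex:TraceElementList"),
}


def match(s=None, p=None, o=None):
    """Return the sorted triples matching a pattern; None is a wildcard."""
    return sorted(
        t for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )
```

Calling `match(s="ex:Sample42", p="ex:analysis")` returns the single triple linking the sample to its chemistry resource. Pattern matching with wildcards like this is, in spirit, what a SPARQL basic graph pattern does over a real RDF store.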
OWL • The OWL-based metadata framework, which builds on RDF and RDF Schema (RDFS), allows construction of more complex semantic expressions at the schema and data levels • OWL allows defining classes, class membership, and relations between classes (e.g., subclass-of, disjoint-with, equivalent) • Among many other constructs, OWL allows defining the domain and range of each property • OWL-QL and SPARQL are two query languages that can be used with OWL data