Information Artifact Ontology: General Background Barry Smith
Military Doctrine and Standardization of Terminology • 3rd Century BC: Standardized beacon signals used by Chinese military along Great Wall • 1792: Drill manual for the units of the Continental Army to respond uniformly to commands during the Revolutionary War • 1943: General James Gavin’s Training Memorandum on the Employment of Airborne Forces
General James Gavin, On to Berlin: Battles of an Airborne Commander 1943-1946. For success of the D-Day invasion ‘one of our most critical needs was to standardize the operating practices of our forces. … even simple terminology had to be agreed upon. … British flew in what they called “bomber stream” formations, We preferred troop-carrier group formations of 36 planes that flew in a V ... We referred to landing area as the “jump area,” the British called it “drop zone,” …’
Current state • DOD Dictionary of Military and Associated Terms (Joint Publication 1-02) • New military dictionaries and terminology artifacts continue to be developed • Dominant ethos: Library Science (all terminologies are equal), Lexicography (logical consistency of definitions is not important)
Two kinds of data • 1. Data about entities in the world (topics, subject-matters): standard ontologies • 2. Data about the information artifacts in which these entities are represented (= metadata): Information Artifact Ontology and extensions, including IAO-Intel
Information Content Entities (ICEs) • ICEs are about something in reality (they have this something as a subject; they represent, or mention or describe this something; they inform us about this something). • Aboutness may be identifiable from different perspectives. Thus one analyst may interpret a given ICE as being about the geography of a given encampment; another may view it as providing information about the morale of those encamped there.
Information artifact • (roughly) an entity created through some deliberate act or acts by one or more human beings, and which endures through time, potentially in multiple (for example digital or printed) copies Examples: a diagram on a sheet of paper, a video file, a map on a computer monitor, an article in a newspaper, a message on a network, the output of some querying process in a computer memory
What IAO is for • IAO is not designed to replace existing ontological or other standards • lots of documents exist conforming to lots of different standards • purpose of IAO is to allow generation of the needed metadata in a uniform, non-redundant and algorithmically processable fashion
Attributes of Information Artifacts • Examples • Purpose • Lifecycle Stage (draft, finished version, revision) • Language • Format • Provenance • Source (person, organization) • These are generic attributes, common to all areas • IAO will contain a Low-Level Ontology module for each dimension
Generic Purpose Attributes • Descriptive purpose: scientific paper, newspaper article, after-action report • Prescriptive purpose: legal code, license, statement of rules of engagement • Directive purpose (of specifying a plan or method for achieving something): instruction, manual, protocol • Designative purpose: a registry of members of an organization, a phone book, a database linking proper names of persons with their social security numbers
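The four purpose attributes above can be treated as a small controlled vocabulary. The following is a minimal sketch in Python, assuming a simple enumeration; the names Purpose, EXAMPLE_PURPOSES and artifacts_with_purpose are illustrative and are not part of IAO.

```python
from enum import Enum


class Purpose(Enum):
    """The four generic purpose attributes named on the slide."""
    DESCRIPTIVE = "descriptive"
    PRESCRIPTIVE = "prescriptive"
    DIRECTIVE = "directive"
    DESIGNATIVE = "designative"


# Example artifact kinds taken from the slide, keyed to their purpose.
# The lookup table itself is an illustrative assumption, not an IAO module.
EXAMPLE_PURPOSES = {
    "scientific paper": Purpose.DESCRIPTIVE,
    "newspaper article": Purpose.DESCRIPTIVE,
    "after-action report": Purpose.DESCRIPTIVE,
    "legal code": Purpose.PRESCRIPTIVE,
    "license": Purpose.PRESCRIPTIVE,
    "statement of rules of engagement": Purpose.PRESCRIPTIVE,
    "instruction": Purpose.DIRECTIVE,
    "manual": Purpose.DIRECTIVE,
    "protocol": Purpose.DIRECTIVE,
    "registry of members": Purpose.DESIGNATIVE,
    "phone book": Purpose.DESIGNATIVE,
}


def artifacts_with_purpose(purpose):
    """Return the example artifact kinds that serve the given purpose, sorted by name."""
    return sorted(kind for kind, p in EXAMPLE_PURPOSES.items() if p is purpose)


print(artifacts_with_purpose(Purpose.DIRECTIVE))
```

Using an enumeration rather than free-text strings is one way to keep purpose annotations uniform and algorithmically processable.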
Use of IAO-Intel – Example: Digitalizing an MCOO • IA#1 - Modified Combined Obstacle Overlay (MCOO) - a joint intelligence preparation of the operational environment product used to portray the militarily significant aspects of the operational environment, such as obstacles restricting military movement, key geography, and military objectives.
Digitalizing an MCOO • Annotations to the attributes of IA#1 • ICE: MCOO • IBE: Acetate Sheet • uses-symbology MIL-STD-2525C • authored-by person #4644 • Annotations relating to the aboutness of IA#1 • Avenue of Approach • Strategic Defense Belt • Amphibious Operations • Objective
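The annotations listed for IA#1 can be pictured as one record that keeps attribute metadata apart from aboutness. This is a minimal sketch, assuming a plain Python dataclass; the class ArtifactAnnotation, the function to_triples and the predicate names has-ice, has-ibe and is-about are hypothetical and do not come from IAO-Intel. Only uses-symbology and authored-by appear on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class ArtifactAnnotation:
    """Metadata for one information artifact (hypothetical record layout)."""
    artifact_id: str
    ice: str                                          # information content entity
    ibe: str                                          # information bearing entity
    attributes: dict = field(default_factory=dict)    # annotations to the artifact's attributes
    about: list = field(default_factory=list)         # annotations relating to aboutness


ia1 = ArtifactAnnotation(
    artifact_id="IA#1",
    ice="MCOO",
    ibe="Acetate Sheet",
    attributes={
        "uses-symbology": "MIL-STD-2525C",
        "authored-by": "person #4644",
    },
    about=[
        "Avenue of Approach",
        "Strategic Defense Belt",
        "Amphibious Operations",
        "Objective",
    ],
)


def to_triples(annotation):
    """Flatten an annotation into (subject, predicate, object) tuples."""
    subject = annotation.artifact_id
    triples = [
        (subject, "has-ice", annotation.ice),   # hypothetical predicate name
        (subject, "has-ibe", annotation.ibe),   # hypothetical predicate name
    ]
    triples += [(subject, pred, obj) for pred, obj in annotation.attributes.items()]
    triples += [(subject, "is-about", topic) for topic in annotation.about]
    return triples


for triple in to_triples(ia1):
    print(triple)
```

Keeping the aboutness terms in their own field reflects the split between data about the artifact and data about the entities it represents.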
top level mid-level domain level Basic Formal Ontology (BFO) Extension Strategy + Modular Organization
top level mid-level (generic hub) domain level (spokes populating downwards) Basic Formal Ontology (BFO) Each module built by downward population from its parent
Users of BFO Examples AIRS Ontologies cROP Ontologies MilPortal Ontologies NIF Standard Ontologies OBO Foundry Ontologies OAE Ontology of Adverse Events EnvO Emotion Ontology IDO Infectious Disease Ontology (NIAID) US Army Biometrics Ontology http://www.ifomis.org/bfo/users
Continuant Occurrent BFO Generically Dependent Continuant can migrate from one bearer to another Independent Continuant Specifically Dependent Continuant is tied to just one bearer
Continuant Occurrent BFO Generically Dependent Continuant Independent Continuant Specifically Dependent Continuant universals this gene sequence, this digital image this man, that book this excitation pattern, that pattern of piles of ink instances
Continuant BFO Generically Dependent Continuant Independent Continuant Specifically Dependent Continuant Material Entity Disposition Quality Role
Continuant BFO IAO Generically Dependent Continuant Independent Continuant Specifically Dependent Continuant Material Entity Quality Information Quality Entity Information Bearing Entity depends_on
Continuant BFO IAO Generically Dependent Continuant Independent Continuant Specifically Dependent Continuant Material Entity Quality Information Content Entity Information Quality Entity Information Bearing Entity concretized_by depends_on
Generically Dependent Continuant Independent Continuant Specifically Dependent Continuant Material Entity Quality Information Content Entity Information Quality Entity Information Bearing Entity concretized_by depends_on universals this pdf file this digital image this hard drive, that book this excitation pattern, that pattern of piles of ink instances
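The BFO and IAO categories in these diagrams can be sketched as a small class hierarchy. This is a minimal illustration in Python, assuming that Information Bearing Entity sits under Material Entity and Information Quality Entity under Quality, which is one reading of the flattened diagram; the method names concretize and bearers are hypothetical.

```python
class Continuant:
    def __init__(self, label):
        self.label = label


class IndependentContinuant(Continuant):
    pass


class MaterialEntity(IndependentContinuant):
    pass


class InformationBearingEntity(MaterialEntity):
    """For example a hard drive or a book."""


class SpecificallyDependentContinuant(Continuant):
    """Tied to just one bearer, fixed at creation."""

    def __init__(self, label, bearer):
        super().__init__(label)
        self.depends_on = bearer


class Quality(SpecificallyDependentContinuant):
    pass


class InformationQualityEntity(Quality):
    """A concrete pattern, such as an excitation pattern or a pattern of ink."""


class GenericallyDependentContinuant(Continuant):
    """Can migrate from one bearer to another via multiple concretizations."""

    def __init__(self, label):
        super().__init__(label)
        self.concretized_by = []


class InformationContentEntity(GenericallyDependentContinuant):
    def concretize(self, pattern_label, bearer):
        """Record a new pattern on the given bearer that concretizes this content."""
        pattern = InformationQualityEntity(pattern_label, bearer)
        self.concretized_by.append(pattern)
        return pattern

    def bearers(self):
        """Labels of every bearer that currently carries this content."""
        return [pattern.depends_on.label for pattern in self.concretized_by]


hard_drive = InformationBearingEntity("this hard drive")
book = InformationBearingEntity("that book")
pdf = InformationContentEntity("this pdf file")
pdf.concretize("this excitation pattern", hard_drive)
pdf.concretize("that pattern of piles of ink", book)
print(pdf.bearers())
```

The same content entity ends up with two bearers, while each pattern stays tied to exactly one of them.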
Universals and Instances (from Bill Mandrick) [Diagram: universals (Geographic Coordinates Set, Geopolitical Entity, Spatial Region, Village Name, Distance Measurement Result, Village, Well, Latrine) and instances (‘16 meters’, ‘VT 334 569’, ‘Khanabad Village’), linked by the relations instance_of, is_a, designates, has location, measurement_of, located in, located near]
BFO: Specifically Dependent Continuant BFO: Generically Dependent Continuant BFO: Independent Continuant IAO and BFO Information Bearing Entity (IBE) Information Structure Entity (ISE) Information Content Entity (ICE) Information Quality Entity (Pattern) (IQE)
Information Artifacts • artifact =def. an entity created through some deliberate act or acts by one or more human beings and which endures through time • information artifact: an artifact that is created to serve as a bearer of information • (a) information bearing entity (IBE) – a hard drive, a passport, a piece of paper with a drawing of a map • (b) information content entity (ICE) – an entity which is about something and which can potentially exist in multiple (for example digital or printed) copies – a jpg file, a pdf file
IAO: information content entity =def. an entity that is generically dependent on some artifact and stands in the relation of aboutness to some entity • Problems of non-referring information entities • Problems of information structure entities
Types and tokens • Copyable information artifacts can exist both as tokens and as types (in Peirce's sense) • Token = the particular information artifact of interest, tied to some particular physical information bearer: the photographic image on this piece of paper retrieved from this enemy combatant • Type = the copyable information content that is carried by the artifact in question. The same photographic image type may be printed out in multiple paper tokens • Warning: this is not the same as the instance-class distinction
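One way to picture the type/token distinction is to group tokens by the content they carry. The sketch below assumes that a SHA-256 digest of the content bytes is an acceptable stand-in for the identity of a type; that choice, the function content_type_id and the sample bearers and bytes are illustrative assumptions, not something the slides prescribe.

```python
import hashlib
from collections import defaultdict


def content_type_id(content: bytes) -> str:
    """Identify a content type by the SHA-256 digest of its bytes (an assumption)."""
    return hashlib.sha256(content).hexdigest()


# Each token is a (physical bearer, content carried) pair. Sample data is made up.
tokens = [
    ("paper sheet A", b"photo-bytes-1"),
    ("paper sheet B", b"photo-bytes-1"),
    ("hard drive C", b"photo-bytes-2"),
]

# Group the bearers of all tokens that carry the same content type.
bearers_by_type = defaultdict(list)
for bearer, content in tokens:
    bearers_by_type[content_type_id(content)].append(bearer)

for type_id, bearers in bearers_by_type.items():
    print(type_id[:12], bearers)
```

Three tokens fall under two types here, because the two paper sheets carry the same image content.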
The Dublin Core: How not to solve the problem of creating consistent information artifact metadata
Dublin Core Metadata Initiative (DCMI) an open organization supporting innovation in metadata design and best practices across the metadata ecology http://dublincore.org/ Resource (as in ‘RDF’) + 15 basic ‘elements’:
The Core • Resource (as in ‘RDF’) + 15 basic ‘elements’:
1) What’s a “resource”? A resource is anything that has identity. Familiar examples include an electronic document, an image, a service (e.g., "today's weather report for Los Angeles"), and a collection of other resources. Assumption: resource = information artifact 2) How do “elements” apply to “resources”? An Element is a characteristic that a resource may “have”, such as a Title, Publisher, or Subject.
The Core (cont.) The same resource can be instantiated in different ways • Format: The file format, physical medium, or dimensions of the resource. Examples of dimensions include size and duration. Recommended best practice is to use a controlled vocabulary such as the list of Internet Media Types [MIME]. Example: image/jpeg.
The Core (cont.) What describes the content / topic / subject-matter? Title: The name given to the resource. Description: An account of the content of the resource. Description may include but is not limited to: an abstract, table of contents, reference to a graphical representation of content or a free-text account of the content. Subject: The topic of the content of the resource. Typically, a subject will be expressed as keywords or key phrases or classification codes that describe the topic of the resource.
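A Dublin Core description is essentially a flat set of element/value pairs. The following is a minimal sketch, assuming a plain Python dictionary as the record; the field values are made up for illustration and the helper unknown_elements is hypothetical. The fifteen element names are those of the Dublin Core Metadata Element Set.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})

# A made-up record; the values are illustrative only.
record = {
    "title": "Modified Combined Obstacle Overlay",
    "description": "An overlay showing obstacles, key geography and objectives.",
    "subject": ["Avenue of Approach", "Objective"],
    "format": "image/jpeg",
}


def unknown_elements(rec):
    """Return the keys of a record that are not Dublin Core elements, sorted."""
    return sorted(set(rec) - DC_ELEMENTS)


print(unknown_elements(record))
print(unknown_elements({"title": "x", "aboutness": "y"}))
```

Title, description and subject are all free-text slots here, and nothing in the record says which of them an application should consult for the topic of the resource.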
Benefits of Dublin Core • Available in multiple formats • W3C recommended • Mapping to PROV
Problems with Dublin Core • Scope not defined (‘anything that has identity’) • Does not provide logical definitions, but relies rather on vague natural language expressions (including use of “scare” “quotes” to warn the user that terms are not intended literally) • Provides only suggestive guidance as to use of associated standards • Does not interoperate well with other (topic) ontologies
Confuses words and things • Source: A reference to a resource from which the present resource is derived. The present resource may be derived from the Source resource in whole or part.
Engages in sloppy bundling Type: The nature or genre of the content of the resource. Type includes terms describing general categories, functions, genres, or aggregation levels for content. What is ‘content of the resource’? Is the nature of the content distinct from the nature of the resource? No taxonomic organization, but rather a tangled hierarchy No distinction between things (continuants) and processes (occurrents) – consider performance of a work
Does not address the goals of a Metadata Ontology • Ability to expand consistently to new application areas • Ability to gracefully integrate with domain ontologies and with other IA-related ontologies • Ability to represent metadata of different categories • Complex application-specific content • specific ways in which one IA relates to another IA • Content vs. Bearers of content
Requirements to Achieve These Goals • Conformance to ontology best practices • http://ncorwiki.buffalo.edu/index.php/Distributed_Development_of_a_Shared_Semantic_Resource • http://techwiki.openstructs.org/index.php/Ontology_Best_Practices • http://kmi.open.ac.uk/events/iswc07-semantic-web-intro/pdf/5.%20Ontology%20Design.pdf • Conformance to an upper level ontology as starting point for coherent definitions • Separation of aspects of an information artifact such as physical bearer, content, content organization
DC Does Not Conform to Best Practices • Location Period Or Jurisdiction is defined in the DC hierarchy as a subclass of Location
Problems with verbal definitions • PROVENANCE – “A statement of any changes in ownership and custody of the resource since its creation that are significant for its authenticity, integrity, and interpretation.” • The same definition is applied to the class and the property: PROVENANCE STATEMENT that is the Range of PROVENANCE is defined in exactly the same way.