Metadata Registry Standards: A Key to Information Integration
Jim Carpenter, Bureau of Labor Statistics
MIT Seminar, June 3, 1999
Previously presented to DAMA-NCR by Judith Newton, NIST, May 11, 1999 (see www.dama-ncr.org)
Agenda
• Specification and Standardization of Data Elements: ISO 11179, Parts 1-6
• Metamodel for Management of Shareable Data: ANS X3.285
• Specification of Data Value Domains: ISO TR 15452
• NWI for Content Issues
Parts of 11179: Status
• Part 1: Framework (DIS)
• Part 2: Classification (DIS)
• Part 3: Basic Attributes (IS)
• Part 4: Formulation of Definitions (IS)
• Part 5: Naming and Identification (IS)
• Part 6: Registration (IS)
IS = International Standard; DIS = Draft International Standard
Part 1 - Framework
• Definitions
• Fundamental Concepts
• Other parts
• Informative Annexes
Definition: Data Element
A unit of data for which the
• definition,
• identification,
• representation, and
• permissible values
are specified by means of a set of attributes.
[Figure: data elements in context. Levels shown: Database, File, etc.; Transaction, Exchange Unit, etc.; Record, Segment, Class, Tuple, etc.; Field, Column, etc.; Character, Image, Sound, etc. Each Data Element is shown with an Identifier, Definition, Name, Value Domain, etc.]
Fundamental Model
• Taken from data modeling
• 3 components:
  • object class
  • property
  • representation
Definition: Object Class
• Things for which to store data
• Entities in E-R models
• Classes in O-O models
• Examples:
  • Employers
  • Persons
  • Automobiles
  • Orders
  • ...
Definition: Property
• A peculiarity common to all members of an object class.
• Distinguishes or describes objects
• Attributes or data members in models
• Examples:
  • Identifier
  • Age
  • Address
  • Location
  • ...
Definition: Representation
The combination of
• a representation class,
• a value domain,
• a datatype,
• a unit of measure (if necessary), and
• a character set (if necessary).
Data Element Example
• Object Class: Flower
• Property: Color
• Representation: String: {red | blue}
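The three-component model can be sketched in code. The following is a minimal Python illustration, not part of ISO 11179: the class names `Representation` and `DataElement`, the field names, and the `accepts` method are all hypothetical, and the representation is simplified to a datatype plus an enumerated set of permissible values.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Representation:
    """Simplified representation: a datatype plus enumerated permissible values."""
    datatype: str
    permissible_values: FrozenSet[str]
    unit_of_measure: Optional[str] = None  # only "if necessary"
    character_set: Optional[str] = None    # only "if necessary"


@dataclass(frozen=True)
class DataElement:
    """A data element as object class + property + representation."""
    object_class: str
    property_name: str
    representation: Representation

    def accepts(self, value: str) -> bool:
        # A value is valid only if it is one of the permissible values.
        return value in self.representation.permissible_values


# The Flower / Color / String:{red | blue} example from the slide.
flower_color = DataElement(
    object_class="Flower",
    property_name="Color",
    representation=Representation("String", frozenset({"red", "blue"})),
)
```

With this sketch, `flower_color.accepts("red")` is true and `flower_color.accepts("green")` is false, because green is not in the value domain.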
Part 2 - Classification
What forms can a classification structure take?
• Keywords
• Controlled word lists
• Terms from models
• Thesaurus
• Taxonomy
• Ontology
• Acyclic directed graph, lattice
• Multiple inheritance
Classification - Fundamental Notions
• Each node in a classification structure is a taxon (plural: taxa).
• Given a classification structure, any taxa relating to a data element can be recorded.
• The taxa can be recorded in a separate "classification" attribute.
• With adequate software, users could access and navigate the classification structure.
• A nonintelligent identifier for each taxon helps to deal with change.
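These notions can be illustrated with a small sketch. The Python below is an assumption-laden illustration, not anything defined by Part 2: `Taxon`, `ClassificationScheme`, the identifiers `T1` to `T3`, and the labels are all hypothetical. It shows taxa in an acyclic graph with multiple parents, a separate classification record per data element, and identifiers that stay stable when a label changes.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Taxon:
    taxon_id: str  # nonintelligent: carries no meaning of its own
    label: str     # human-readable and free to change
    parent_ids: List[str] = field(default_factory=list)  # multiple parents allowed


class ClassificationScheme:
    def __init__(self) -> None:
        self.taxa: Dict[str, Taxon] = {}
        # The separate "classification" attribute: element name -> taxon ids.
        self.element_taxa: Dict[str, Set[str]] = {}

    def add_taxon(self, taxon: Taxon) -> None:
        self.taxa[taxon.taxon_id] = taxon

    def classify(self, element_name: str, taxon_id: str) -> None:
        if taxon_id not in self.taxa:
            raise KeyError(f"unknown taxon: {taxon_id}")
        self.element_taxa.setdefault(element_name, set()).add(taxon_id)

    def ancestors(self, taxon_id: str) -> Set[str]:
        """Navigate upward through the acyclic graph, collecting all ancestors."""
        seen: Set[str] = set()
        stack = list(self.taxa[taxon_id].parent_ids)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.taxa[current].parent_ids)
        return seen


scheme = ClassificationScheme()
scheme.add_taxon(Taxon("T1", "Geography"))
scheme.add_taxon(Taxon("T2", "Trade"))
scheme.add_taxon(Taxon("T3", "Country codes", parent_ids=["T1", "T2"]))
scheme.classify("Trading partner country name", "T3")

# Renaming a taxon does not disturb recorded classifications,
# because elements refer to the identifier, not the label.
scheme.taxa["T1"].label = "Geographic areas"
```

Because the data element stores only `T3`, the rename of `T1` requires no change to any classified element.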
Part 2 Status
• ISO
  • Draft International Standard
• Continuing R&D
  • Search engines
  • Middleware: agents, mediators, request brokers
  • XML tags
• New ISO project: terminology management in metadata registries
Part 3 - Basic Attributes
• Specifies "basic attributes" of data elements, independent of their usage in application systems, databases, or data interchange messages.
• Recognizes the need for additional attributes.
• Implies no logical or physical structure of the data.
Categories of Basic Attributes
• Identification of a data element
• Definition of a data element
• Relations among data elements
• Representation of data element values
• Administrative: management and control
Summary
• Part 3 is a good start to establishing an unambiguous set of specifics documenting data elements.
• However:
  • Further work on the other ISO 11179 parts and beyond has resulted in many refinements and advances addressing a variety of data-related concepts.
  • A new work item involves replacing Part 3 with ANSI X3.285.
Part 4 - Data Definitions
A data definition shall:
• Be unique (within a data dictionary)
• Be stated in the singular
• State what the concept is, rather than what it is not
• Be stated as a descriptive phrase or sentence(s)
• Contain only commonly understood abbreviations
• Be expressed without embedding definitions of other data elements or underlying concepts
Data Definition Guidelines
A data definition should:
• State the essential meaning of the concept
• Be precise and unambiguous
• Be concise
• Be able to stand alone
• Be expressed without embedding rationale, functional usage, domain information or procedural information
• Avoid circular reasoning
• Use consistent terminology and structure for related definitions
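Part 5 - Naming and Identification
Five attributes to identify a data element:
• Name
• Context
• Registration Authority Identifier
• Data Identifier
• Version Identifier
Name and Context are always paired. The Registration Authority Identifier, Data Identifier, and Version Identifier together form the International Registration Data Identifier.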
Principles for Registration of Data
• Each data element has a unique identifier within the register of a Registration Authority.
• A data element is uniquely identified by the combination of:
  • Registration authority identifier
  • Data identifier
  • Version identifier
• To be assigned an identifier, the element must be derived, attributed, defined, named, and registered according to ISO/IEC 11179.
• A data element shall have at least one name within a context.
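The identification principles can be sketched as follows. This Python is an illustrative assumption, not the standard's encoding: the class names, the `:` separator, and the sample values `RA-0001`, `12345`, and `1` are hypothetical. It shows the three-part identifier being treated as one combined key, uniqueness within a register, and the rule that each element needs at least one name within a context.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RegistrationIdentifier:
    registration_authority_id: str
    data_id: str
    version_id: str

    def combined(self, sep: str = ":") -> str:
        # The separator is an assumption made for this sketch.
        return sep.join(
            (self.registration_authority_id, self.data_id, self.version_id)
        )


class Register:
    """Register of one Registration Authority (hypothetical sketch)."""

    def __init__(self) -> None:
        self._names: Dict[RegistrationIdentifier, Dict[str, str]] = {}

    def register(
        self, identifier: RegistrationIdentifier, names_by_context: Dict[str, str]
    ) -> None:
        if identifier in self._names:
            raise ValueError("identifier is already registered")
        if not names_by_context:
            raise ValueError("at least one name within a context is required")
        self._names[identifier] = dict(names_by_context)

    def name_in(self, identifier: RegistrationIdentifier, context: str) -> str:
        return self._names[identifier][context]


ident = RegistrationIdentifier("RA-0001", "12345", "1")
register = Register()
register.register(ident, {"International trade": "Trading partner country name"})
```

Storing names in a dictionary keyed by context keeps each name paired with the context in which it is valid.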
Naming Data Elements
• Naming principles are described in general terms, with examples furnished.
• Rules by which standard names are developed are derived from the principles.
• These rules form a naming convention.
• Because syntactic, semantic, and lexical rules vary by organization (for example, corporations or standards-setting bodies for business areas), no specific naming convention rules are prescribed in the International Standard.
• The naming principles described in the standard can be applied to other entities, such as attributes and objects.
Rule Types
• Data element names are formed of components.
• The components are:
  • object class terms
  • property terms
  • representation terms
  • qualifier terms
• Each is assigned meaning (semantics) and relative or absolute position (syntax) within a name.
• They are subject to lexical rules.
Slide callout: counterparts in the X3.285 metamodel.
Naming Component Example
• QUALIFIER TERMS: Trading partner
• OBJECT CLASS TERM: Country
• PROPERTY TERM: Identifier
• REPRESENTATION TERM: Name
• NAME: Trading partner country name
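A naming convention of this kind can be sketched as a small function. The Python below is hypothetical: `build_name`, the component keys, the chosen component order, and the sentence-case lexical rule are assumptions made for illustration. Note that the name on the slide does not contain the property term, so the sketch lets the convention choose which components appear and in what position.

```python
from typing import Dict, Sequence


def build_name(terms: Dict[str, str], order: Sequence[str]) -> str:
    """Join the selected components in the given order (syntax rule),
    then apply sentence case (an assumed lexical rule)."""
    words = " ".join(terms[component] for component in order)
    return words[:1].upper() + words[1:].lower()


terms = {
    "qualifier": "Trading partner",
    "object_class": "Country",
    "property": "Identifier",
    "representation": "Name",
}

# Assumed syntax rule: qualifier, then object class, then representation term.
name = build_name(terms, order=("qualifier", "object_class", "representation"))
```

Under these assumed rules, `name` is "Trading partner country name", matching the slide.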
Part 6 - Registration
Meta Data Registration Principles
• Non-exclusive registration: Every organization may be a Registration Authority.
• Data sharing registration: Data may be shared intra- or inter-organizationally.
• Economically enforced registration: Utility determines longevity and usefulness.
• Flexible registration: Meta data may be registered at different levels of quality.
Registration Status
• Retired
• Standardized
• Certified
• Recorded
• Incomplete
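The status levels lend themselves to an ordered enumeration. The Python below is a sketch under assumptions: the slide lists the five statuses but not their numeric order or any quality threshold, so the ordering from Incomplete up to Standardized, the treatment of Retired as no longer usable, and the helper `meets` are all hypothetical.

```python
from enum import IntEnum


class RegistrationStatus(IntEnum):
    # Assumed progression of quality levels, lowest first.
    INCOMPLETE = 1
    RECORDED = 2
    CERTIFIED = 3
    STANDARDIZED = 4
    RETIRED = 5  # assumed end of life, not a higher quality level


def meets(status: RegistrationStatus, minimum: RegistrationStatus) -> bool:
    """True if an item is still active and at least at the required level."""
    if status is RegistrationStatus.RETIRED:
        return False
    return status >= minimum
```

For example, `meets(RegistrationStatus.CERTIFIED, RegistrationStatus.RECORDED)` is true, while a retired item never meets any minimum in this sketch.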
X3.285 - Metamodel
• Promote sharing of metadata for
  • understanding (meaning, representation, identification)
  • discovery
  • harmonization
  • reuse
  • analysis
• Provide a common base for metadata registries
  • management structure
  • components for interchange
Metamodel Regions
• Stewardship
• Data Element Administration
• Data Element Concept Administration
• Conceptual & Value Domain Administration
• Naming & Identification
• Classification
Data Element Model
[Diagram. Labels shown: Data Element; Data Element Concept; Object Class; Property; Conceptual Domain; Value Meaning; Permissible Values; Data Value Domain; Representation Class; Data Element Representation.]
[Diagram: UML class diagram of value domains and conceptual domains]
• Data Element Concept
• Representation Class: representation class name
• Data Value Domain: value domain name, value domain character set name, value domain minimum character quantity, value domain maximum character quantity, value domain dependency description, value domain format
• Conceptual Domain: conceptual domain identifier
• Enumerated value domain: contains 2..n Permissible Values
• Enumerated conceptual domain: contains 2..n Value Meanings
• Permissible Value: permissible value label, permissible value begin date, permissible value end date
• Value Meaning: value meaning identifier (VMID), value meaning descriptor, value meaning begin date, value meaning end date
• Permissible Value and Value Meaning are linked by a means/represents association.
Future Extensions & Work
• Promotion of X3.285 to an ISO standard
• Completion of TR 15452 - Data Value Domains
• XML tags
• Content consistency
• Extended classification/terminology support
• Object extensions
DTR 15452 - Specification of Data Value Domains
Definition: Value Domain
• A set of permissible values.
• Types:
  • Enumerated
    • Countries of the world
  • Non-enumerated
    • All real numbers between 0 & 1
    • 17-character alphanumeric
    • YYYYMMDD
DTR = Draft Technical Report
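The distinction between the two types can be shown in code: an enumerated domain lists its values, while a non-enumerated domain is described by a rule. The Python below is a sketch, not taken from the Technical Report. The class and function names are hypothetical, the country list is a tiny illustrative subset, and treating "between 0 & 1" as exclusive of the endpoints is an assumption.

```python
import re
from datetime import datetime
from typing import Any, Callable, Iterable


class EnumeratedDomain:
    """Value domain defined by listing every permissible value."""

    def __init__(self, values: Iterable[str]) -> None:
        self.values = frozenset(values)

    def contains(self, value: Any) -> bool:
        return value in self.values


class NonEnumeratedDomain:
    """Value domain defined by a description and a membership rule."""

    def __init__(self, description: str, rule: Callable[[Any], bool]) -> None:
        self.description = description
        self.rule = rule

    def contains(self, value: Any) -> bool:
        return bool(self.rule(value))


def is_yyyymmdd(value: Any) -> bool:
    # Require exactly eight ASCII digits, then check it is a real calendar date.
    if not (isinstance(value, str) and re.fullmatch(r"[0-9]{8}", value)):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


# Illustrative subset only, not a full list of countries.
countries = EnumeratedDomain({"Canada", "Japan", "Mexico"})
unit_interval = NonEnumeratedDomain(
    "All real numbers between 0 and 1 (endpoints excluded by assumption)",
    lambda v: isinstance(v, float) and 0.0 < v < 1.0,
)
alnum17 = NonEnumeratedDomain(
    "17-character alphanumeric",
    lambda v: isinstance(v, str) and re.fullmatch(r"[A-Za-z0-9]{17}", v) is not None,
)
date_domain = NonEnumeratedDomain("YYYYMMDD", is_yyyymmdd)
```

Both kinds of domain expose the same `contains` check, so code that validates data element values does not need to know which type it is dealing with.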
Value Domain Examples
• Geographic codes
• Chemical names
• Biological classification
The Problem
How can data values be mapped among representations so that the equivalent semantic meaning is determined, even if the language, format or character set of the representations differ?
The Benefits
The sharing and reuse of data through equivalent data values will allow information to be exchanged faster and more efficiently.
Sets of reusable domain values, with unique identifiers assigned, eliminate the need for exact representation matches.
Scope of the TR
• Attributes for identification, specification, development and reuse of data value domains for data elements.
• Assigning a unique identifier to each value within a domain.
• Defining a data element conceptual domain and describing mappings between the values of a conceptual domain and the values of each representational data value domain.
• Defining reuse of value domains among data elements.
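The mapping idea can be sketched in a few lines: each value meaning in the conceptual domain gets a unique identifier, and each representational value domain maps its own labels to those identifiers. The Python below is illustrative only. The identifiers `VM1` and `VM2`, the dictionary layout, and the function `translate` are assumptions; the two-letter codes are the ISO 3166 alpha-2 codes for the two countries.

```python
from typing import Dict

# Conceptual domain: value meaning identifier -> descriptor (identifiers are hypothetical).
country_meanings: Dict[str, str] = {
    "VM1": "The country Japan",
    "VM2": "The country Mexico",
}

# Value domains: permissible value label -> value meaning identifier.
english_names: Dict[str, str] = {"Japan": "VM1", "Mexico": "VM2"}
alpha2_codes: Dict[str, str] = {"JP": "VM1", "MX": "VM2"}
spanish_names: Dict[str, str] = {"Japón": "VM1", "México": "VM2"}


def translate(value: str, source: Dict[str, str], target: Dict[str, str]) -> str:
    """Map a value from one representation to another via its value meaning."""
    meaning_id = source[value]  # raises KeyError if the value is not permissible
    for label, target_meaning_id in target.items():
        if target_meaning_id == meaning_id:
            return label
    raise KeyError(f"no value in target domain for {meaning_id}")
```

For example, `translate("JP", alpha2_codes, spanish_names)` yields "Japón". The two representations differ in language and format, yet no direct mapping between them is needed because both refer to the same value meaning identifier.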
[Diagram: the UML class diagram of Data Element Concept, Representation Class, Data Value Domain, Conceptual Domain, Permissible Value, and Value Meaning, annotated with levels]
Conceptual Level: Object class and Property
Logical Level: Representation with addition of qualifier, Application Level
Conclusion
Application of all principles of the ISO 11179 family to the development of meta data registries allows easy and effective exchange of data and meta data nationally and internationally.