Training Course for Developing Data Standards. Information Management Division, ARD-300. July 2009. Developing Data Standards Course Objectives. When you complete this course, you will be able to: Define metadata Describe the major components of ISO/IEC 11179
Training Course for Developing Data Standards Information Management Division, ARD-300 July 2009
Developing Data Standards Course Objectives When you complete this course, you will be able to: • Define metadata • Describe the major components of ISO/IEC 11179 • Create Administered Items in the FAA Data Registry
Developing Data Standards Intended Audience This course is intended for individuals who will play a role in standardizing data, namely data stewards and other technical staff with data management responsibilities (e.g., Data Architects and Data Modelers). Some knowledge of sound data management practices is therefore assumed. For more information on the FAA data standardization process, refer to the FAA Data Standardization Handbook (FAA-HDBK-007). FAA-HDBK-007 and other related standards can be found at http://ato-p.se-apps.faa.gov/faastandards/FAADocs.htm
Developing Data Standards COURSE OUTLINE
• Introduction
• Lesson 1: ISO 11179
• Lesson 2: FDR Implementation of ISO-11179
• Lesson 3: Creating Data Element Names
• Lesson 4: Getting Started
• Lesson 5: Using the FDR
• Reference Materials
Developing Data Standards Introduction: Data Standards at the FAA. The future of aviation depends on modernized and highly unified services to maintain safe, secure, and efficient flight in the face of current and future challenges in the global aviation system and the aviation industry. In view of the critical nature of the FAA’s mission, the quality and reliability of its information and information systems are of the utmost importance. Data are the fundamental components of information and are considered critical resources. To ensure the quality and reliability of the data resources, data must be understood by all areas requiring the data. Data must be consistently described or standardized to support uniform identification, definition, classification, management, protection, and interchange of data elements and other data concepts. Data standards support the sharing and exchange of data throughout the agency, as well as with other agencies, the international aviation community, and the public.
Developing Data Standards Introduction (Continued): Data Standards at the FAA. Data standards facilitate discovery, understanding, and sharing of data: • They enable transparency and understanding – use of standards promotes common, clear meanings for data that is often reused • They enable access – the same well-understood terms, codes, and data structures can be used for data retrieval • They encourage and enable reuse of data and software for multiple purposes • Mappings to standards allow comparisons even when data isn’t standardized • They provide consistent results during data retrieval • They give managers and staff the ability to manage metadata from multiple systems throughout the systems’ lifecycles • They enable modernization of systems and support international data harmonization efforts
Developing Data Standards LESSON 1 - ISO 11179 In this lesson, we will: • Define metadata • Introduce ISO (International Organization for Standardization) • Describe the ISO/IEC 11179 Metadata Standard • Discuss the benefits of using ISO/IEC 11179 • Define ISO/IEC 11179 terms and definitions
ISO 11179 Lesson 1: What is Metadata? • Metadata is data about data. The concept of metadata is often confusing, partly because it lacks a clear definition. • Metadata is a type of data that describes and defines other data, but what makes it different from data instances is how it is used. It is found in documents, messages, images, sound streams, and videos. The term metadata also refers to data that are used to describe a data set, such as the content, quality, and condition of data. It is the information that answers questions like: • Who owns the data? • What is the meaning of the data? • How was the data collected? • How is the data named and identified? • How is the data represented? • It is a set of facts about data and other information elements. It is everything except for the data itself, and it is undeniably important.
ISO 11179 Lesson 1: Metadata is… • Metadata is a Communications Enabler • Metadata must describe Concept as well as Format (representation) • E.g., consider the data instance “115.5” – it’s a number, but what does it mean? • 115.5 • Frequency: 115.5 • VOR Frequency: 115.5 • VOR Frequency in MHz: 115.5 • Meaning becomes clearer as metadata is added • Metadata is a resource! • FAA data is a resource that must be managed from an enterprise perspective, and treated as an agency-wide asset. • FAA is responsible for the timeliness, accuracy, understandability, availability, and security of the data under its stewardship.
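The "115.5" progression on this slide can be sketched in a few lines of Python. This is only an illustration: the class name `DescribedValue` and its fields are assumptions made for this example and are not part of the FDR or of ISO/IEC 11179. The unit used below is MHz, since VOR stations operate in the 108 to 117.95 MHz band.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DescribedValue:
    """A raw value plus optional metadata (hypothetical names, for illustration)."""
    value: float
    property_name: Optional[str] = None   # e.g. "Frequency"
    object_class: Optional[str] = None    # e.g. "VOR"
    unit: Optional[str] = None            # e.g. "MHz"

    def describe(self) -> str:
        # Build the label from whatever metadata is present.
        parts = [p for p in (self.object_class, self.property_name) if p]
        label = " ".join(parts)
        if label and self.unit:
            label += f" in {self.unit}"
        return f"{label}: {self.value}" if label else str(self.value)


bare = DescribedValue(115.5)
full = DescribedValue(115.5, property_name="Frequency",
                      object_class="VOR", unit="MHz")

print(bare.describe())  # the number alone, with no meaning attached
print(full.describe())  # the same number, now interpretable
```

Each added field narrows what the number can mean, which is the point the slide makes about metadata as a communications enabler.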
ISO 11179 Lesson 1: Introduction to ISO ISO: International Organization for Standardization • ISO is a non-governmental network of the national standards institutes of 151 countries • ISO develops International Standards • ISO standards specify the requirements for products, services, processes, materials, systems, and conformity assessment • For example: mathematics, manufacturing, electrical, mechanical, and civil engineering, imaging, electronics, and information technology • http://www.iso.org
ISO 11179 Lesson 1: The ISO/IEC 11179 Standard • ISO/IEC 11179 specifies the kind and quality of metadata necessary to describe data, and it specifies the management and administration of that metadata in a metadata registry (MDR). • ISO/IEC 11179 applies to the formulation of data representations, concepts, meanings, and relationships between them, independent of the organization that produces the data. • ISO/IEC 11179 does NOT apply to the physical representation of data as bits and bytes at the machine level. • ISO/IEC 11179 provides the rules and guidelines for naming and identification of data elements. It defines the identifying attributes, describes the relationship of the attributes to each other and includes principles by which naming conventions can be developed. • ISO Standard: http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html
ISO 11179 Lesson 1: Benefits of Using ISO 11179 • The use of data standards for data capture and reporting facilitates the understanding and sharing of information. • Though complex, the ISO/IEC 11179 standard offers a richly expressive model for metadata that fully supports the variations needed for FAA applications.
ISO 11179 Lesson 1: Benefits of Using Metadata Registries • A Data Registry is a "storage container" for important information about data, called metadata. The metadata includes: • Semantic information that can help a user understand the data's meaning • Representational information that specifies the form of presentation; e.g., format, length, permitted values, and data type • Naming and identification information • Specifications for standard data to be incorporated in new and reengineered systems to promote data interchange • Administrative information about the metadata steward and the quality of the metadata • It is a tool designed to: • Promote uniform data documentation • Foster information integration and sharing • Support implementation of data standards • Follow international standards in its design • It does not contain the actual data from information systems but rather the information about that data – the metadata that enables a user to better understand, access, and share a system’s information.
ISO 11179 Lesson 1: Benefits of Using Metadata Registries • It serves as a: • Repository for standard data elements with names, definitions, and format information for system developers to use • Repository for standard code sets (also known as permissible or valid value lists) • Data management tool • It enables architects to analyze information and to reduce redundancies, which supports reuse of data. • It organizes data elements by application systems, communities of interest, and common concepts. The data elements are uniquely defined and identified so that information can be shared throughout the enterprise and integrated across enterprise systems.
ISO 11179 Lesson 1: ISO 11179 Terms and Definitions
• Object Class - A thing or abstraction in the real world that is desirable to model. It is much like an “entity” in relational terms. (For example: Person, Airport, Aircraft, Facility, etc.)
• Property - A peculiarity common to all members of an Object Class. It is much like an “attribute” in relational terms, with the important exception that a Property does not have a specified representation. (For example: Elevation, Location, ID, First Name, Last Name, Address, etc.)
• Data Element Concept - An idea that can be represented in the form of a data element, described independently of any particular representation.
• Value Domain - A set of attributes describing representational characteristics of instance data, with or without permissible values.
• Data Element - A unit of data for which the definition, identification, representation, and permissible values are specified by means of a set of attributes.
• Context - The ISO/IEC 11179 standard defines a context as a “designation or description of the application environment or discipline in which a data standard is applied or from which it originates.” A context may be a business domain, an agency, an information subject area, an information system, a database, file, data model, standard document, or any other environment.
• Conceptual Domain - A set of possible value meanings of a data element, expressed without representation.
• Value Meaning - A member of the finite set of allowed notions that make up a conceptual domain.
• Permissible Value - An expression of a value meaning in a specific value domain.
ISO 11179 Lesson 1: ISO Terms and Definitions (cont’d)
Example from the diagram:
• Context: FAA
• Object Class: AIRCRAFT
• Property: SurveillanceEquipmentCategory
• Data Element Concept: Aircraft_SurveillanceEquipmentCategory
• Conceptual Domain: List-Designator-SurveillanceCapability
• Value Meanings: Transponder - Mode A; Transponder - Mode A & Mode C; Automatic Dependent Surveillance; Transponder - Mode S; Nil (no transponder); ...
• Value Domain: SurveillanceEquipmentCategory_Code-R001
• Valid Values: A, C, D, I, N, ...
• Data Element: Aircraft_SurveillanceEquipmentCategory_Code-R001
• Classification Schemes (no example shown)
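The relationships among these terms can be sketched as a small Python model, populated with the Aircraft example from this slide. This is a minimal sketch, not the FDR schema: the class and field names are assumptions chosen for illustration, and only the codes actually shown on the slide are included (the slide's list is elided with "...").

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DataElementConcept:
    """What the data is: an Object Class paired with a Property."""
    object_class: str
    property_name: str
    conceptual_domain: str


@dataclass(frozen=True)
class ValueDomain:
    """How the data is represented, including any permissible values."""
    name: str
    conceptual_domain: str
    permissible_values: Tuple[str, ...] = ()

    def permits(self, value: str) -> bool:
        return value in self.permissible_values


@dataclass(frozen=True)
class DataElement:
    """A Data Element Concept combined with a Value Domain, within a Context."""
    context: str
    concept: DataElementConcept
    value_domain: ValueDomain

    def is_valid(self, value: str) -> bool:
        return self.value_domain.permits(value)


concept = DataElementConcept(
    object_class="Aircraft",
    property_name="SurveillanceEquipmentCategory",
    conceptual_domain="List-Designator-SurveillanceCapability",
)
domain = ValueDomain(
    name="SurveillanceEquipmentCategory_Code-R001",
    conceptual_domain="List-Designator-SurveillanceCapability",
    # Only the codes visible on the slide; the full list is not given there.
    permissible_values=("A", "C", "D", "I", "N"),
)
element = DataElement(context="FAA", concept=concept, value_domain=domain)

print(element.is_valid("C"))
print(element.is_valid("not-a-code"))
```

Keeping the concept and the value domain as separate objects mirrors the standard's separation of meaning from representation: the same concept could be paired with a different value domain without being redefined.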
Developing Data Standards LESSON 2 - FDR IMPLEMENTATION OF ISO-11179 In this lesson we will: • Introduce the notion of a Data Element • Identify the fundamental components of a Data Element • Explain how Data Elements are formed • Discuss metadata attributes of a Data Element
FDR Implementation of ISO-11179 Lesson 2: Data Element Overview • A Data Element is a unit of data for which the definition, identification, representation, and permissible values are specified by means of a set of attributes • A Data Element is common to the community within a Context, and can be used across Contexts • A Data Element is the combination of a Data Element Concept and a Value Domain
FDR Implementation of ISO-11179 Lesson 2: Data Element Fundamentals
• Data Element Concept + Value Domain = Data Element
• Data Element Concept (What is it?): Object Class and Property
• Value Domain (How do you want to represent it?): Representation
• Example: Person Self Reported Age (Data Element Concept) + Age Value (Value Domain) = Person Self Reported Age Value (Common Data Element)
FDR Implementation of ISO-11179 Lesson 2: How Data Elements are Formed
Data Elements*
• Conceptual Component (Data Element Concept*)
    • Object Class*
    • Property*
    • Conceptual Domain*
• Representational Component (Value Domain*)
    • Conceptual Domain*
    • Permissible Values / value domain description
    • Other details describing how the value is represented, exchanged, and interpreted (format, encoding, units of measure, precision)
Data Elements and their subcomponents must also belong to a unique domain (Context*), and may also be classified into categories (Classification Scheme Items belonging to one or more Classification Schemes*).
*All these objects are referred to in ISO 11179 as “Administered Items”, and all have a common set of core metadata or attributes that the ISO metadata registry standard recommends for collection and registration.
FDR Implementation of ISO-11179 Lesson 2: Data Element Attributes
Metadata common to all administered items, including Data Elements:
• Identification & descriptive information
    • Name, Definition, Version, Unique Registration Identifier, etc.
• Administrative information
    • Information about the Steward, Submitter, Registrar, Registration and Administrative Status, etc.
• Classification information
    • An administered item may be classified into one or more Classification Scheme items (nodes within a Classification Hierarchy)
• Other referential and relationship information
    • Associated reference documents, relationships to other administered items for complex or structured items
Some types of administered items have attributes unique to that type, e.g.:
• Value Domains
    • Permissible Values, Format, Length, perhaps even Max/Min Values, Units of Measure, etc.
• Data Element Concepts and Value Domains
    • Allow identification of an associated Conceptual Domain
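One way to picture the split between common and type-specific attributes is a base class with a specialized subclass. The Python sketch below is illustrative only: the field names are assumptions loosely based on the attribute names on this slide, not the FDR's actual record layout, and every value in the example instance is a placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AdministeredItem:
    """Metadata common to every administered item (field names are illustrative)."""
    # Identification and descriptive information
    name: str
    definition: str
    version: str
    registration_id: str
    # Administrative information
    steward: str
    submitter: str
    registration_status: str
    # Classification and referential information
    classification_items: List[str] = field(default_factory=list)
    reference_documents: List[str] = field(default_factory=list)


@dataclass
class ValueDomainItem(AdministeredItem):
    """A Value Domain adds attributes that only apply to that item type."""
    data_format: Optional[str] = None
    max_length: Optional[int] = None
    unit_of_measure: Optional[str] = None
    permissible_values: List[str] = field(default_factory=list)


# All values below are placeholders for illustration, not registry content.
vd = ValueDomainItem(
    name="SurveillanceEquipmentCategory_Code-R001",
    definition="(placeholder definition)",
    version="(placeholder)",
    registration_id="(placeholder)",
    steward="(placeholder)",
    submitter="(placeholder)",
    registration_status="(placeholder)",
    max_length=1,  # assumed from the single-character codes in Lesson 1
    permissible_values=["A", "C", "D", "I", "N"],
)

print(isinstance(vd, AdministeredItem))
```

Because every item type inherits the same core fields, tooling that works on names, definitions, and stewardship can treat all administered items uniformly.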
Developing Data Standards LESSON 3 – CREATING DATA ELEMENT NAMES In this lesson, we will: • Discuss variations in data element formation • Discuss data element component naming conventions
Creating Data Element Names Lesson 3: Example of Forming a Data Element What is a Data Element? • A way to represent information • Can emerge from many different types of information models • “Albright” is an “Instance” of “Last Name” • “Last Name” is the foundation of our data element (Based on ISO 11179 Part 1)
Creating Data Element Names Lesson 3: Example of Forming a Data Element Another way to look at where Data Elements live in each type of model (Based on ISO 11179 Part 1)
Creating Data Element Names Lesson 3: Naming Convention
• Data Element Concept + Value Domain = Data Element
• Data Element Concept: Object Class and Property
• Value Domain: Representation Class (“Core Term”)
• Concept Name = ObjectClassName_PropertyName
• Value Domain Name = PropertyName_CoreTerm-Rnnn, where nnn is added for uniqueness
• Data Element Name = ConceptName_ValueDomainName, which is equivalent to ObjectClassName_PropertyName_CoreTerm-Rnnn
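The naming convention can be expressed as three small Python functions. This is a sketch under stated assumptions: the function names are invented for illustration, and `nnn` is treated as a zero-padded three-digit number, which matches the `R001` suffix seen in Lesson 1 but is not spelled out on the slide. Note that the data element name follows the "equivalent" form, in which the property name appears once rather than being repeated from both component names.

```python
def concept_name(object_class: str, prop: str) -> str:
    """Concept Name = ObjectClassName_PropertyName."""
    return f"{object_class}_{prop}"


def _suffix(core_term: str, sequence: int) -> str:
    # Assumption: nnn is a zero-padded number between 0 and 999.
    if not 0 <= sequence <= 999:
        raise ValueError("sequence must fit in three digits")
    return f"{core_term}-R{sequence:03d}"


def value_domain_name(prop: str, core_term: str, sequence: int) -> str:
    """Value Domain Name = PropertyName_CoreTerm-Rnnn."""
    return f"{prop}_{_suffix(core_term, sequence)}"


def data_element_name(object_class: str, prop: str,
                      core_term: str, sequence: int) -> str:
    """Data Element Name = ObjectClassName_PropertyName_CoreTerm-Rnnn."""
    return f"{concept_name(object_class, prop)}_{_suffix(core_term, sequence)}"


print(concept_name("Aircraft", "SurveillanceEquipmentCategory"))
print(value_domain_name("SurveillanceEquipmentCategory", "Code", 1))
print(data_element_name("Aircraft", "SurveillanceEquipmentCategory", "Code", 1))
```

With the Lesson 1 inputs, the three calls reproduce the concept, value domain, and data element names shown in that example.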
Creating Data Element Names Lesson 3: Naming Convention (cont’d)
[Diagram: a Conceptual Domain classifies both the Data Element Concept and the Value Domain. Data Element Concept (What is it?: Object Class, Property) + Value Domain (How do you want to represent it?: Representation) = Data Element. Database DDL information shown alongside: Table Name, Column Name, Data Type, Data Length.]
Developing Data Standards LESSON 4 – GETTING STARTED In this lesson, you will learn: • The value in forming a Data Standardization Work Group • How to gather metadata: the sources, definition, format, and other technical, administrative, and business information that identifies and describes the data • How a data modeling approach will assist in developing data standards in a consistent way • How to map from a logical data model to ISO 11179 concepts • How to refine and format the metadata, including names, definitions, and other attributes • How to prepare spreadsheets for importing large numbers of administered items into the FDR
Getting Started Lesson 4: Form a Work Group • A Work Group is essential to data standardization to support the collection and documentation of metadata for our complex data exchange environments. • A Work Group should consist of the following types of personnel: • A data modeler (someone who can facilitate development of a conceptual data model) • One or more subject matter experts (those who can provide meanings and business rules that need to be captured) • A database administrator (someone who can provide a data dictionary and relationships of data elements that are exchanged) • A data manager (someone who has knowledge of data management practices, can locate sources for metadata collection, and has knowledge of the FAA Data Registry)
Getting Started Lesson 4: Gather the Key Metadata Gather metadata about the shared data, i.e., the sources, definition, format, and other technical, administrative, and business information that identifies and describes the data. All required administered items for each data element need to be documented (for additional guidance, refer to FAA-HDBK-007). The following are examples of useful resources that may be used to extract the required metadata: • Available models (ERM, UML, etc.) covering shareable data over key interfaces • Data Dictionaries (details on views, entities, and their attributes) • Specifications and Interface Control documentation, MOAs, SLAs, etc. • When possible and appropriate, use previously standardized items for conceptual domains, object classes, properties, value domains, etc. • Business documents
Getting Started Lesson 4: Use of a Data Model Although it is not required, it is recommended to develop a data model that groups the items into conceptual objects and captures the relationships between those objects. The FAA Data Architecture may be used for guidance in establishing higher-level conceptual data objects (e.g., entities and attributes, object classes and properties) to be used in the model. Get consensus among all stakeholders on the data elements and the model. This model provides the required unique data element names. [Diagram labels: Metadata for key shareable data; Data Model; Other agency resources, FAA Data Architecture, Enterprise Information architecture; Stewards, data suppliers, SMEs]
Getting Started Lesson 4: Manually Enter or Import Items into FDR Have the finalized metadata entered by hand or imported into the FAA Data Registry. • Manual entry is used when only a few administered items are being recorded. • Import is used for recording many administered items. When there are a large number of items, prepare a spreadsheet for importing the data into the registry.
Getting Started Lesson 4: Sample Spreadsheet Used for Import
Columns on import sheet include:
• Context name
• Context definition
• Object Class name
• Object Class definition
• Property name
• Property definition
• Data Element Concept name
• Data Element Concept definition
• Value Domain name
• Value Domain definition
• Conceptual Domain name
• Conceptual Domain definition
• Data Element name
• Data Element definition
• Enumeration type (for Conceptual Domains and Value Domains)
Additional sheets may be required for separately importing:
• Permissible values (codes) for enumerated Value Domains
• Value Meanings (for creating enumerated Conceptual Domains and for mapping to the permissible values of enumerated Value Domains)
Note: one Data Element usually gives rise to only one row in the import sheet. Upon import, that row will specify or create the other dependent administered items identified in the row and make the necessary relationships among those items.
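As a rough illustration of the import sheet, the Python sketch below writes one row per Data Element using the columns listed on this slide, in that order. Several things here are assumptions rather than FDR requirements: the function name, the presence of a header row, and the exact column header text. Cells not supplied are left empty, and the definitions in the example row are placeholders.

```python
import csv
import io
from typing import Dict, Iterable, List

# Column order as listed on the slide; the header strings are assumed.
COLUMNS: List[str] = [
    "Context name", "Context definition",
    "Object Class name", "Object Class definition",
    "Property name", "Property definition",
    "Data Element Concept name", "Data Element Concept definition",
    "Value Domain name", "Value Domain definition",
    "Conceptual Domain name", "Conceptual Domain definition",
    "Data Element name", "Data Element definition",
    "Enumeration type",
]


def rows_to_tab_delimited(rows: Iterable[Dict[str, str]]) -> str:
    """Render import rows as tab-delimited text, one Data Element per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, delimiter="\t",
                            restval="", lineterminator="\n")
    writer.writeheader()      # assumption: the import accepts a header row
    writer.writerows(rows)    # missing cells are written as empty strings
    return buffer.getvalue()


row = {
    "Context name": "FAA",
    "Object Class name": "Aircraft",
    "Property name": "SurveillanceEquipmentCategory",
    "Data Element Concept name": "Aircraft_SurveillanceEquipmentCategory",
    "Value Domain name": "SurveillanceEquipmentCategory_Code-R001",
    "Conceptual Domain name": "List-Designator-SurveillanceCapability",
    "Data Element name": "Aircraft_SurveillanceEquipmentCategory_Code-R001",
    "Data Element definition": "(placeholder definition)",
}

text = rows_to_tab_delimited([row])
print(text)
```

A single row names the Data Element together with the Context, Object Class, Property, Data Element Concept, Value Domain, and Conceptual Domain it depends on, matching the note above that one Data Element usually produces one row.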
Developing Data Standards LESSON 5 - USING THE FDR In this lesson, you will learn: • How to access the FDR • How to look up information in the FDR • How to enter new information into the FDR • How to import into the FDR
Using the FDR Lesson 5: FDR – Home Page (fdr.gov) • Administered Items: link for any visitor to query standardized items in the FAA Data Registry • FDR Library: FDR Detailed Rules of Behavior and FDR Rules of Behavior Acknowledgement, required for getting an FDR account • Link to context-sensitive help for the current page • Notices and Alerts: alerts of planned maintenance and other usage notes • Link to the Sign In page for submitters and stewards with FDR accounts • FDR Library: Frequently Asked Questions (FAQ) provides steps for getting access to the FDR and other useful answers • Standards Organizations: provides a link to the ISO/IEC 11179 standard • FDR Library: contains the FAA Data Standardization Handbook and other important documents
Using the FDR Lesson 5: Getting an FDR Account (for Submitters and Stewards) • Download and read the FDR’s Rules of Behavior • Sign and deliver the FDR’s Rules of Behavior Acknowledgement • Contact the FDR Registrar to establish specific access privileges
Using the FDR Lesson 5: FDR – Signing In • Link to the Sign In page for submitters and stewards with FDR accounts • Authenticated users have a Home page with different functions and privileges • Tip: The Home link is available on most screens to return users to their home page. • Tip: The center panel on the Home page contains links to your “Favorite” objects. These are also available from any screen using the Favorites tool bar.
Using the FDR Lesson 5: Querying Data Standards with the Filter Screen “Administered Items”, the most frequently used favorite, produces a filter screen for querying metadata in the FDR. Note: Querying Administered Items is available to authenticated users as well as guest users (i.e., visitors who have not signed in). Guests are limited to read-only access to “Standardized” data in the FAA context. Authenticated users may also have create and update access to data not yet standardized and possibly in other contexts.
Using the FDR Lesson 5: Querying Data Standards with the Filter Screen • Dropdowns are available to select items from short lists of choices • Only items containing the text entered in textbox fields will be included in filtered results • Popup windows are used for some list items when a dropdown is not practical or feasible • Additional fields are available for advanced filtering
Using the FDR Lesson 5: Querying Data Standards with the Filter Screen • Control the number of rows in result screens • Numbers, triangles, and arrows are used to navigate to other subsets of the results • Edit icon to view or edit details of an item in the results list • 60 items returned for the query (Standardized Data Elements with “weather” in the name)
Using the FDR Lesson 5: Manually Entering New Items Earlier lessons illustrated how to collect, refine, and name the metadata needed to specify the shareable FAA data that must be registered. We also showed how to get an FDR account with the needed privileges for your specific data standardization activities. When there are only a few items to be entered, manual entry into the FDR is recommended.
Using the FDR Lesson 5: Manually Entering New Items • Select the type of admin item to create • Clone a similar existing admin item, or leave blank to create a brand new admin item of that type • Selecting “Next” will produce an admin item entry form for the key details needed for that type of admin item (see next page for details)
Using the FDR Lesson 5: Manually Entering New Items • Attributes that are required are marked with * or ** (the latter indicates the field is part of the record’s primary key) • Attributes that are common to all administered items (like Name, Definition, etc.) always appear in the upper portion of an entry form • Text boxes, dropdowns, and pop-ups are provided for metadata details that can be entered. Some attributes are not editable or must be changed using another form • Attributes that are unique to the specific administered item being created appear below the common attributes • For Data Elements, one can choose to reuse an existing Data Element Concept and/or Value Domain, or select radio buttons to Create New items
Using the FDR Lesson 5: Tips for Importing New Items • In Lesson 4, a sample import spreadsheet was discussed. It is recommended that the columns be in the stated order. • The text fields in the spreadsheet should be stripped of any special word processing codes and symbols, e.g., bullets, tabs, and carriage returns. • Remove double quote characters, as these will interfere with import processing. • The final import spreadsheet should be saved as tab-delimited text. • If any names, definitions, comments, etc., contain international characters or other special characters (supported by the UTF-8 character set), the text file must be edited and resaved in UTF-8 character encoding. • If, after import, certain fields contain unusual box-like symbols or inverted question marks, the import file was not thoroughly stripped of word processing codes, or a UTF-8 encoded text file was not used for the final import. • Imported items may still be corrected manually if any corrections or additions are needed to finalize the administered items.
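The clean-up tips above can be sketched in Python as follows. This is an illustrative sketch, not an FDR utility: the function names are invented, and treating curly “smart” quotes the same as straight double quotes is an assumption that goes slightly beyond what the slide states.

```python
import re
from pathlib import Path
from typing import Iterable, Sequence

# Straight double quotes, curly double quotes (assumed), and bullet symbols.
_STRIP_CHARS = re.compile('["\u201c\u201d\u2022]')
# Tabs, carriage returns, line feeds, and runs of spaces.
_WHITESPACE = re.compile(r"\s+")


def clean_field(text: str) -> str:
    """Remove quote and bullet characters, then collapse whitespace to single spaces."""
    text = _STRIP_CHARS.sub("", text)
    return _WHITESPACE.sub(" ", text).strip()


def write_import_file(path: Path, rows: Iterable[Sequence[str]]) -> None:
    """Write cleaned rows as tab-delimited text in UTF-8 encoding."""
    lines = ["\t".join(clean_field(cell) for cell in row) for row in rows]
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")


messy = '\u2022 "Aircraft"\tsurveillance\r\ncategory'
print(clean_field(messy))
```

Cleaning each cell before joining with tabs matters because a stray tab or line break inside a definition would otherwise shift or split columns in the import file.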
Using the FDR Lesson 5: Importing Data Using a Spreadsheet • Browse to, or hand enter, the import filename • Select the Delimiter (usually Tab), the Import Type, the Text Qualifier (usually double quote (“)), and other parameters about the import text file • The import file is a UTF-8 encoded, tab-delimited text version of the import sheet • Selecting “Next” will import the administered items and display a log of the import on the screen. Check the log to ensure all the desired items were imported.
Using the FDR Lesson 5: Steward Assignment Steward Assignment and Admin Record produces a filter screen much like querying Administered Items. Fill in the appropriate fields to display the records that need to be updated.
Using the FDR Lesson 5: Steward Assignment • Multiple records may be selected using check boxes • Bulk Update may be used to set a common attribute in multiple records
Using the FDR Lesson 5: Finishing Touches Additional metadata such as Alternate Names, Reference Documents, and Related Elements are specified here.
Acknowledgements Portions of this training course were derived with permission from other Federal government agencies, namely the Environmental Protection Agency and National Institutes of Health (National Cancer Institute): National Cancer Institute http://ncicb.nci.nih.gov/NCICB/training/cadsr_training Environmental Protection Agency http://iaspub.epa.gov/sor_internet/registry/sysofreg/home/overview/home.do
Developing Data Standards Reference Documents
• FAA Information/Data Management Order, 1375.1D: http://www.faa.gov/documentLibrary/media/Order/1375.1D.pdf
• FAA-STD-025, Preparation of Interface Documentation: http://ato-p.se-apps.faa.gov/faastandards/
• FAA-STD-060, Data Standard for the National Airspace System: http://ato-p.se-apps.faa.gov/faastandards/
• Federal Data Registry (FDR): http://www.fdr.gov/
• FAA Data Standardization Handbook (FAA-HDBK-007): http://ato-p.se-apps.faa.gov/faastandards/FAADocs.htm
• Federal Data Registry’s User Guide: http://www.fdr.gov/ (integrated in FDR user help)
• ISO/IEC 11179 standard: Information Technology - Metadata Registries, Parts 1 - 6: http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html