Data Custodian Forum on Metadata. September 19th, 2012. Metadata and You. Nancy McQuillen. Agenda. The “Top Problem” for 2012: Clear Definitions. Review Definitions: Data vs. Metadata vs. Master Data. How can Metadata help? What types of metadata exist?
Data Custodian Forum on Metadata September 19th, 2012
Metadata and You Nancy McQuillen
Agenda • The “Top Problem” for 2012: Clear Definitions • Review Definitions: Data vs. Metadata vs. Master Data • How can Metadata help? • What types of metadata exist? • What questions can be answered with metadata? • Short-term goals: Clarifying essential UW business terms • Using TermPoint to submit and comment on Terms and Definitions • Requested Data Custodian inputs and assistance in 2012 • Forum Terminology Exercise – Instructions
Business question: How many students attended UW last quarter? Which students are meant? • Registered Students? • Enrolled Students? • Admitted Students? • Enrolled as of Census Day? • Fee-paying Students? Precision is required in questions, and in term definitions. For example: An Enrolled Student is one who had at least one active course registration on or after the first day of the quarter. A student who drops all courses after the first day of the quarter is still considered an Enrolled Student for that quarter.
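The Enrolled Student definition is precise enough to write down as a rule. The following is a minimal Python sketch, not a UW implementation: the `Registration` record, its field names, and the convention that a registration stops being active on its drop date are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Registration:
    """One course registration (hypothetical record layout)."""
    course: str
    added: date
    dropped: Optional[date] = None  # None means the course was never dropped


def is_enrolled_student(registrations: List[Registration], first_day: date) -> bool:
    """True if any registration was active on or after the first day of the quarter.

    Assumption: a registration is active from its add date up to,
    but not including, its drop date.
    """
    for reg in registrations:
        # Earliest date on or after first_day on which this registration could be active.
        start = max(reg.added, first_day)
        if reg.dropped is None or reg.dropped > start:
            return True
    return False
```

Under this sketch, a student who drops every course a week into the quarter still counts as enrolled, while a student who dropped everything before the first day does not.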
The Problem: “TERM-inal” Confusion Enrolled Student Home Department Donor Grant Runner Full-time Student Academic Year Principal Investigator Proposal type Cost Share Funder Type Fee Based Major Sponsor Federal Salary Cap Award Tenured Faculty Encumbered Salary Federal Expenditures Facility Carry Forward Adjunct Faculty Course Cohort Community-based Research Faculty FTE Indirect Costs Space Utilization Cooperative Agreement
Elements of Good Definitions A good definition is: • Precise: Use words that have a clear, specific meaning. Avoid words that have multiple meanings. • Distinct: The definition should differentiate a data element from other data elements. This process is called disambiguation. • Concise: Use the shortest description possible that is still clear. • Non-circular: Do not use the term you are trying to define in the definition itself. This is known as a circular definition.
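Two of these criteria, Concise and Non-circular, can be screened mechanically before a definition goes to human review. The sketch below is only a rough heuristic under stated assumptions: the function name `check_definition` and the 40-word limit are hypothetical, and it expects the defining phrase only (without a leading "An X is ..."). Precision and distinctness still need a reviewer.

```python
import re


def check_definition(term: str, definition: str, max_words: int = 40) -> list:
    """Return a list of problems found in a draft definition.

    Assumptions: max_words is an arbitrary illustrative limit, and
    'circular' means the full term appears as a phrase in the definition.
    """
    problems = []
    words = re.findall(r"[a-z0-9']+", definition.lower())
    term_words = re.findall(r"[a-z0-9']+", term.lower())

    # Non-circular: the term must not appear inside its own definition.
    n = len(term_words)
    if n and any(words[i:i + n] == term_words for i in range(len(words) - n + 1)):
        problems.append("circular")

    # Concise: flag definitions longer than the chosen word limit.
    if len(words) > max_words:
        problems.append("not concise")

    return problems
```

For example, a draft such as "a student who is an enrolled student" for the term "Enrolled Student" would be flagged as circular.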
Data, Metadata, and Master Data Some short informal definitions: • Data – Recorded facts, observations, writings • Metadata – Additional descriptors of the data • Master data – Important shared data sets
Library Example • Dewey Decimal System: Master data expresses the system used to classify the books. • Card Catalog: Metadata describes, identifies, and classifies each book. Metadata attributes: Author, Title, Subject. • Books: This is the data, the “information assets”.
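The three layers of the library example can be sketched as a small data structure. This is an illustrative Python sketch only: the `CatalogCard` class, its field names, and the two-entry class table are assumptions, and the table is a tiny subset of the Dewey main classes rather than the full system.

```python
from dataclasses import dataclass

# Master data: the shared classification system (illustrative subset only).
DEWEY_MAIN_CLASSES = {
    "500": "Natural sciences and mathematics",
    "800": "Literature",
}


@dataclass
class CatalogCard:
    """Metadata: describes, identifies, and classifies one book."""
    author: str
    title: str
    subject: str
    dewey_number: str  # links the card to the master classification

    def main_class(self) -> str:
        """Look up the card's main class in the master data."""
        key = self.dewey_number[0] + "00"
        return DEWEY_MAIN_CLASSES.get(key, "unknown")


# The book itself (its text) is the data; the card only describes it.
card = CatalogCard("Charles Darwin", "On the Origin of Species", "Evolution", "576.8")
```

The card carries the metadata attributes from the slide (author, title, subject), while the lookup table plays the role of master data shared by every card.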
Defining Metadata • The common short definition: Data about Data • A more comprehensive definition: Data, information, or knowledge about information assets • The purpose of metadata is to “improve the usability of an information asset throughout its life cycle”. (Source: Gartner, 2011)
Examples of Information Assets • Structured Data: Data sets; Reports • Semi-Structured (Text) Data: Web pages; Documents, Forms • Models and Diagrams: Process models; Data models
Data vs. Metadata Data without meaning and context are useless. Metadata provides meaning. (Figure callouts: “This is metadata.” Expense Object Name. “This is data.” “This is metadata (drawn from a master list of ‘Expense Objects’).”)
Metadata also describes Master Data • UW Master Data (examples): • People: Students, Employees • Finance Entities: Organizations, Budgets • Master data represents the most important “things” (usually nouns) of the enterprise. One definition is: • “Master data sets are synchronized copies of core business entities used in traditional or analytical applications across the organization, and subjected to enterprise governance policies, along with their associated metadata, attributes, definitions, roles, connections and taxonomies.” (Source: David Loshin, Knowledge Integrity; retrieved from BeyeNetwork 9/15/2012)
TermPoint Home Page Log in using your UW NetID in this format https://sharepoint.washington.edu/oim/EIS/dss/metadata/TermPoint/default.aspx
Terms Entry Enter your terms directly here or send us an Excel spreadsheet (in TermPoint format)
Data Custodian Targets for 2012 • “Top 10” terms (submitted to TermPoint) – • with or without definitions, because we need your knowledge of what matters within your areas of expertise. • Technical contacts – in your area • Domain experts who know the data implementations (e.g. databases and file structures), and can assist in technical metadata submission for the important terms. • Input and advice on metadata strategy • Metadata work should follow key UW initiatives. We welcome general comments and ideas on metadata requirements and priorities, from your perspective.
Metadata Team Targets for 2012 • Assist in term harmonization and definition refinement; improve communications • Consolidate glossaries and submitted terms • Search for similar terms across domains and glossaries • Facilitate definition sessions to clarify terms • Document technical implementations • Consolidate technical metadata submitted by domain technical contacts • Facilitate search for appropriate data sources • Improve tooling and procedures for the two activities • Business glossary metadata management • Technical metadata management
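The "search for similar terms across domains and glossaries" target can be approximated with simple string similarity. The sketch below uses Python's standard `difflib.SequenceMatcher`; the function names, the glossary layout (a dict of domain name to term list), and the 0.8 threshold are assumptions for illustration, not a description of TermPoint.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(term: str) -> str:
    """Lowercase, treat hyphens as spaces, and collapse whitespace."""
    return " ".join(term.lower().replace("-", " ").split())


def similar_terms(glossaries: dict, threshold: float = 0.8) -> list:
    """Find term pairs from different domains whose names look alike.

    glossaries maps a domain name to its list of terms (hypothetical layout).
    Returns tuples of (domain_a, term_a, domain_b, term_b, score).
    """
    matches = []
    for (dom_a, terms_a), (dom_b, terms_b) in combinations(glossaries.items(), 2):
        for a in terms_a:
            for b in terms_b:
                score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
                if score >= threshold:
                    matches.append((dom_a, a, dom_b, b, round(score, 2)))
    return matches
```

A match only signals that two names look alike; whether the two domains mean the same thing still has to be settled in a definition session.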
T E R M-inal Madness (Terminology Mixer Exercise) Instructions to Forum Participants
Goals • Talk about terms -- with colleagues from within and beyond your custodial domain; find others that share your area’s terms and related data. • Begin to identify important terms • Terms that must be clearly and consistently interpreted by many (departments, campus users, personnel in your department, etc.): • Terms used for analytic reporting, within or beyond UW • Terms that are important for UW process operations • Begin to prioritize terms for definitional work – within and beyond your business domain.
Step 1: Which terms are relevant to Italian cooking? • IN: garlic, fine wine, pure olive oil, marinara, lemon, capers, pasta, olives, mild herbs • OUT: tortillas, curry, falafels, naan, stir fry, runzas, salsa, chips, sushi
Step 2: Which terms are most important? HIGH MEDIUM LOW mild herbs garlic fine wine pure olive oil marinara olives lemon capers pasta
Steps in the Exercise Step 1. Sort terms as IN/OUT (5 min.) Step 2. Prioritize your in-scope terms (10 min.) • Limit your team’s “High priority” terms to a maximum of 5. • Optional: Add more terms as you wish. Step 3. Take a break, get coffee, think terms! (10 min.) Step 4. Wrap-up discussion (5 min.), with Hi/Med/Low score chart.
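For anyone tallying the exercise electronically, Steps 2 and 4 can be sketched in Python as follows. The function names and the sheet layout (a dict of term to level per team) are hypothetical; only the limit of 5 High terms comes from the instructions.

```python
from collections import Counter, defaultdict

MAX_HIGH = 5  # Step 2: at most 5 "High priority" terms per team
LEVELS = ("HIGH", "MEDIUM", "LOW")


def check_team(priorities: dict) -> list:
    """Return a list of problems with one team's {term: level} sheet."""
    problems = []
    for term, level in priorities.items():
        if level not in LEVELS:
            problems.append(f"{term}: unknown level {level!r}")
    high_count = sum(1 for level in priorities.values() if level == "HIGH")
    if high_count > MAX_HIGH:
        problems.append(f"{high_count} HIGH terms; the limit is {MAX_HIGH}")
    return problems


def score_chart(team_sheets: list) -> dict:
    """Step 4: count how many teams gave each term each priority level."""
    chart = defaultdict(Counter)
    for sheet in team_sheets:
        for term, level in sheet.items():
            chart[term][level] += 1
    return {term: dict(counts) for term, counts in chart.items()}
```

Terms that many teams mark High are natural candidates for the first round of definitional work.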