Explore the importance of data governance maturity in business operations, presented by Ron Daniel, Jr. Learn about metadata, taxonomy, and data governance practices for better information management. Gain insights through self-assessment tools and industry comparisons.
Data Governance Maturity: When the business depends on clear description of fuzzy objects. Presented to San Francisco DAMA, Sept. 10, 2008. Ron Daniel, Jr.
Bio: Ron Daniel, Jr. • Over 15 years in the business of metadata & automatic classification • Principal, Taxonomy Strategies • Standards Architect, Interwoven • Senior Information Scientist, Metacode Technologies (acquired by Interwoven, November 2000) • Technical Staff Member, Los Alamos National Laboratory • Metadata and taxonomies community leadership. • Chair, PRISM (Publishers Requirements for Industry Standard Metadata) working group • Acting chair, XML Linking working group • Member, RDF working groups • Co-editor, PRISM, XPointer, 3 IETF RFCs, and Dublin Core 1 & 2 reports.
Recent & current projects: http://www.taxonomystrategies.com/html/clients.htm Government Commercial Not-for-Profit
Goals for this talk • Provide you with background on maturity models. • Provide the results of our surveys of Search, Metadata, & Taxonomy practices and discuss interesting findings. • Review the practices in use at stock photo houses, and compare them to methods that may be used in typical information management projects. • Give you the tools to do a simple self-assessment of your organization’s metadata maturity
Agenda 9:15 Metadata Definitions 9:30 Maturity Models 9:45 Metadata Maturity Model (ca. 2006) 10:15 Break 10:30 Stock Photo Business 10:40 Data Governance Practices in Stock Photo Agencies 11:40 Summary 11:45 Questions 12:00 Adjourn
Taxonomy and metadata definitions Metadata • “Data about data”. • Different communities have very different assumptions about the types of data being described. • I’m from the Information Science community, not the database, statistics, or massive storage communities. Taxonomy • The classification of organisms in an ordered system that indicates natural relationships. • The science, laws, or principles of classification; systematics. • Division into ordered groups, categories, or hierarchies.
Examples of taxonomy used to populate metadata fields
Metadata fields: Title • Author • Department • Audience • Topic
Metadata values (facets within the overall taxonomy):
Audience: Internal (Executives, Managers) • External (Suppliers, Customers, Partners)
Topics: Employee Services (Compensation, Retirement, Insurance, Further Education) • Finance and Budget • Products and Services • Support Services (Infrastructure, Supplies)
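To make the relationship between fields and facets concrete, here is a minimal Python sketch of a taxonomy being used to check the controlled fields of one metadata record. The facet terms come from the slide above; the nesting of terms, the function names, and the sample record are assumptions made for illustration, not part of the original talk.

```python
# Facet terms taken from the slide. The parent/child nesting
# (e.g. Executives under Internal) is an assumption about the slide layout.
TAXONOMY = {
    "Audience": {
        "Internal": ["Executives", "Managers"],
        "External": ["Suppliers", "Customers", "Partners"],
    },
    "Topic": {
        "Employee Services": [
            "Compensation", "Retirement", "Insurance", "Further Education",
        ],
        "Finance and Budget": [],
        "Products and Services": [],
        "Support Services": ["Infrastructure", "Supplies"],
    },
}


def allowed_values(facet):
    """Return every term in a facet: top-level terms plus their children."""
    terms = set()
    for parent, children in TAXONOMY[facet].items():
        terms.add(parent)
        terms.update(children)
    return terms


def invalid_fields(record):
    """List the controlled fields whose value is not a term in the taxonomy."""
    return [
        field for field in TAXONOMY
        if record.get(field) not in allowed_values(field)
    ]


# Hypothetical record: Title, Author, and Department are free text,
# while Audience and Topic are populated from the taxonomy.
record = {
    "Title": "Retirement Plan Overview",
    "Author": "J. Smith",
    "Department": "Human Resources",
    "Audience": "Managers",
    "Topic": "Retirement",
}

print(invalid_fields(record))  # → []
```

Only the Audience and Topic fields are controlled here; a record with an unknown audience such as "Robots" would come back as `["Audience"]`.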
Example faceted taxonomy: ABC Computers.com
Content Type: Award • Case Study • Contract & Warranty • Demo • Magazine • News & Event • Product Information • Services • Solution • Specification • Technical Note • Tool • Training • White Paper • Other Content Type
Competency: Business & Finance • Interpersonal Development • IT Professionals Technical Training • IT Professionals Training & Certification • PC Productivity • Personal Computing Proficiency
Industry: Banking & Finance • Communications • E-Business • Education • Government • Healthcare • Hospitality • Manufacturing • Petrochemicals • Retail / Wholesale • Technology • Transportation • Other Industries
Service: Assessment, Design & Implementation • Deployment • Enterprise Support • Client Support • Managed Lifecycle • Asset Recovery & Recycling • Training
Product Family: Desktops • MP3 Players • Monitors • Networking • Notebooks • Printers • Projectors • Servers • Services • Storage • Televisions • Non-ABC Brands
Audience: All • Business • ABC Employee • Education • Gaming Enthusiast • Home • Investor • Job Seeker • Media • Partner • Shopper • First Time • Experienced • Advanced • Supplier
Line of Business: All • Home & Home Office • Gaming • Government, Education & Healthcare • Medium & Large Business • Small Business
Region-Country: All • Asia-Pacific • Canada • ABC EMEA • Japan • Latin America & Caribbean • United States
Pop Quiz On a blank piece of paper: • What question(s) did you want to have answered by coming to today’s talks? Flag one question to be discussed later. You do NOT have to provide your name. Please DO provide your job title, division, and either company name or company type.
What do other people ask about? • How to build a taxonomy? • Definitions of terms. • How to govern its use and maintenance? • What’s the ROI? • What are they for? • How do we put them to use? • How do we link them to content? • How do they help search? • How do I sell management on a taxonomy project? • How do we maintain them? and many more…
Organizational benchmarking • A common goal of organizations is to ‘benchmark’ themselves against other organizations. • Different organizations have: • Different levels of sophistication in their planning, execution, and follow-up for CMS, Search, Portal, Metadata, and Taxonomy projects. • Different reasons for pursuing Search, Metadata, and Taxonomy efforts • Different cultures • Benchmarks should be made against similar organizations.
Is unnecessary capability harmful? • Tool vendors continue to provide ever-more capable tools with ever-more sophisticated features. • But we live in a world where a significant fraction of public, commercial web pages don’t have a <title> tag. • Organizations that can’t manage <title> tags stand a very poor chance of putting an entity extractor to use, which requires some ongoing management of the lists of entities to be extracted. • Organizations that can’t create and maintain clean metadata can’t put a faceted search UI to good use. • Unused capability is poor value-for-money. • Organizations over-spend on tools and under-spend on staff & processes.
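The <title> check mentioned above is easy to automate. The following is a minimal sketch using Python's standard `html.parser` module; the class and function names are hypothetical, and a real audit would also need to fetch the pages and decide how to treat boilerplate titles such as "Untitled".

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collect the text found inside <title> elements."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def has_usable_title(html):
    """Return True if the page has a <title> with non-blank text."""
    finder = TitleFinder()
    finder.feed(html)
    finder.close()
    return bool(finder.title.strip())


print(has_usable_title("<html><head><title>Pricing</title></head></html>"))  # → True
print(has_usable_title("<html><head></head><body>Hello</body></html>"))       # → False
```

Running a check like this over a site's pages gives a quick baseline for basic metadata hygiene before any money is spent on more sophisticated tools.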
Towards better benchmarking… • Wanted a method to: • Generally identify good and bad practices. • Help clients identify the things they can do, and the things that stand an excellent chance of failing. • Predict likely sources of problems in engagements. • We have started to develop a Metadata Maturity Model, inspired by Maturity Models from the software industry. • To keep the model tied to reality, we are conducting surveys to determine the actual state of practice around search, metadata, taxonomy, and supporting business functions such as staffing and project management.
A Tale of Two Software Maturity Models • CMMI (Capability Maturity Model Integration) • vs. • The Joel Test
CMMI structure • Maturity Models are collections of Practices. • Main differences in Maturity Models concern: • Descriptivist or Prescriptivist Purpose • Degree of Categorization of Practices • Number of Practices (~400 in CMMI) Source: http://chrguibert.free.fr/cmmi
22 Process Areas, keyed to 5 Maturity Levels… • Process Areas contain Specific and Generic Practices, organized by Goals and Features, and arranged into Levels • Process Areas cover a broad range of practices beyond simple software development • CMMI Axioms: • Individual processes at higher levels are AT RISK from supporting processes at lower levels. • A Maturity Level is not achieved until ALL the Practices in that level are in operation.
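The second axiom can be read as a simple rule: an organization's level is the highest level for which every practice at that level, and at every level below it, is in operation. Here is a minimal Python sketch of that rule. The practice names are placeholders rather than real CMMI practices, and treating Level 1 as the baseline with no required practices is an assumption of this sketch.

```python
def maturity_level(practices_in_operation, practices_by_level):
    """Return the highest level whose practices, and those of every
    lower level, are ALL in operation. Level 1 is the baseline."""
    achieved = 1
    for level in sorted(practices_by_level):
        if practices_by_level[level] <= practices_in_operation:
            achieved = level
        else:
            # One missing practice blocks this level and all higher ones.
            break
    return achieved


# Placeholder practice names, for illustration only.
LEVELS = {
    2: {"practice-2a", "practice-2b"},
    3: {"practice-3a"},
    4: {"practice-4a"},
    5: {"practice-5a"},
}

# Level 4 work is under way, but a Level 3 practice is missing.
in_operation = {"practice-2a", "practice-2b", "practice-4a"}
print(maturity_level(in_operation, LEVELS))  # → 2
```

The example also illustrates the first axiom: the Level 4 practice earns no credit because the Level 3 practice it depends on is not in place.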
CMMI Positives • Independent audits of an organization’s level of maturity are a common service • Level 3 certification frequently required in bids • “…compared with an average Level 2 program, Level 3 programs have 3.6 times fewer latent defects, Level 4 programs have 14.5 times fewer latent defects, and Level 5 programs have 16.8 times fewer latent defects”. • Michael Diaz and Jeff King – “How CMM Impacts Quality, Productivity, Rework, and the Bottom Line” • ‘If you find yourself involved in product liability litigation you're going to hear terms like "prevailing standard of care" and "what a reasonable member of your profession would have done". Considering the fact that well over a thousand companies world-wide have achieved level 3 or above, and the body of knowledge about the CMM is readily available, you might have some explaining to do if you claim ignorance’. Linda Zarate in a review of A Guide to the CMM: Understanding the Capability Maturity Model for Software by Kenneth M. Dymond
CMMI Negatives • Complexity and Expense • Reading and understanding the materials • Putting it into action – identifying processes, mapping processes to model, gathering required data, … • Audits are expensive • CMMI does not scale down well to small shops • Has been accused of restraint of trade
At the other extreme, The Joel Test • Developed by Joel Spolsky as a reaction to CMMI complexity • Positives: Quick, easy, and inexpensive to use. • Negatives: Doesn’t scale up well: • Not a good way to assure the quality of nuclear reactor software. • Not suitable for scaring away liability lawyers. • Not a longer-term improvement plan. The Joel Test: 1. Do you use source control? 2. Can you make a build in one step? 3. Do you make daily builds? 4. Do you have a bug database? 5. Do you fix bugs before writing new code? 6. Do you have an up-to-date schedule? 7. Do you have a spec? 8. Do programmers have quiet working conditions? 9. Do you use the best tools money can buy? 10. Do you have testers? 11. Do new candidates write code during their interview? 12. Do you do hallway usability testing? Scoring: 1 point for each ‘yes’. Scores below 10 indicate serious trouble.
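The scoring rule on this slide is simple enough to write down directly. The sketch below is one way to do it in Python; the questions and the threshold of 10 come from the slide, while the function names and the sample answers are made up for illustration.

```python
JOEL_TEST = [
    "Do you use source control?",
    "Can you make a build in one step?",
    "Do you make daily builds?",
    "Do you have a bug database?",
    "Do you fix bugs before writing new code?",
    "Do you have an up-to-date schedule?",
    "Do you have a spec?",
    "Do programmers have quiet working conditions?",
    "Do you use the best tools money can buy?",
    "Do you have testers?",
    "Do new candidates write code during their interview?",
    "Do you do hallway usability testing?",
]

TROUBLE_THRESHOLD = 10  # scores below this indicate serious trouble


def joel_score(answers):
    """One point for each 'yes'; answers is a list of 12 booleans."""
    if len(answers) != len(JOEL_TEST):
        raise ValueError("expected one answer per question")
    return sum(1 for answer in answers if answer)


def in_serious_trouble(score):
    return score < TROUBLE_THRESHOLD


# Hypothetical team: 'yes' to the first nine questions, 'no' to the rest.
answers = [True] * 9 + [False] * 3
score = joel_score(answers)
print(score, in_serious_trouble(score))  # → 9 True
```

The whole assessment fits in a few lines, which is the point of the contrast with CMMI's roughly 400 practices.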
What does software development “Maturity” really mean? • A low score on a maturity audit DOES NOT mean that an organization can’t develop good software • It DOES mean that whether the organization will do a good job depends on the specific mix of people assigned to the project • In other words, maturity sets a floor for how badly an organization is likely to do, not a ceiling on how well it can do • Probability of failure is a good thing to know before spending a lot of time and money
Towards a Metadata Maturity Model • Caveats: • Maturity is not a goal, it is a characterization of an organization’s methods for achieving its core goals. • Mature processes impose expenses which must be justified by consequent cost savings, revenue gains, or service improvements. • Nevertheless, Maturity Models are useful as collections of best practices and stages in which to try to adopt them.
Basis for initial maturity model • CEN study on commercial adoption of Dublin Core • Small-scale phone survey • Organizations which have world-class search and metadata externally • Not necessarily the most mature overall processes or the best internal search and metadata • Literature review • Client experiences • Structure from software maturity models
Initial Metadata Maturity Model (ca. May, 2005) 37 Practices, Categorized by Area, Level, and Importance
Shortcomings of the initial model • No idea of how it corresponds to actual practice across multiple organizations • Some indications that it over-emphasized the sophisticated practices and under-emphasized beginning practices. • The initial metadata maturity model can be regarded as a hypothesis about how an organization progresses through various practices as it matures • How to test it? Let’s ask! • Two surveys to date • Surveys are being run in stages because of large number of practices. • Ask about future, current, and former practices to gather information on progression
Survey 1: Search, Metadata, & Taxonomy Practices • The data in this section comes from a survey conducted in the autumn of 2005.
Metadata Practices These two questions were the only ones with much correlation to organization size
Survey 2: Business Drivers, Processes, and Staffing • The data in this section comes from a survey conducted in the spring of 2006.
Business Drivers: Search, Metadata, and Taxonomy (SMT) Applications
Business Drivers: Desired Benefits Other desired benefits:
Processes (chart callouts) • Use of search logs is improving • Surprisingly sophisticated • Basic data quality and communications need improvement • Many solo operators
Notes from Participants • There is the constant struggle with individual [magazine] titles to hire trained librarians or data specialists instead of trying to save money by hiring an editor who can build articles AND create and assign metadata. This is a governance issue we have been struggling with since we have no monetary stake in the individual publications. We make recommendations, but have no higher level authority to require titles to hire trained staff for metadata. • Reporting metrics have become a new area of confusion as we move to portalized pages consisting of objects in portlets, each with their own metadata. • Key organizational issue is that the "problems" that stem from lack of systematic metadata/taxonomy creation are not "owned" by anyone, and consequently have no budget for their solution.