Quality Data – An Improbable Dream?
Elizabeth Vannan
Centre for Education Information
Victoria, BC, Canada

Information quality is a journey, not a destination - Larry P. English

Agenda
• Data Definitions and Standards Project
• What is Quality Data?
• The Cost of Poor-Quality Data
• Improving Data Quality – Our Process
• Questions?

BC Higher Education
• Canada’s westernmost province
• Population: 4.023 million
• Land area: 366,795 sq miles
• Publicly funded post-secondary system
  • 22 colleges
  • 6 universities

CEISS
The Centre for Education Information is an independent organization that provides research and technology services to improve the performance of the BC education system.

CEISS
• Implement and manage administrative systems
• Perform custom surveys, research and analysis
• Facilitate development and implementation of data standards
• Negotiate and manage province-wide software contracts (Oracle, SCT Banner, Datatel)

DDEF Project: The Problem
• Better data about the BC higher education sector needed for decision-making
• No infrastructure in place to facilitate the collection of data electronically
Data Definitions and Standards Project initiated in 1995

DDEF Project: The Solution
• Create data standards for all higher education information (Student, HR, Finance)
• Develop a data warehouse based on standards for reporting
• Implement a common technical infrastructure at all higher education institutions

DDEF Project: Project Goals
• Improve the quantity and QUALITY of data available
• Reduce the number of data and reporting requests
• Develop a business information system to support the management and evaluation of the BC post-secondary system

How Are We Doing?
• 16 institutions implemented/implementing
• Institutions using data warehouses for internal reporting
• Data requests reduced
• Ministry using data

Why Focus on Data Quality?
• Poor data quality in our data warehouse impacts:
  • Confidence
  • Decision making
  • Funding

Quality Data Are…
The Four Attributes of Data Quality

Quality Data Are…
• Accurate
  • Free from errors
  • Representative

Quality Data Are…
• Complete
  • All values are present

Quality Data Are…
• Timely
  • Recorded immediately
  • Available when required

Quality Data Are…
• Flexible
  • Data definitions understood
  • Can be used for multiple purposes

Quality Data…
• Don’t have to be perfect
• Good enough to fill the business need at a price you’re willing to pay
Our challenge: defining quality criteria for higher education data

Cost of Poor-Quality Data
• Business Process Costs
  • Incorrect registrations
  • Inaccurate tuition billings
  • Payroll errors

Cost of Poor-Quality Data
• Rework
  • Re-collect data
  • Correct errors
  • Data verification

Cost of Poor-Quality Data
• Missed Opportunities
  • Substandard customer service
  • Poor decision making
  • Loss of reputation

Improving Data Quality
• Business Process Review
• Data Quality Assessment
• Data Cleansing
• Business Practice Change
Result: Improved Data Quality

Business Process Review
• When, where, and how is data collected?
• Where is data stored?
• Who creates data?
• Who uses data?
• What outputs are required?
• What quality checks already exist?

Business Process Review
• Involve all stakeholders!
• For student data we involve:
  • Executive
  • Registrar’s office
  • IT department
  • Institutional research

Business Process Review
• Results
  • Understanding of business practices
  • Identification of data creators, custodians, users
  • Preliminary quality metrics
  • Problem business practices

Data Quality Assessment
• Establish metrics
• Apply metrics to data
• Review results

Establish Metrics
• For each element determine quality criteria
  • Acceptable range of values
  • Acceptable syntax
  • Comparison to known values
  • Business rules
  • Thresholds

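Three of the criteria types listed above (acceptable range, comparison to known values, business rules) can be sketched as per-record checks. This is a minimal Python illustration only: the field names, the campus code set, and the credit limits are hypothetical and do not come from the BC data standards.

```python
from datetime import date

# Hypothetical code set and limits; real values would come from the data standards.
KNOWN_CAMPUS_CODES = {"MAIN", "NORTH"}
MIN_CREDITS, MAX_CREDITS = 0.0, 30.0


def check_record(rec):
    """Return the list of criteria this record fails (an empty list means it passes)."""
    failures = []
    # Acceptable range of values
    if not (MIN_CREDITS <= rec["credits"] <= MAX_CREDITS):
        failures.append("credits out of range")
    # Comparison to known values
    if rec["campus_code"] not in KNOWN_CAMPUS_CODES:
        failures.append("unknown campus code")
    # Business rule: a course cannot end before it starts
    if rec["end_date"] < rec["start_date"]:
        failures.append("end date before start date")
    return failures


good = {"credits": 3.0, "campus_code": "MAIN",
        "start_date": date(2001, 9, 4), "end_date": date(2001, 12, 14)}
bad = {"credits": 45.0, "campus_code": "XX",
       "start_date": date(2001, 9, 4), "end_date": date(2001, 1, 1)}

print(check_record(good))  # []
print(check_record(bad))
```

Returning the names of the failed criteria, rather than a single pass/fail flag, keeps enough detail to trace a problem back to the business practice that caused it.
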
Quality Metrics
(slide table not preserved)

Applying Metrics
• Collect known information for comparison
• Develop queries to test each of your validation criteria
• We use Oracle Discoverer, but other tools exist (MS Access, SQL)

Applying Metrics: Test 1
PEN must be exactly 9 digits long. Non-numeric characters and shorter values are not acceptable.

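The presentation runs this test as a query in Oracle Discoverer. As a rough equivalent, here is a minimal Python sketch of the same syntax rule, assuming PENs are stored as strings. The `(student_id, pen)` record layout and the sample values are invented for illustration.

```python
import re

PEN_PATTERN = re.compile(r"[0-9]{9}")  # exactly nine ASCII digits


def invalid_pens(students):
    """Return the (student_id, pen) pairs whose PEN is not exactly nine digits."""
    return [(sid, pen) for sid, pen in students
            if pen is None or PEN_PATTERN.fullmatch(pen) is None]


# Made-up sample records
students = [
    ("S001", "123456789"),   # valid
    ("S002", "12345678"),    # too short
    ("S003", "12345678A"),   # contains a letter
]

print(invalid_pens(students))  # [('S002', '12345678'), ('S003', '12345678A')]
```

Because the function returns the student identifiers along with the bad values, the failing records can be handed directly to data cleansing.
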
Test 1 Results
Two student records contain invalid PENs

Test 1 Results
Invalid PENs: data entry error?
Can identify specific students for data cleansing

Applying Metrics: Test 2
At least 80% of student records must have a valid PEN

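A threshold test like this one measures the share of valid records rather than listing individual failures. The Python sketch below is one way to express it, assuming "valid" means the nine-digit rule from Test 1. The function names and sample data are hypothetical.

```python
import re


def valid_pen_rate(pens):
    """Share of records whose PEN is exactly nine digits (0.0 for an empty list)."""
    if not pens:
        return 0.0
    valid = sum(1 for p in pens
                if p is not None and re.fullmatch(r"[0-9]{9}", p))
    return valid / len(pens)


def meets_threshold(pens, threshold=0.80):
    """True when the share of valid PENs reaches the quality threshold."""
    return valid_pen_rate(pens) >= threshold


# Made-up sample: four valid PENs and one missing value
pens = ["123456789", "987654321", "111222333", "444555666", None]

print(valid_pen_rate(pens))    # 0.8
print(meets_threshold(pens))   # True
```
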
Test 2 Results
This institution meets the quality threshold

Applying Metrics: Test 3
No duplicate PENs

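A duplicate check groups records by PEN and reports any value shared by more than one student. The Python sketch below shows the idea under the same assumed `(student_id, pen)` layout. It is an illustration, not the query used in the presentation.

```python
from collections import Counter


def duplicate_pens(students):
    """Map each PEN that appears more than once to the student IDs sharing it."""
    counts = Counter(pen for _, pen in students)
    dupes = {}
    for sid, pen in students:
        if counts[pen] > 1:
            dupes.setdefault(pen, []).append(sid)
    return dupes


# Made-up sample records
students = [
    ("S001", "123456789"),
    ("S002", "987654321"),
    ("S003", "123456789"),   # same PEN as S001
]

print(duplicate_pens(students))  # {'123456789': ['S001', 'S003']}
```

Listing the student IDs behind each duplicate gives the detail needed to tell a data entry error from a systematic loading problem.
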
Test 3 Results
This institution has a BIG problem!
Can we see more details?

Test 3 Results
Additional information reveals data loading problems

Reviewing Results
• Systematic approach needed
• Develop strategy for data cleaning
• Identify source of data problems
Deal with disparate data shock!

Reviewing Results
• Insert a quality review checklist

Data Cleansing
• Location
  • Administrative system?
  • Staging area?
• Who
• Scope

Typical Data Cleansing
• Correcting data entry errors
• Removing or correcting nonsensical dates
• Deleting “garbage” records
• Combining or deleting duplicates
• Updating and applying code sets

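Two of the cleansing tasks listed above, handling nonsensical dates and deleting duplicates, might look like this in Python. The field names, the 1900 cutoff, and the choice to blank a bad date rather than correct it are assumptions made for illustration. Real rules depend on the institution's business practices.

```python
from datetime import date


def clean(records, today, earliest=date(1900, 1, 1)):
    """Blank out implausible birth dates, then drop exact duplicate records."""
    seen = set()
    cleaned = []
    for rec in records:
        birth = rec["birth_date"]
        # A birth date in the future or before the cutoff is treated as nonsensical.
        if birth is not None and not (earliest <= birth <= today):
            birth = None
        key = (rec["student_id"], birth)
        if key in seen:
            continue  # exact duplicate of a record already kept
        seen.add(key)
        cleaned.append({"student_id": rec["student_id"], "birth_date": birth})
    return cleaned


# Made-up sample records
records = [
    {"student_id": "S001", "birth_date": date(1980, 5, 1)},
    {"student_id": "S001", "birth_date": date(1980, 5, 1)},   # duplicate
    {"student_id": "S002", "birth_date": date(2999, 1, 1)},   # nonsensical date
]

for row in clean(records, today=date(2002, 1, 1)):
    print(row)
```

The function returns a new list and leaves the input untouched, which suits cleansing in a staging area. Cleansing in the administrative system itself would update the source records instead.
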
Business Practice Change
• Two components
  • Implementing changes to improve data quality
  • Adopting an ongoing data quality review process
Changing business practices is a challenge: get stakeholder support

Business Practice Change
• Education
• Centralizing responsibility for codes
• Consolidating data collection
• Implementing validation routines
• Changing business processes

Quality Review Process
• Review data regularly
• Make someone responsible
• Establish procedures for correcting data problems
• Communicate quality improvements

Some Changes in BC
• Creation of a Data Manager position, responsible for code sets and data quality
• Regular education for registration clerks and other data creators
• Established relationships between data creators and users
• Re-engineered administrative systems

Improvements to BC Data
• Improved data quality and quantity
• Nonsensical dates almost eliminated
• Completeness of key elements improved (from 50% to 80-90%)
• Data now being collected for CE in standard format

Final Thoughts…
• Quality data are probable if you are willing to…
  • Take a critical look at your existing data
  • Implement changes to how you collect and manage data
  • Invest the time to educate and communicate with data users and creators
  • Make data quality improvement an ongoing process

Recommended Reading
• Brackett, Michael H., Data Resource Quality: Turning Bad Habits into Good Practices (New York: Addison-Wesley, 2000)
• English, Larry P., Improving Data Warehouse and Business Information Quality (New York: John Wiley and Sons, 1999)
• Redman, Thomas C., Data Quality for the Information Age (Boston: Artech House, Inc., 1996)

Thank You!
Presentation available at www.ceiss.org or from evannan@ceiss.org