Learn how to measure, correct, and prevent data inaccuracies: improve data models, manage exceptions, and raise the quality of the information flowing through the organization's systems. The deck covers data cleansing techniques, technical and organizational issues, and the data quality steps according to Gartner.
DATA QUALITY: The general method
[Diagram: Data model; measure: non-conform data; correct: corrected data / improved IS; prevent: corrected programs, exceptions management]
MEASURE DATA QUALITY
[Diagram: fact, schema, DB, treatment, extraction system, data acquisition, each marked with a question mark]
The data model is the central point for all actions
MEASURE DATA QUALITY
[Diagram: application A (model A, programs, DB, data) and application B (programs, DB, data), each with its own information system quality and data quality, set against the real world; the organisation: model (A + B + functional links), organisation information system quality]
TO CORRECT
• For the data
  • Concept inadequacy
  • Field segmentation and normalization
  • Field value cleaning
  • Orphan data detection
  • Deduplication of occurrences
• For the information system
  • Data model and application improvements
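Three of the corrections listed above (field value cleaning, deduplication of occurrences, orphan data detection) can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the deck: the helper names, the `customers` and `orders` sample records, and the choice of "first occurrence wins" for deduplication are all assumptions made for the example.

```python
import re

def normalize_name(value):
    """Field value cleaning: collapse whitespace, trim, lowercase."""
    return re.sub(r"\s+", " ", value).strip().lower()

def deduplicate(records, key_fields):
    """Keep the first occurrence of each normalized key (assumed policy)."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(normalize_name(str(rec[f])) for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def find_orphans(children, parents, fk, pk):
    """Return child records whose foreign key matches no parent key."""
    parent_keys = {p[pk] for p in parents}
    return [c for c in children if c[fk] not in parent_keys]

# Hypothetical sample data, for illustration only.
customers = [
    {"id": 1, "name": "Ann  Lee"},
    {"id": 2, "name": "ann lee "},
    {"id": 3, "name": "Bob Ray"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 9},  # no customer with id 9
]

unique_customers = deduplicate(customers, ["name"])
orphan_orders = find_orphans(orders, customers, "customer_id", "id")
```

In practice the deduplication key and the survivorship rule (which duplicate to keep, how to merge fields) depend on the data model, which is why the deck treats the model as the starting point.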
TO PREVENT
The deployment of the data quality process must make it possible:
• To clean up the river bed as a one-off operation
• To dam the inflow of new information of doubtful quality
TO PREVENT
• Objective: to (re)organize the data flows so as to guarantee a given quality level and thus minimize corrective work.
• Principle: data are products coming off a production line, so the quality control principles used in industry should be applied:
  • measure at different points
  • validate against measures taken from the external world
  • …
• Involves the organization (management, administrative processes) as well as the technology
• Resistance from people and from the organisation is important to consider
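The production-line principle above ("measure at different points") can be illustrated with a small conformity check run at several stages of a data flow. This is a hedged sketch: the two rules, the checkpoint names `acquisition` and `after_treatment`, the sample records, and the 0.95 threshold are all hypothetical values chosen for the example.

```python
def conformity_rate(records, rules):
    """Share of records that satisfy every rule (1.0 for an empty batch)."""
    if not records:
        return 1.0
    ok = sum(1 for r in records if all(rule(r) for rule in rules))
    return ok / len(records)

# Hypothetical business rules; real ones would come from the data model.
rules = [
    lambda r: r.get("email") is not None and "@" in r["email"],
    lambda r: isinstance(r.get("age"), int) and 0 <= r["age"] <= 120,
]

# Hypothetical batches sampled at two points of the same flow.
checkpoints = {
    "acquisition": [
        {"email": "a@x.org", "age": 30},
        {"email": "bad", "age": 30},
        {"email": "c@x.org", "age": 400},
        {"email": "d@x.org", "age": 51},
    ],
    "after_treatment": [
        {"email": "a@x.org", "age": 30},
        {"email": "d@x.org", "age": 51},
    ],
}

THRESHOLD = 0.95  # assumed target quality level
report = {name: conformity_rate(batch, rules) for name, batch in checkpoints.items()}
below = [name for name, rate in report.items() if rate < THRESHOLD]
```

Comparing the rates between checkpoints shows where in the flow non-conforming data enters, which is the information needed to decide where to place a dam.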
TO PREVENT
• Technical issues
  • Program correction
  • Data dictionary consolidation (complete metadata)
  • DB re-engineering
• Organizational issues
  • Identification of the processes and data flows
  • Identification of the critical points and the responsibilities
  • User training
  • Organizational restructuring: flows
SYNTHESIS (measure, correct, prevent)
The data quality steps according to Gartner: data profiling, standardisation, deduplication, cleaning, enrichment, follow-up
The added value of the proposed approach: rules definition, data merge, programs correction, exceptions management, data profiling, model evolution, data dictionary, reverse-engineering, concepts precision, « orphan » data detection, logical data extraction
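Data profiling, the first of the steps listed above, usually starts with simple per-column statistics. The following is a minimal sketch under stated assumptions, not a description of any particular profiling tool: the function names, the letter/digit pattern encoding (`A` for letters, `9` for digits), and the `postcodes` sample are hypothetical.

```python
from collections import Counter
import re

def pattern_of(value):
    """Abstract a value into a shape: letters become 'A', digits become '9'."""
    shaped = re.sub(r"[A-Za-z]", "A", str(value))
    return re.sub(r"[0-9]", "9", shaped)

def profile_column(values):
    """Basic profile: row count, missing values, distinct values, value shapes."""
    non_null = [v for v in values if v not in (None, "")]
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "distinct": len(set(non_null)),
        "patterns": Counter(pattern_of(v) for v in non_null).most_common(),
    }

# Hypothetical column extracted from a database.
postcodes = ["1000", "5000", "B-1000", None, "5000", ""]
profile = profile_column(postcodes)
```

A rare pattern such as `A-9999` next to a dominant `9999` is the kind of finding that feeds rules definition and standardisation.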
SYNTHESIS: What needs to be done (measure, correct, prevent)
• Data profiling
• Reverse-engineering
• To specify and complete the concepts
• To specify and complete the rules
• To correct the data
• To manage the data dictionary
• To correct the programs
• To manage the exceptions