DATA QUALITY: The general method
• Measure (against the data model) → non-conform data
• Correct → corrected data / improved IS
• Prevent → corrected programs, exceptions management
MEASURE DATA QUALITY
[Figure: fact, schema, DB, treatment, extraction system and data acquisition, each annotated with question marks]
The data model is the central point for all actions
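As a rough illustration of measuring against the data model, the sketch below expresses a few model constraints as rules and separates out the non-conform records. The field names (`customer_id`, `country`), the rules themselves and the `measure` function are hypothetical and do not come from the slides.

```python
from typing import Callable

# Hypothetical rules derived from a data model: field name -> validity check.
RULES: dict[str, Callable[[object], bool]] = {
    "customer_id": lambda v: isinstance(v, int) and v > 0,
    "country": lambda v: isinstance(v, str) and len(v) == 2 and v.isupper(),
}


def measure(records):
    """Return the conformance rate and the list of non-conform records.

    Each non-conform entry is (record, names of the fields that broke a rule).
    """
    non_conform = []
    for rec in records:
        broken = [field for field, ok in RULES.items() if not ok(rec.get(field))]
        if broken:
            non_conform.append((rec, broken))
    rate = 1 - len(non_conform) / len(records) if records else 1.0
    return rate, non_conform


sample = [
    {"customer_id": 1, "country": "BE"},
    {"customer_id": -4, "country": "BE"},       # breaks the customer_id rule
    {"customer_id": 3, "country": "belgium"},   # breaks the country rule
    {"customer_id": 4, "country": "FR"},
]
rate, non_conform = measure(sample)
print(f"conformance rate: {rate:.0%}")
for rec, broken in non_conform:
    print(rec, "->", broken)
```

Keeping the rules in one table next to the model means the same definitions can feed both the measurement and the later correction steps.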
MEASURE DATA QUALITY
[Figure: application A (model A, programs, DB, data) and application B (programs, DB, data), each with its own information system quality and data quality; the real world; the organisation model (A + B + functional links); the organisation's information system quality]
TO CORRECT
• For the data
  • Concept inadequacy
  • Field segmentation and normalization
  • Field value cleaning
  • Orphan data detection
  • Occurrence deduplication
• For the information system
  • Data model and application improvements
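Three of the data corrections listed above (field value cleaning, occurrence deduplication and orphan data detection) can be sketched in a few lines. This is a minimal illustration only: the tables, field names and the cleaning rule (trim, collapse whitespace, lower-case) are assumptions, not the method used in the presentation.

```python
def clean(value: str) -> str:
    """Trim, collapse internal whitespace and lower-case a field value."""
    return " ".join(value.split()).lower()


def deduplicate(rows, key_fields):
    """Keep the first occurrence of each record, compared on cleaned key fields."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(clean(row[f]) for f in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def find_orphans(children, parents, fk, pk):
    """Return child rows whose foreign key matches no parent primary key."""
    parent_keys = {p[pk] for p in parents}
    return [c for c in children if c[fk] not in parent_keys]


# Hypothetical sample data.
customers = [
    {"id": 1, "name": "Ann  Smith", "city": "Namur"},
    {"id": 2, "name": " ann smith", "city": "NAMUR"},  # same person as id 1
    {"id": 3, "name": "Bob Jones", "city": "Liege"},
]
orders = [
    {"order": 10, "customer_id": 1},
    {"order": 11, "customer_id": 9},  # no customer with id 9
]

unique_customers = deduplicate(customers, ["name", "city"])
orphans = find_orphans(orders, customers, "customer_id", "id")
print([c["id"] for c in unique_customers])
print(orphans)
```

In practice the matching rule for deduplication is usually fuzzier than exact equality after cleaning; the exact-match version is used here only to keep the sketch short.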
TO PREVENT
The deployment of the data quality process must make it possible:
• To clean up the bottom of the river as a one-off operation
• To dam up the arrival of new information flows of doubtful quality
TO PREVENT
• Objective: to (re)organize the data flows in order to guarantee a given quality level, and so minimize the corrective process.
• Principle: data are products coming from a production line. For this reason, one should apply the quality control principles used in industry.
  • Measure at different points
  • Validate against measures taken from the external world
  • …
• This involves the organization (management, administrative processes) as well as the technology
• Resistance from people and from the organisation is important to consider
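The production-line principle above, measuring at different points of the flow, could look like the sketch below: quality is measured after each stage and compared with a target level. The stage names, the postal-code validity check and the 0.9 threshold are all hypothetical choices made for the example.

```python
def conformance(records, is_valid):
    """Share of records that pass the validity check (1.0 for an empty batch)."""
    if not records:
        return 1.0
    return sum(1 for r in records if is_valid(r)) / len(records)


def run_flow(records, stages, is_valid, threshold):
    """Run each stage in order and measure quality right after it.

    Returns a list of (stage name, conformance rate, meets threshold).
    """
    report = []
    for name, stage in stages:
        records = stage(records)
        rate = conformance(records, is_valid)
        report.append((name, rate, rate >= threshold))
    return report


def is_postal_code(value):
    """Hypothetical rule: a valid postal code is exactly four digits."""
    return len(value) == 4 and value.isdigit()


# Hypothetical two-stage flow: raw acquisition, then whitespace trimming.
stages = [
    ("acquisition", lambda recs: list(recs)),
    ("trimming", lambda recs: [r.strip() for r in recs]),
]

report = run_flow([" 1000", "5000 ", "50A0", "4000"], stages, is_postal_code, 0.9)
for name, rate, ok in report:
    print(f"{name}: {rate:.0%} {'OK' if ok else 'below target'}")
```

Measuring after every stage shows where in the flow the quality is lost or recovered, which is what makes it possible to place the "dam" at the right spot.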
TO PREVENT
• Technical issues
  • Program correction
  • Data dictionary consolidation (complete metadata)
  • DB re-engineering
• Organizational issues
  • Identification of the processes and data flows
  • Identification of the critical points and the responsibilities
  • User training
  • Organizational restructuring: flow
SYNTHESIS: measure, correct, prevent
The data quality steps according to Gartner: data profiling, standardisation, deduplication, cleaning, enrichment, follow-up
The added value of the proposed approach:
• Rules definition
• Data merge
• Programs correction
• Exceptions management
• Data profiling
• Model evolution
• Data dictionary
• Reverse-engineering
• Concepts precision
• "Orphan" data detection
• Logical data extraction
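Data profiling, the first step in the list above, usually starts with simple per-column statistics. The sketch below counts missing values, distinct values and the most frequent value for each column; the column names, the sample rows and the choice of statistics are assumptions made for illustration.

```python
from collections import Counter


def profile(rows, columns):
    """Compute basic profiling statistics for each column.

    None and the empty string are both treated as missing values.
    """
    result = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        counts = Counter(present)
        result[col] = {
            "missing": len(values) - len(present),
            "distinct": len(counts),
            "top": counts.most_common(1)[0] if counts else None,
        }
    return result


# Hypothetical sample rows.
rows = [
    {"city": "Namur", "zip": "5000"},
    {"city": "Namur", "zip": ""},
    {"city": "Liege", "zip": "4000"},
    {"city": None, "zip": "4000"},
]
stats = profile(rows, ["city", "zip"])
for col, s in stats.items():
    print(col, s)
```

Statistics like these only describe the data as it is; deciding which values are actually wrong still requires the rules and concepts taken from the data model.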
Synthesis (measure, correct, prevent): what needs to be done
[Figure: steps numbered 1 to 4]
For the data:
• Data profiling
• Reverse-engineering
• To specify and complete the concepts
• To specify and complete the rules
• To correct the data
• To manage the data dictionary
For the programs:
• Reverse-engineering
• To specify and complete the concepts
• To specify and complete the rules
• To correct the programs
• To manage the exceptions