220 likes | 448 Views
Oracle Data Integrator Data Quality (Integrity Control). 11. Objectives. After completing this lesson, you will:. Know the different types of data quality business rules ODI manages. Be able to enforce data quality with ODI . Understand how to create constraints on datastores.
E N D
Oracle Data IntegratorData Quality (Integrity Control) 11 6-1
Objectives After completing this lesson, you will: • Know the different types of data quality business rules ODI manages. • Be able to enforce data quality with ODI. • Understand how to create constraints on datastores. 6-2
When to Enforce Data Quality? • The IS can be broken into 3 sub-systems • Source application(s) • Data integration process(es) • Target application(s) • Data Quality should be managed in all three sub-systems • ODI provides the solution for enforcing quality in all three. 6-3
Data Quality Business Rules • Defined by designers and business analysts • Stored in the Metadata repository • May be applied to application data • Defined in two ways: • Automatically retrieved with other metadata • Rules defined in the databases • Obtained by reverse-engineering • Manually entered by designers • User-defined rules 6-4
From Business Rules to Constraints • De-duplication rules • Primary Keys • Alternate Keys • Unique Indexes • Reference rules • Simple: column A = column B • Complex: column A = function(column B, column C) • Validation rules • Mandatory Columns • Conditions 6-5
SALES Errors Integration Process Errors Static Control is started - by Interfaces after integration - by Packages - manually Flow Control is started - by Interfaces during execution CORRECTIONS File Source Target ORDERS Error Recycling is performed - by Interfaces LINES Overview of the Data Quality System Static Control is started - Automatically (scheduled) - manually 6-6
Constraints in ODI • Mandatory Columns • Keys • Primary Keys • Alternate Keys • Indexes • References • Simple: column A = column B • Complex: column A = function(column B) • Conditions 6-7
Mandatory Columns • Double-click the column in the Models view. • Select the Control tab. • Check the Mandatory option. • Select when the constraint should be checked (Flow/Static). 6-8
Keys • Select the Constraints node under the datastore. • Right-click, select Insert Key. • Fill in the Name. • Select the Key or Index Type • Go to the Columns tab • Add/remove columns from the key. 6-9
Checking Existing Data with a New Key • Go to the Control tab. • Select whether the key is Defined in the Database, and is Active • Select when the constraint must be checked (Flow/Static). • Click the Check button to perform a synchronous check of the key. Number of duplicate rows 6-10
Creating a Reference • Select the Constraints node under the datastore • Right-click, select Insert Reference • Fill in the Name • Select the reference type • User Reference • Complex Reference • Select a Parent Model and Table • Set the model and table to <undefined> to manually enter the catalog, schema and table name. 6-11
Creating a User Reference • Go to the Columns tab • Click the Add button • Select the column from the Foreign Key table. • Select the corresponding column from the Primary Key table. • Repeat for all column pairs in the reference. 6-12
Creating a Complex Reference • Go to the Expression tab • Set the Alias for the Primary Key table. • Code the Expression • Prefix with the tables aliases • Use the Expression Editor. 6-13
Checking Existing Data with a New Reference • Go to the Control tab. • Choose when the constraint should be checked (Flow/Static). • Click the Check button to immediately check the reference. • Not possible for heterogeneous references. 6-14
Creating a Condition • Right-click Constraints node, select Insert Condition • Fill in the Name. • Select ODI Condition type. • Edit the condition clause • Use the Expression Editor • Type in the error message for the condition. 6-15
Checking Existing Data with a New Condition • Go to the Control tab • Select when the constraint must be checked (Flow/Static). • Click the Check button to perform a synchronous check of the condition. 6-16
How to Enforce Data Quality in an Interface The general process: • Enable Static/Flow Control • Set the options • Select the Constraints to enforce • Table constraints • Not null columns • Review the erroneous records 6-17
How to Enable Static/Flow Control • Go to the interface’s Flow tab. • Select the target datastore. • The IKM properties panel appears. • Set the FLOW_CONTROL and/or STATIC_CONTROL IKM options to “Yes”. • Set the RECYCLE_ERRORS to “Yes”, if you want to recycle errors from previous runs 6-18
How to Set the Options • Select the interface’s Controls tab. • Select a CKM. • Set up the CKM Options. • Set the Maximum Number of Errors Allowed. • Leave blank to allow an unlimited number of errors. • To specify a percentage of the total number of integrated records, check the % option. 6-19
How to Select Which Constraints to Enforce For flow control: • For most constraints: • Select the interface’s Controls tab. • For each constraint you wish to enforce, select Yes. • For Not Null constraints: • Select the interface’s Diagram tab. • Select the Target datastore column that you wish to check for nulls. • In the column properties panel, select Check Not Null. 6-20
How to Review Erroneous Records First, execute your interface. To see the number of records: • Select the Execution tab. • Find the most recent execution. • The No. of Errors encountered by the interface is displayed. To see which records were rejected: • Select the target datastore in the Models view. • Right-click > Control > Errors… • Review the erroneous rows. 6-21