Oracle Enterprise Data Quality Introduction to Parsing
What is Parsing?
• “The application of business rules and semantic intelligence to data in order to understand and validate it en masse and, if required, improve its structure in order to make it fit for purpose.”
• Commonly used to structure and prepare data before matching.
Typical Business Problems (1)
• Data extraction
• E.g. who do I sell to?
• Extract all customer names into a single attribute.
• Communicate with accuracy.
• Avoid sending inappropriate communications, e.g. to deceased customers.
• These risk bad public relations and possible legal issues.
Typical Business Problems (2)
• Data migration:
• Single name field to structured columns.
• Data is now structured.
[Figure: Original system, New system]
Typical Business Problems (3)
• Clean data.
• E.g. for better matching.
• Remove or move inappropriate data.
• Standardize abbreviations.
• Remove or move personal names hidden in company names.
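The cleaning steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Enterprise Data Quality implements it: the `ABBREVIATIONS` lookup, the `CARE_OF` pattern, and the `clean_company_name` function are all hypothetical names invented for this example.

```python
import re

# Hypothetical lookup of abbreviation -> standard form (assumption, not EDQ reference data).
ABBREVIATIONS = {
    "ltd": "Limited",
    "inc": "Incorporated",
    "corp": "Corporation",
    "co": "Company",
}

# Hypothetical marker for a personal name hidden at the end of a company name.
CARE_OF = re.compile(r"\s*\b(?:c/o|attn:?)\s+(.+)$", re.IGNORECASE)


def clean_company_name(raw):
    """Return (standardized company name, personal name moved out of it)."""
    contact = ""
    match = CARE_OF.search(raw)
    if match:
        # Move the personal name into its own value instead of discarding it.
        contact = match.group(1).strip()
        raw = raw[:match.start()]

    words = []
    for word in raw.replace(",", " ").split():
        key = word.rstrip(".").lower()
        # Replace known abbreviations; leave every other word untouched.
        words.append(ABBREVIATIONS.get(key, word))
    return " ".join(words), contact


company, contact = clean_company_name("Acme Widgets Ltd. c/o John Smith")
print(company)
print(contact)
```

Returning the personal name rather than deleting it mirrors the "remove or move" wording: the value stays available for a separate contact attribute.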
Enterprise Data Quality Text Analysis (1)
• Application of business rules and semantic intelligence.
• Understanding and transforming data:
• Names, addresses, product descriptions, etc.
• “Does this list contain only names?”
• “How good is my data?”
Enterprise Data Quality Text Analysis (2)
• Generic capability:
• Configure the Parse processor to solve a specific problem.
• Apply your own business rules.
• Pre-configured Parse processors are available.
• Use them as a starting point for tailored parsing.
Example: Parse a Full Name String
• Parse free-text Full Name into ordered columns.
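To make the idea concrete, here is a minimal Python sketch of splitting a free-text name into ordered columns. It is not the Enterprise Data Quality Parse processor: the `TITLES` and `SUFFIXES` lists, the column names, and the `parse_full_name` function are assumptions made for illustration, and the rule "first token is the first name, last token is the last name" is a deliberately simple business rule.

```python
import re

# Hypothetical reference lists; real reference data would be far larger.
TITLES = {"mr", "mrs", "ms", "miss", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii", "phd"}


def parse_full_name(full_name):
    """Split a free-text name into title, first, middle, last, suffix."""
    tokens = [t for t in re.split(r"[\s,]+", full_name.strip()) if t]
    result = {"title": "", "first": "", "middle": "", "last": "", "suffix": ""}

    # Classify a leading token as a title if it is in the reference list.
    if tokens and tokens[0].rstrip(".").lower() in TITLES:
        result["title"] = tokens.pop(0).rstrip(".")

    # Classify a trailing token as a suffix if it is in the reference list.
    if tokens and tokens[-1].rstrip(".").lower() in SUFFIXES:
        result["suffix"] = tokens.pop().rstrip(".")

    # Simple positional rule for whatever remains.
    if len(tokens) == 1:
        result["last"] = tokens[0]
    elif len(tokens) >= 2:
        result["first"] = tokens[0]
        result["last"] = tokens[-1]
        result["middle"] = " ".join(tokens[1:-1])
    return result


print(parse_full_name("Dr. John Adam Smith Jr"))
```

The positional rule fails on inputs such as "Smith, John", where the surname comes first; handling those patterns is where configurable business rules would come in.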