
Introduction to Data Quality Services

Presentation by Tim Mitchell (Artis Consulting), www.TimMitchell.net. Agenda: overview of DQS, structure, knowledge base, DQS project, operations, matching, cleansing, administration, SSIS component, shortcomings.


Presentation Transcript


  1. Introduction to Data Quality Services • Presentation by Tim Mitchell (Artis Consulting) • www.TimMitchell.net

  2. Today’s Agenda • Overview of DQS • Structure • Knowledge Base • DQS Project • Operations • Matching • Cleansing • Administration • SSIS Component • Shortcomings

  3. About the Presenter Tim Mitchell • BI Consultant, Artis Consulting • North Texas SQL Server User Group • SQL Server MVP • Contributing author, MVP Deep Dives Vol 2 • Coauthor, SSIS Design Patterns • TimMitchell.net | twitter.com/Tim_Mitchell

  4. Housekeeping • Questions • Surveys

  5. Overview of Data Quality Services

  6. What is DQS? • DQS is a knowledge-driven data cleansing and matching service • Built on top of SQL Server 2012 • Simple yet powerful interface

  7. What is DQS?

  8. What is DQS? • Replaces manual data quality work you’re already doing • Stored procedures • Triggers • Custom applications

  9. DQS Structure

  10. DQS Structure and Flow (diagram; labels: Knowledge Base, Domains, Composite Domains, Matching Policies, Cleansing Project, Matching Project)

  11. Knowledge Base • Starting point for data quality provisioning • Uses locally customized data stores or marketplace data sources • Highly reusable and evolutionary • Key elements: • Domains • Matching policies

  12. Knowledge Base • Created by: • Knowledge discovery • Domain management • Matching rules

  13. Knowledge Base

  14. Domains • Domain = data field • Domain rules • Composite domains • Allows greater flexibility in domain rules
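
The relationship between domains, domain rules, and composite domains on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not the DQS object model: the `Domain` and `CompositeDomain` classes, the rule functions, and the sample state and ZIP values are all hypothetical.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical model for illustration only; not the DQS object model.
Rule = Callable[[str], bool]
RecordRule = Callable[[Dict[str, str]], bool]


@dataclass
class Domain:
    """A single data field plus the rules its values must satisfy."""
    name: str
    rules: List[Rule] = field(default_factory=list)

    def is_valid(self, value: str) -> bool:
        # A value is valid only if it passes every rule on the domain.
        return all(rule(value) for rule in self.rules)


@dataclass
class CompositeDomain:
    """Several domains validated together, with rules that span fields."""
    name: str
    domains: List[Domain]
    cross_rules: List[RecordRule] = field(default_factory=list)

    def is_valid(self, record: Dict[str, str]) -> bool:
        singles_ok = all(d.is_valid(record[d.name]) for d in self.domains)
        return singles_ok and all(rule(record) for rule in self.cross_rules)


state = Domain("State", [lambda v: v in {"TX", "OK", "LA"}])
zip_code = Domain("Zip", [lambda v: re.fullmatch(r"\d{5}", v) is not None])

# Cross-domain rule (toy assumption): Texas ZIP codes start with 7.
address = CompositeDomain(
    "Address",
    [state, zip_code],
    [lambda r: r["State"] != "TX" or r["Zip"].startswith("7")],
)
```

The composite domain is where the extra flexibility comes from: the rule tying `State` to `Zip` cannot be expressed on either single-field domain alone.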

  15. Data Quality Project • Create interactive projects for data matching and cleansing • Leverage one or more domains in an existing knowledge base • Somewhat reusable

  16. Data Quality Project • Nondestructive – no changes to source of data to be cleansed • No changes to the KB either • Separately, DQS project data can be used to improve the knowledge base

  17. Data Quality Project

  18. DQS Operations • Cleansing • Process data against known entities and domain rules • Similar to Fuzzy Lookup transform in SSIS • Matching • Group data together • Similar to Fuzzy Grouping transform in SSIS
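
To make the difference between the two operations concrete, here is a minimal sketch in Python. It does not use DQS or its algorithms; it swaps in the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and the function names, the greedy grouping strategy, and the 0.8 threshold are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1] (stand-in for a real matcher)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cleanse(value: str, known_values: List[str],
            min_score: float = 0.8) -> Optional[str]:
    """Cleansing: map one incoming value onto the closest known value.

    Returns None when no known value is similar enough.
    """
    best = max(known_values, key=lambda k: similarity(value, k))
    return best if similarity(value, best) >= min_score else None


def match(values: List[str], min_score: float = 0.8) -> List[List[str]]:
    """Matching: group values that look like the same entity.

    Greedy approach: each value joins the first group whose first
    member it resembles, otherwise it starts a new group.
    """
    groups: List[List[str]] = []
    for v in values:
        for g in groups:
            if similarity(v, g[0]) >= min_score:
                g.append(v)
                break
        else:
            groups.append([v])
    return groups


cities = ["Dallas", "Houston", "Austin"]
corrected = cleanse("Dalas", cities)
groups = match(["Tim Mitchell", "Tim Mitchel", "T. Mitchell", "Jane Doe"])
```

Cleansing compares each record against a reference list (the knowledge base), while matching compares records against each other, which is why the slide pairs them with Fuzzy Lookup and Fuzzy Grouping respectively.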

  19. DQS Administration • Monitor past activity • Set logging options • Set confidence thresholds
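
As a rough illustration of what a confidence threshold controls, the sketch below sorts a candidate correction into one of three outcomes based on its score. The function name, the outcome labels, and the 0.8 and 0.7 values are assumptions for this example rather than settings read from DQS.

```python
def classify(score: float,
             auto_correct_min: float = 0.8,
             suggest_min: float = 0.7) -> str:
    """Decide what to do with a candidate correction, given its confidence.

    Threshold values are illustrative assumptions, not DQS defaults.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if auto_correct_min < suggest_min:
        raise ValueError("auto_correct_min must not be below suggest_min")
    if score >= auto_correct_min:
        return "corrected"   # confident enough to apply automatically
    if score >= suggest_min:
        return "suggested"   # shown to a data steward for review
    return "unchanged"       # too uncertain to propose


outcomes = [classify(s) for s in (0.95, 0.75, 0.40)]
```

Raising the upper threshold trades automation for safety: fewer values are changed without review, and more land in the suggested bucket.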

  20. DQS Administration

  21. DQS and SSIS • SQL Server Integration Services has an integrated hook into DQS • DQS Cleansing Component • Provides automated, noninteractive data cleansing operations

  22. DQS and SSIS

  23. Demos

  24. Shortcomings • V1 product • No API – must use DQS client interactively • SSIS component only does cleansing

  25. Final Thoughts • CU1 performance improvements • http://bit.ly/IKmMow • DQS videos/blogs • http://technet.microsoft.com/en-us/sqlserver/hh780961 • My blog (www.TimMitchell.net) • DQS/MDS virtual chapter • masterdata.sqlpass.org

  26. Questions?
