Improve SSIS delta loads using hashing techniques


zan

Presentation Transcript


  1. Improve SSIS delta loads using hashing techniques: How to spot the unique

  2. Find the Fault!

  3. What is a hash function? • A hash function is any algorithm that maps data of arbitrary length to data of a fixed length • Example: SELECT HASHBYTES('MD5', 'The quick brown fox jumps over the lazy dog') returns 0x9E107D9D372BB6826BD81D3542A419D6 (16 bytes)
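The digest on this slide can be reproduced outside SQL Server. The following is a minimal Python sketch, assuming the literal is hashed as single-byte ASCII characters (which is how a plain varchar literal made of ASCII characters would be stored); the helper name `md5_hex` is illustrative and not part of the presentation.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of text as uppercase hex, written like a SQL binary literal."""
    # Assumption: the input is encoded one byte per character (ASCII).
    digest = hashlib.md5(text.encode("ascii")).digest()  # MD5 always yields 16 bytes
    return "0x" + digest.hex().upper()

print(md5_hex("The quick brown fox jumps over the lazy dog"))
# 0x9E107D9D372BB6826BD81D3542A419D6

# The output length does not depend on the input length.
print(len(hashlib.md5(b"").digest()), len(hashlib.md5(b"x" * 10_000).digest()))
# 16 16
```

The hash is computed over bytes, not characters, so the same text passed as a Unicode (N'...') literal would be encoded differently and would produce a different digest.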

  4. Why is Hashing important in SSIS? • Identification of uniqueness in data that has no keys • Identification of changes in large string data • Ability to minimise the buffer usage in lookup transformations • Can be applied against any data source
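The change-detection idea on this slide can be sketched roughly as follows. This is an illustration in Python rather than an SSIS package, and the function names (`row_hash`, `classify`), the `|` separator, and the sample rows are all assumptions made for the example. Each row's non-key columns are reduced to one 16-byte hash, so a delta load only has to compare a key and a hash instead of every column.

```python
import hashlib

def row_hash(values, sep="|"):
    """Hash a row's non-key columns into one 16-byte MD5 digest."""
    # Assumption: None is treated as an empty string, so NULL and '' hash alike.
    # The separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = sep.join("" if v is None else str(v) for v in values)
    return hashlib.md5(joined.encode("utf-8")).digest()

def classify(source_rows, target_hashes):
    """Split source keys into new, changed and unchanged.

    source_rows:   dict of business key -> tuple of non-key column values
    target_hashes: dict of business key -> hash stored by the previous load
    """
    result = {"new": [], "changed": [], "unchanged": []}
    for key, values in source_rows.items():
        current = row_hash(values)
        if key not in target_hashes:
            result["new"].append(key)
        elif target_hashes[key] != current:
            result["changed"].append(key)
        else:
            result["unchanged"].append(key)
    return result

# Hashes kept from a previous load (hypothetical data).
target = {1: row_hash(("Alice", "London")), 2: row_hash(("Bob", "Paris"))}
# Rows arriving in the current load.
source = {1: ("Alice", "London"), 2: ("Bob", "Berlin"), 3: ("Carol", "Oslo")}

print(classify(source, target))
# {'new': [3], 'changed': [2], 'unchanged': [1]}
```

Only the key and the fixed-size hash need to be held for comparison, which is the reason a lookup built this way uses less buffer space than one that caches every column. A value that itself contains the separator can still produce an ambiguous concatenation, so the separator should be a character that does not occur in the data.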

  5. Known problems and obstacles • Hash Collisions • Additional development competency • Additional evaluation of data sources
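To put a rough number on the hash collision risk, the standard birthday approximation can be used. The sketch below assumes the hash output behaves like a uniformly random value and that nobody is deliberately constructing colliding inputs (MD5 is known to be weak against that); the function name and the row count are illustrative.

```python
import math

def collision_probability(n_rows: int, hash_bits: int) -> float:
    """Birthday approximation: chance that at least two of n_rows distinct inputs share a hash."""
    # p is approximately 1 - exp(-n(n-1) / (2 * 2^bits)); expm1 keeps precision for tiny values.
    exponent = -(n_rows * (n_rows - 1)) / (2 * 2**hash_bits)
    return -math.expm1(exponent)

# Compare a 64-bit hash with the full 128-bit MD5 output for one billion rows.
for bits in (64, 128):
    print(bits, collision_probability(1_000_000_000, bits))
```

Under these assumptions a billion rows give a collision chance of a few percent with 64 bits and a negligible one with 128 bits, which is worth keeping in mind whenever a hash is shortened.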

  6. Can we make it smaller? • Conversion to BIGINT • SELECT CONVERT(BIGINT, HASHBYTES('MD5', 'The quick brown fox jumps over the lazy dog')) returns 7770993271616313814 (8 bytes)
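The BIGINT value on this slide equals the rightmost 8 bytes of the MD5 digest (0x6BD81D3542A419D6) read as a signed big-endian integer, which suggests the conversion drops the leftmost 8 bytes. A minimal Python sketch that reproduces the number under that reading, again assuming ASCII input; `md5_as_bigint` is an illustrative name.

```python
import hashlib

def md5_as_bigint(text: str) -> int:
    """Reduce an MD5 digest to a signed 64-bit integer by keeping its last 8 bytes."""
    digest = hashlib.md5(text.encode("ascii")).digest()  # 16 bytes
    # Keep the rightmost 8 bytes and read them as a big-endian signed value.
    return int.from_bytes(digest[-8:], byteorder="big", signed=True)

print(md5_as_bigint("The quick brown fox jumps over the lazy dog"))
# 7770993271616313814
```

Halving the hash from 16 to 8 bytes saves space per row, but it also discards half of the bits that separate one input from another, so collisions become more likely.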

  7. OK, so what is it good for?
