1.39k likes | 1.63k Views
A Survival Guide Tim Mitchell. Real-World SSIS. What we’ll cover today. Lessons I’ve learned the hard way Methodologies to solve real problems in SSIS Tools to help out Solutions for SQL 2012 as well as earlier versions Demos. What we won’t cover. No intro to SSIS Books Online.
E N D
A Survival Guide Tim Mitchell Real-World SSIS
What we’ll cover today • Lessons I’ve learned the hard way • Methodologies to solve real problems in SSIS • Tools to help out • Solutions for SQL 2012 as well as earlier versions • Demos
What we won’t cover • No intro to SSIS • Books Online
Housekeeping • Presentation materials • Lunch / breaks
Housekeeping • Let’s keep it informal • Ask questions • Ask questions • Ask questions
PSA: Community • Survival is easier in groups • Local user groups • Events (SQL Saturday, SQL Bits, PASS Summit) • Online communities • Twitter (#sqlhelp)
About me • Business intelligence consultant • Group Principal, Linchpin People • SQL Server MVP • TimMitchell.net / @Tim_Mitchell
Texas Dictionary • Whole mess: Bountiful amounts of something, usually referring to excess • More than one way to skin a cat: A pet-unfriendly phrase to indicate that there are usually multiple ways to solve the same problem • Ya’ll: A subgroup of the current group • All ya’ll: The whole of the current group
survival (noun) The state or fact of continuing to live or exist, typically in spite of an accident, ordeal, or difficult circumstances. Reference: Dictionary.com (http://dictionary.reference.com/browse/survival)
survival (noun) Survival is simply the state of existing. It’s just a small step above being dead. -- Me Photo credit: Elvis Ripley (http://www.flickr.com/photos/elvisripley/315439191/). Used under Creative Commons license.
Elements of Survival The dangers: • The elements • Predators • Foolishness of fellow survivors • The unexpected
Elements of Survival The dangers: • Dirty data • Complex or poorly defined ETL requirements • Unexpected metadata changes • Unstable sources/destinations • Project managers
Elements of Survival Means of survival: • Common sense of self preservation • Tools • Leaning on others • Learning from others’ mistakes
Elements of Survival Means of survival: • Best practices • Consistency • Document • Tools (buy/build) • Community
Plan to Fail Survival Tip #1:
Planning to Fail Data failures: • Missing or offline sources • Changed metadata • Partial loads • Validation issues • Unexpected domain values
If it happens… When X
Planning to Fail Planning for failure in the wild: • Build your shelter before it rains • Layers • Leaves • Bread crumbs
Planning to Fail Planning for failure, the ETL way: • Be a pessimist! • Fail gracefully • Capture error/warning data on failure • Build for restartability (where appropriate)
Planning to Fail Why graceful failure? • Avoid leaving affected systems in an inconsistent state • Avoid repeating wholesale operations • Timely notifications to allow proper response from dev/admin staff
Planning to Fail Graceful failures in SSIS • Control flow: • Event handlers • Precedence constraints • Data flow: • Error row redirection • Lookup failure redirection • Conditional split
Planning to Fail Graceful failures in SSIS • Restartability • SSIS Checkpoints • SSIS transactions • Both methods have shortcomings • Custom restartability can be an option
Planning to Fail Natural failures • Simply stop processing on error • Default behavior • In some cases, can be the right pattern
Demo Designing for failure
Take Notes Survival Tip #2:
Take Notes What to note? • Trails, paths, and shortcuts • Water sources • Hazards • Enemy positions • Weather and wildlife patterns • Sunrise/sunset time http://www.flickr.com/photos/defenceimages/4993131844
Take Notes What to note? • Success and failure of operations • Row counts • Run times • Validation information • Warnings http://www.flickr.com/photos/mezdeathhead/3057797092/
Take Notes Why? • Know what to expect • Plan for growth • Cover your assets http://www.flickr.com/photos/mezdeathhead/3057797092/
Take Notes It’s all about the log. • SSIS logging • SQL Server log • Custom logging http://www.flickr.com/photos/rwphoto/3108504086/
Take Notes SSIS Package Logging • It’s already there • Easy to start • Flexible events and destinations • Can be unwieldy http://www.flickr.com/photos/raver_mikey/4285637209/
Take Notes SSIS Catalog Logging • Version 2012 only • Easiest to configure • Design time or runtime • Least flexible http://www.flickr.com/photos/raver_mikey/4285637209/
Take Notes Custom Logging • Roll your own • Most difficult to set up • Infinitely flexible http://www.flickr.com/photos/raver_mikey/4285637209/
Take Notes Server/engine logging • SQL Engine error log • DMVs • Third party tools • Windows log • PerfMon http://www.flickr.com/photos/raver_mikey/4285637209/
Demo Take Notes
Perform atyour best Survival Tip #3:
Perform at your Best http://www.flickr.com/photos/defenceimages/5786942640/ http://www.flickr.com/photos/defenceimages/4627096866/
Perform at your Best Soldier up! • Recognize and avoid quicksand • React appropriately when you’re stuck • Know your environment
Perform at your Best Soldier up! • Isolate and eliminate the things that slow you down • Recognize design patterns that are detrimental to performance • Look *outside* SSIS (gasp!)
Perform at your Best It’s not just SSIS • The majority of SSIS performance problems have nothing to do with SSIS • Limitations on sources and destinations
Perform at your Best It’s not just SSIS • Don’t just ‘pass the buck’, but do consider other factors: • SQL engine configuration • Disk configuration • Network speed/latency • Physical machine capabilities
Perform at your Best It’s not just SSIS • Proper query techniques for relational sources • Effective indexing for sources and destinations • Using OPTION (FAST <n>)
Perform at your Best Streamline your data flows • Transformations matter! • Know how the blocking properties of transformations
Perform at your Best Streamline your data flows • Nonblocking transforms do not hold buffers • Derived Column • Conditional Split • Row Count
Perform at your Best Streamline your data flows • Partially blocking transforms will queue up buffers as needed • Merge Join • Lookup • Union All
Perform at your Best Streamline your data flows • Fully blocking transforms will not pass any data through until all of the data has been buffered at that transformation • Sort • Aggregate
Perform at your Best Streamline your data flows • Be aware of memory use! • LOB (large object) columns will always spool to disk rather than staying in memory. • [N]VARCHAR(MAX) • Memory buffers may spill over to disk
Perform at your Best Streamline your data flows • Manage your sources • Don’t use table drop down list – specify your query including only the necessary columns • Be mindful of indexes when writing data retrieval queries