250 likes | 431 Views
Embarquez les services d'intégration SQL Server 2005. Romelard Fabrice. D311. Presentation. MVP ASP.NET since 2003 French communities member GUSS ASP-PHP Codes Sources .NET developer, SQL Server DBA and Sys. Admin Coworkers Ilem SA at Geneva (Suisse). Survey. SQL Server ?
E N D
Embarquez les services d'intégration SQL Server 2005 Romelard Fabrice D311
Presentation • MVP ASP.NET since 2003 • French communities member • GUSS • ASP-PHP • Codes Sources • .NET developer, SQL Server DBA and Sys. Admin • Coworkers Ilem SA at Geneva (Suisse)
Survey • SQL Server ? • SQL Server Integration Services ? • .NET Developer ?
Prerequisites • Knowledge of Integration Services • Knowledge of Data Flow Functionality Level 300
Objectives • Introduction to SSIS programming model • Learn how to integrate with dynamic metadata • Learn how to utilize data cleansing functionality in your apps
Package Tasks Precedence Constraints Connection Managers Containers Data Flow Task Components – Source Adapters, Transformations, Destination Adapters Paths SSIS Terminology
Application Overview • Get data from an Excel file • Provide fuzzy cleansing for certain text fields • FirstName, LastName • Save cleaned data in another Excel file • Look at finished application first, then go through several iterations to build it
SSIS is embeddable • SQL Server uses SSIS • SMO • Maintenance Plans • Other (non SQL) products in development are using SSIS • Writing your own UI is possible • SSIS designer, Management Studio, Import/Export Wizard, Migration Wizard • Uses nosecret APIs • Enumerating/adding/removing/changing/listening/scheduling/… • Considering releasing Migration Wizard in Shared Source • Digital signing enables tamper resistance • Several customers doing metadata driven package development
Pipeline Metadata • Pipeline engine requires static metadata • Early design decision • Buffers laid out during pre execute • Strict data types • Cannot map columns during execution • Designer debugging expects design time metadata at execution time • Configured (dynamic) queries must resolve to design time metadata at runtime
Dynamic Metadata • Scenarios • Source schema changes/not known until execution • Metadata driven ETL processes • Handling dynamic metadata • Generate data flows dynamically
Creating Packages • From scratch through object model • Create all package elements from scratch • Fast, small, efficient • Harder to evolve the application • From template package • Adjust only what needs adjusting after loading the template package • Need to embed potentially large template file • Easier to evolve the application • Digital signing detects user changes
Components Terminology • Component • Input • Input Columns (Only data referenced by component) • Virtual Input Columns (All available data produced by upstream components – used at design time for selecting input columns) • External Metadata Columns (Schema snapshot) • Output • Output Columns (Produced data) • External Metadata Columns (Schema snapshot) • LineageID uniquely identifies a column • Every output column gets a new Lineage ID • Column Mapping • Sources: ExternalColumn<->OutputColumn • Transforms: InputColumn<->OutputColumn • Destinations: InputColumn<->ExternalColumn
Inputs Component Outputs RCC ComponentMetaData Pipeline Programming Model • ComponentMetadata • Provided for all components by the engine automatically • Manages metadata and persistence for the component • Contact information for unregistered components • Helps delay creation of components until necessary • Runtime Connection Collection • Connection managers used by the component
SSIS As A Source • ETL processes typically encode complex business rules • Reuse is important • One version of the truth • Updates in one place • Leverage advantages of SSIS: scalability, manageability, visual building of complex processes, etc.
Implements IDbConnection ConnectionString is the command line args to dtexec.exe Command CommandText is the name of the DataReaderDest component in package ExecuteReader runs the package when asked for data, returns IDataReader Supports SchemaOnly also DataReaderDest implements IDataReader Gets the first buffer and waits for data request SSIS Source Implementation Microsoft.SqlServer.Dts.DtsClient Data Reader Destination Component
Summary • Programming SSIS is straightforward • Several embedding options exist • SSIS can handle flexible metadata • SSIS provides rich functionality and high performance
Resources • Samples installed by setup • Community site, run by MVPs • http://www.sqlis.com • http://www.guss.fr • Interact with product team on MSDN Forums • http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=80 • Webcasts, training, blog links, books, … • http://msdn.microsoft.com/SQL/sqlwarehouse/SSIS/default.aspx • NewsGroups SQL Server
Your benefits: • exclusive, special informationon Windows Vista and the 2007 Microsoft Office release • notifications upon availability of new beta versionsof to download or order them as soon as possible • product-specific bonus material(videos, samples, webcasts, gadgets, etc.)