320 likes | 512 Views
SQL Server 2005 Integration Services. Dave Glover Microsoft Australia http://blogs.msdn.com/dglover. Agenda. Cleanse Data Input Split an Output Channel Script to Branch Control Lookup Fuzzy Values Loop through Folder Configure and Deploy. Overview.
E N D
SQL Server 2005Integration Services Dave Glover Microsoft Australia http://blogs.msdn.com/dglover
Agenda • Cleanse Data Input • Split an Output Channel • Script to Branch Control • Lookup Fuzzy Values • Loop through Folder • Configure and Deploy
Overview • Part of SQL 2005 Std and Ent Editions: • Integration Services with Basic Transforms. Provides graphical extract, transform, and load (ETL) capabilities • Ent only: Integration Services Advanced Transforms Includes data mining, text mining, and data cleansing • Collaborative Development • Separate Management Tools • Data Quality and profiling • Scalable
FILE HTTP WMI WMI Event Task XML Task Web Services Task Variables Log Providers Task Host Custom Task Containers Event Handlers Send Mail Task Variables Log Providers FILE … Data Flow Task ODBC … FTP Task FTP SSIS Runtime Sources Dests OLEDBConnection FTPHTTPFILE ODBCManagers CUSTOM FLATFILE OLEDB Connection FTPHTTPFILE ODBCManagers CUSTOM FLATFILE SQL Server DB WMI XML XML Flat File Flat File WEB FTP File FTP Custom Custom
Cleansing Data Input • Connection-Related Objects • Connection Manager • Flow Types • Data Adapter • Metadata Validation • Data Flow Components
Flow Types • Two different types of flow • Control Flow = Runtime = Tasks • Data Flow = Pipeline = Transforms • Managed in Designer • Used to be single view in SQL 2000 • Separate views in SQL 2005 • Control Flow handles tasks and precedence • Data Flow handles transformations - zoomed contents of Data Flow task
Source Transform Destination Data Flow Components • Components • Paths • Data route between one component and the next • Includes metadata about columns moving around • Lineage Identifier tracks item transformations • Pipeline • Components connected by a path
Splitting an Output Channel • Distributors and Collectors • Precedence Constraints
Tfm Tfm Src Dst D C Tfm Distributors and Collectors • Distributor transformations • Multicast • Conditional Split • Collector transformations • Union All • Merge • Merge Join
Precedence Constraints • Connect one task to another • Give sequential relationship to tasks • Success/Failure/Completion workflow • Establish concurrency • Connected sequence of tasks is a task list • Independent Task Lists execute concurrently • Tasks within a list execute sequentially
Demo Cleansing Data & Splitting an Output Channel
Script and Branch Control • Package Variables • Script Task • Complex Precedence • Containers
Package Variables • Scope • Each container can have variables • Define namespace for user variables • Containers can access variables from higher levels • Accessible from • Expressions (such as loops and constraints) – Use @ • Parameters in Execute SQL task • Parent Package (as part of configuration) • Script
Script Task • Currently VB .Net • Can read or modify properties throughout the package • Can’t access inner workings of tasks or transforms • Can’t modify pipeline metadata (e.g., number of columns piped)
Containers • Container provides • Grouping of task lists (list of one is allowed) • Transaction scope • Variable scope • A package is a container • Add your own tasks • Insert your own containers • Loops are containers, too
Fuzzy Lookup Transformation • Proximity algorithm to find matches • Builds index – Index can persist • Creates metrics • Similarity • Confidence • Uses a separate connection for reference table
FOR LOOP Loops while expression is TRUE Manually add loop counter Init: @N = 1 Eval: @N <= 25 Increment: @N =@N + 1 Execute tasks in container on each iteration More control—and more complex than For Each FOR EACH LOOP Loops over set of objects Files XML nodes Database objects Set variable (e.g. file name) for each iteration Execute tasks in container on each iteration Loops
Demo Script and Branch Control
Configure and Deploy • Configurations • XML Customizability • Deployment • Execution
Package Lifecycle Support • Data Visualizers • Debugging • Stop execution during package • Stop control flow before and after • Stop transformation during (visualizer) • Logging • Event Handlers • Log Providers
Configurations • Take something from the system • Environment variables, registry, XML option file • Apply it to some part of your package • Run the package with the new setting • Useful for multiple “similar” jobs • Similar to Dynamic Properties from SQL 2000 • Facilitate reusability, different environments • Configurable at runtime or during execution
Deployment • Same server deployment • Deploy from BI Development Studio • Multiple server deployment • Deployment Utility • Package configuration files with package • Deploy to SQL Server (msdb) or file (dtsx)
Execution • Command-line execution • DTEXEC • User Interface execution • DTEXECUI • Can generate command line for DTEXEC • Scheduling • SQL Server Agent
© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.
Report Builder Architecture Report Builder Client Model Designer Report Manager SQL Mgmt Studio Report Designer Web Service / URL interface Report Server Drill through report generation Query generation Model Security Data Sources (SQL Server, Analysis Services) Report Processing Query Security Rendering Delivery SQL Server Catalog
Report Consumers Business Users Power UsersDevelopers Why Report Builder? Report Viewer Report Builder Report Designer
What is Report Builder? • A new ad-hoc report design tool for SQL Server Reporting Services • Targeted at business users who want to find and share answers to interesting questions • Driven from a business model of the data so users do not need to understand the underlying data structures • Not a full analytical client or replacement for Pivot Tables • Fully integrated with Reporting Services and delivered in SQL Server 2005
Wrap-up • Report Builder is… • a new ad hoc report design tool for SQL Server Reporting Services • targeted at business users who want to find and share answers to interesting questions • driven from a business model of the data so users do not need to understand the underlying data structures • is fully integrated with SQL Server Reporting Services and delivered in SQL Server 2005
© 2003-2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. Microsoft makes no warranties, express or implied, in this summary.