430 likes | 564 Views
SQL Server 2008 BI-sovelluskehitys uudet ominaisuudet Pekka Korhonen Seniorikonsultti FC Sovelto Oyj. Business Intelligence. BI is used to… Understand the health of the organization Collaborate on a shared view of business drivers Reduce the time to decision Its goal is often to…
E N D
SQL Server 2008 BI-sovelluskehitys uudet ominaisuudet Pekka Korhonen Seniorikonsultti FC Sovelto Oyj
Business Intelligence • BI is used to… • Understand the health of the organization • Collaborate on a shared view of business drivers • Reduce the time to decision • Its goal is often to… • Impact the bottom line by measuring specific operations • Enhance competitive advantage BI is no longer a luxury afforded by a few large companies—it is now considered an essential part of the IT portfolio
HR Inventory Finance Source Systems • Process real-time transactions • Contain data structures optimized for modifications • Normalized schema • Minimal indexing strategy • Usually provide limited decision support • Are commonly referred to as: • Online transaction processing (OLTP) systems • Operational systems
Call Center Web Apps CRM Finance Inventory Data Warehouse HR ERP Silos of Data
Data Warehouse Characteristics • Data warehouse systems… • Present data for business analysis processes • Commonly store data in subject-specific stores called data marts • Contain structures optimized for rapid ad hoc information retrieval • Combine valid source data • Integrate data from heterogeneous source systems • Provide a consistent historical data store
ETL Extract, Transform, and Load • Extract data from the source systems • Transform the data to convert it to a desired state • Load the data into the data warehouse
Sales ProductAssociation Finance Analytical Systems • Multidimensional databases are also called online analytical processing (OLAP) databases and… • Contain structures optimized for rapid ad hoc information retrieval • Pre-calculate and store aggregated values • Include calculation engines for fast, flexible transformation of base data • Are designed to reveal business trends and statistics not directly visible in the data retrieved from a data warehouse • Data mining models discover patterns in data, typically for prediction analysis
Client Access • Client access and distribution mechanisms can include: • Static report viewers and browsers • Ad hoc query tools • Report writers • Modeling applications • Scorecard applications • Portals and dashboards • Delivering data is a process of continuous business improvement: • Monitor • Analyze • Plan What happened?What is happening? Why? What will happen?What do I want to happen?
Integrated Reporting and Analytics Data Marts Staging Area Client Access Manual Cleansing 9: Delivering BI enables a process of continuous business improvement 1: Clients need access to data 7: Manual cleansing may be required to cleanse dirty data 6: Staging areas may simplify the data warehouse population 5: Data warehouse is periodically populated from data sources 3: Data sources can be mirrored/replicated to reduce contention 2: Clients may access data sources directly 8: Clients use various tools to query the data warehouse 4: The data warehouse manages data for analyzing and reporting Data Warehouse Data Sources Client Access
Performance Management Integrated BI Solution Data Platform Information Worker Platform • Powerful • Cost Effective • Fast Time-to-Market • Choice of Integration Points Microsoft BI Platform • Enterprise Grade • Pervasive • Integrated • Flexible • Full Featured • Interoperable • Extensible
Integrate Report Analyze SQL Server 2008 BI Platform • Data acquisition from source systems and integration • Data transformation and synthesis • Data enrichment, with business logic, hierarchical views • Data discovery via data mining • Data presentation and distribution • Data access for the masses
SSIS - Script Task Editor • Choose script language when adding task or component • Use Edit Script button on first page for easy access • Select ReadOnlyVariables and ReadWriteVariables from list
Web Service Script Design Step 1 • Add a Web Reference to the project • Provide the URL to the ASMX or WSDL file of the service
Web Service Script Design Step 2 • v Include the full name of the object in the Using directive
Web Service Script Design Step 3 • Instantiate an object for the Web service and its methods • Compile code
Improving Package Performance • Persistent Lookups • Benefits of Persistent Lookups • Lookup Cache Types • Cache Connection Manager • Cache Transform • Lookup Configuration • Pipeline Scalability • Benefits of Pipeline Scalability • Thread Scheduler
Lookup Cache Types No Cache • Reference dataset uses OLE DB connection • Lookup executes one query for each row in the pipeline Reference Dataset Pipeline Lookup No match handler: Ignore Redirect to error output Fail component Redirect to no match output Match? N Y Pipeline
Lookup Cache Types Partial Cache • Reference dataset uses OLE DB connection • Lookup searches cache first, then executes non-cache query if index columns not found in cache Reference Dataset Pipeline Hit/Miss Cache Lookup No match handler: Ignore Redirect to error output Fail component Redirect to no match output Match? N Y Pipeline
Lookup Cache Types Full Cache In-Memory • Reference dataset uses OLE DB connection • Cache loads into memory during PreExecute phase and remains static throughout package execution Reference Dataset Pipeline PreExecute Cache Lookup No match handler: Ignore Redirect to error output Fail component Redirect to no match output Match? N Y Pipeline
Data Profiling Task • Profiles tables for exploring or preserving data quality • Run as a task in SSIS • Produces XML file output • Has a nice visual tool for working with profiles • Analyzes a set of columns / tables • Candidate keys • Functional dependencies • Value inclusion • Analyzes a single column • Column length distribution • Null Ratio • Pattern detection • Statistics • Value distributions
Set-Based Profile Types Functional Dependency • Determine whether the dependent column depends on the values in the determinant column(s) • Find invalid values, such as incorrect combinations of US zip codes and US states
Column-Based Profile Types Column Length Distribution • Calculate distinct lengths of string values and percentage of rows each length represents • Find invalid values, such as values shorter or longer than expected
Column-Based Profile Types Column Null Ratio • Calculate percentage of null values • Find unexpectedly high ratio of null values, such as a column with high percentage of missing codes
Column-Based Profile Types Column Value Distribution • Calculate distinct values and percentage of rows for each value • Find anomalous distinct values
Drilldown to details Browse profiles by column
Change Data Capture • Table-level implementation to track changes in a relational structure • Change data stored in tables • Details about inserts, updates, and deletes • Log sequence number (LSN) for the commit transaction • Begin and end time of each LSN • Stored procedures and functions available to query for configuration or change data details • Alternative approach to managing slowly changing dimension scenarios
Change Data Capture Configuration • Enabling a database for CDC • EXECUTE sys.sp_cdc_enable_db_change_data_capture; • Enabling a table for CDC • EXECUTEsys.sp_cdc_enable_table_change_data_capture@source_schema = 'Person', @source_name = 'Contact' , @role_name = 'cdc_admin' , @filegroup_name = ‘CDC'
Office-style Ribbon Data Pane Properties Pane Report Designer Layout Globals Parameters Expression Placeholders Data Source & Dataset Group Task Pane
Introducing Tablix table + Matr ix
Introducing Tablix • Flexible grid layout • Fixed columns and dynamic rows like a Table • Dynamic rows and columns like a Matrix • Any combination of Table and Matrix layouts • Flexible grouping • Nested groups • Adjacent groups • Recursive groups
Reviewing Tablix Examples Hierarchical rows with dynamic headers Desired Current
Reviewing Tablix Examples Mixing dynamic and static columns Current Desired
Reviewing Tablix Examples Parallel Dynamic Groups Current Desired
Smart Tags Working with the Chart Layout Edit and format chart title here Calculated Series Fly-out menu
Using New Chart Features • Secondary Axes • Scale Break • Multiple chart areas with optional alignment
Introducing Gauges and Indicators • Display and monitor real-time data • Use as dashboard or scorecard components for visualizing KPIs
Improved ease of use • Decreased time to develop solutions • Embedded best practices and performance tuning tips into object model and user interface • Redesigned interface to ensure the natural outcome is optimal design Design Improvements in Analysis Services 2008
Cube Wizard • Supports building cube from one table • Produces simpler output • Provides safer error configuration settings • Dimension Wizard • Creates parent-child attributes automatically • Enables assignment of attribute type • Supports classification of member properties • Provides safer error configuration settings Cube and Dimension Wizard Improvements
Visual indicators to highlight best practice violations • Ability to dismiss warnings by instance or globally with optional comment Best Practices Warnings
Graphical editorfor attribute relationships • Visualization ofrigid and flexiblerelationships Attribute Relationship Designer
Ability to review and modify aggregation usage settings • Name the aggregation design Aggregation Design Wizard Improvements
View aggregation designs and aggregations • Add, change, or delete aggregations manually • Assign an aggregation design to another partition Aggregation Designer