380 likes | 739 Views
DBI331. Microsoft SQL Server Analysis Services Design Challenges. Peter Myers Mentor SolidQ. Presenter Introduction. Peter Myers Mentor, SolidQ BBus , MCITP ( Dev , DBA, BI), MCT, MVP
E N D
DBI331 Microsoft SQL Server Analysis Services Design Challenges Peter Myers Mentor SolidQ
Presenter Introduction • Peter Myers • Mentor, SolidQ • BBus, MCITP (Dev, DBA, BI), MCT, MVP • 14 years of experience designing and developing software solutions using Microsoft products, today specializing in Microsoft Business Intelligence • Based in San Francisco • pmyers@solidq.com
Agenda • Reviewing UDM Fundamentals • Addressing Design Challenges with: • Source Schema Design • Data Source Views • Dimensions • Cubes • Cube Calculations • Cube Enhancements • Security • Demonstrations
Session Objectives • “It covers best practice design for dimensions and cubes, and also addresses common design challenges in relation to the supporting source schema design, cube calculations and security.” • Arguably best practice “just depends” and so is open to healthy debate • Be prepared to assess your particular challenge and hypothesize, test, accept and reject a practice accordingly • Some best practices can be demoted to “tips”
Session ObjectivesContinued • This session is about: • Good modeling techniques with aim to produce a feature-rich and optimized UDM • Adding business logic with MDX • Securing the UDM • This session does not directly cover the storage layer, including partitions, storage modes, aggregation design or processing • See session DBI407: Best Practices for Building Tier 1 Enterprise Business Intelligence Solutions with Microsoft SQL Server Analysis Services, presented by Adam Jorgensen
Words about AdventureWorks • Fictitious company upon which the AdventureWorks sample databases and tutorials are based • These samples are good • Good for learning about designing and Analysis Services UDM • Great in fact, for supporting tutorials, labs and demonstrations • These samples are also bad • Bad for learning best practice design • Worse in fact, for using as a template for your UDMs
Reviewing the UDM Fundamentals Cube Dimension Measure Group Attribute Measure Attribute Relationship Measure Group Dimension Partition Hierarchy Cube Dimension Level Cube Attribute Cube Hierarchy MDX Script
Source Schema Design • Dimensional modeling techniques have evolved to become largely best practice – learn from the masters • Ongoing challenges: • Materializing calculations in fact tables • Snowflaking a dimension • “Cleaning up” with junk dimension tables • Modeling self-referencing relationships (AKA parent-child) • Sometimes there is a need to stop thinking relationally, and to start think multidimensionally • Be prepared to be creative • Be prepared to think outside the square (and inside the cube)
Data Source View • Limit the Data Source View (DSV) to a single data source • Pre-integrate multiple data sources into a single “data mart” • We recommend Integration Services • Base DSV tables on database objects • Views • Table-valued UDFs • Parameterized UDFs are a convenient and consistent way to configure partition queries
Data Source ViewContinued • If building a UDM on an OLTP schema, create views to imitate a true star schema • Preferably use an ETL process to materialize data in a true star schema • However, this approach is great for prototypes, and with good planning and care, you can retrofit a true star schema later • Define appropriate single-column keys • Define appropriate labels for member names • Consider pre-calculating, or partially calculating, values in the DSV • Name tables and columns in friendly terms
Dimensions • Optimized dimensions are a prerequisite for an optimized cube • The most important three topics for optimizing dimensions: • Attribute relationships • Attribute relationships • Attribute relationships Source: Ashvini Sharma
DimensionsAttribute Relationships • Only define relationships where they do exist • This will be the case for natural hierarchies
DimensionsAttribute Relationships – Continued • There is a world of difference between this: • And:
DimensionsAttribute Relationships – Continued • Do not define relationships for unnatural hierarchies • So these relationships are perfectly fine left as is
DimensionsAttribute Relationships – Continued • The RelationshipType property comes in two flavors: • Flexible (default) • Rigid • It should be changed to Rigid when it is understood that related values will never change • (No need to consider if you do a full database process each time) • Usually no problem if all changes are Type II • Can be a serious problem otherwise as it will require a full dimension process to rectify • The benefit of defining Rigid relationships is that incremental processing may not invalidate existing aggregations • However, watch for data entry fixes!
DimensionsAttribute Relationships – Continued • Attribute relationships have the following benefits: • Reduce the cube space to what is possible • Improve processing performance • Fewer relationships between attribute memberships need to be created • Improve query performance • Hierarchies are materialized • The query engine has hints as to how the members will be browsed • Aggregations can be created around the relationships • Allow defining member properties • Allow sorting members by a related attribute’s key or name • Ensure security is enforced as you would expect • Users can only see the cities of the states they are allowed to see
DimensionsOther Challenges • Disabling hierarchies • AKA setting the AttributeHierarchyEnabled to False • Be sure to set this for attributes that are not require for browsing; the values will become available via the member properties • Removing the All level • AKA setting IsAggretable to False • Understand the repercussions and consider setting an appropriate default member • Defining defaults • These may not be intuitive to users and could create confusion and result in the misinterpretation of query results
DimensionsOther Challenges – Continued • Despite the Best Practice Alerts (BPA) guidance, don’t get too carried away with hiding attributes • AKA setting the AttributeHierarchyVisible to False • Hidden hierarchies can be annoying for report authors trying to build filters based on an attribute hierarchy, rather than a multi-level hierarchy • Be courageous enough to challenge the BPA’s “wisdom”
DimensionsOther Challenges – Continued • Despite the Dimension Wizard’s best intentions, when creating snowflake dimensions: • Set the UnknownMemberproperty to None, and • Set the attribute KeyColumns’ Source NullProcessing property to Automatic • This way you will learn about integrity problems in your data and you can fix it there, or at least address it using logic in views • Considering using the UnknownMember feature when building a UDM on a data source with known issues, perhaps an OLTP database
Cubes • Including many subject areas (measure groups) in a single can be challenging to develop, maintain and use • Consider more databases • Consider smaller, subject specific, cubes • Perspectives are an approach to making resource-rich cubes easier to understand and use • Optimize DistinctCount measures by placing them in their own measure group, and consider partitioning the measure group by the same column the measure is based on • Format measures appropriately
CubesContinued • Understand the “Enterprise” aggregation functions and when to use them • None, ByAccount, AverageOfChildren, FirstChild, FirstNonEmpty, LastChild, LastNonEmpty • Appreciate some of these are computed at query time • Consider carefully the need to use non-regular measure group dimension relationships • Referenced, Fact, Many-to-Many • Understand the price to pay for using the different relationship types • Understand how to optimize Many-to-Many relationships
Cube Calculations • Understand the model requirements and define calculations and scoped assignments appropriately • Design the dimensions and cubes to support the required calculations • Consider pre-calculating, or partially calculating, values in the source database or DSV • Limit nested calculations • Consider “offloading” complex calculations to UDFs
Cube CalculationsContinued • Know your MDX and the capabilities of the cube’s MDX script • Know how to test the efficiency of the MDX you write • Read and apply the wisdom documented in the SQL Server 2008 Analysis Services Performance Guide • Be creative – as an example, read and learn from David Shoyer’sA Different Approach to Implementing Time Calculations in SSAS paper • BTW, the Business Intelligence Wizard’s Time Intelligence feature is designed for novice cube developers
Cube Enhancements • Actions • Great to enhance the user’s experience by making the UDM more interesting and relevant • Drillthrough actions are limited to the data available in the cube • Consider using Reporting Services reports instead of lowering the grain of a measure group just to support drillthrough data • Perspectives • Are helpful to focus the user on a specific subject-area • Can be helpful to limit exploration in PPS analytic reports too • Keep in mind these are not designed to enforce permission sets • Use database roles to enforce permissions
Security • Define clearly named roles to describe purpose • Use the role’s Description property to provide further details to help review its purpose • Avoid cell level security which can negatively impact on performance • Consider whether the allowing/denying dimension members can achieve the same outcome
SecurityContinued • Analysis Services only support Windows Authentication • Challenging for: • Non-Windows clients • “Double-hop” scenarios where the requesting user’s identity must be passed to the database • Configure Kerberos • Configure dynamic security expressions
SecurityContinued • Configure dynamic security expressions by using the UserName() and CustomData() functions • UserName() returns the DomainName\UserName of the current connection • CustomData() returns the value of the CustomData connection string property, if defined • Pass these functions into: • MDX functions and statements • The CALL statement to run a stored procedure • Requires the development and deployment of an assembly
Demonstrations Peter Myers Mentor SolidQ demo
Required Slide Speakers, please list the Breakout Sessions, Interactive Discussions, Labs, Demo Stations and Certification Exam that relate to your session. Also indicate when they can find you staffing in the TLC. Related Content • Breakout Sessions • DBI407 | Best Practices for Building Tier 1 Enterprise Business Intelligence Solutions with Microsoft SQL Server Analysis Services (Tue, 17 May, 8:30AM) – yesterday! So if you missed it, watch the recording.
Required Slide Speakers, please list the Breakout Sessions, Interactive Discussions, Labs, Demo Stations and Certification Exam that relate to your session. Also indicate when they can find you staffing in the TLC. Related Content • Recommened Books • The Data Warehouse Toolkit: The Complete Guide to Dimensional Modeling (Second Edition) (Wiley), by Ralph Kimball and Margy Ross • Expert Cube Development with Microsoft SQL Server 2008 Analysis Services (PACKT Publishing), by Marco Russo, Alberto Ferrari and Chris Webb • Microsoft SQL Server 2008 Analysis Services Unleashed (SAMS), by Irina Gorbach, Alexander Berger and Edward Melomed
Required Slide Speakers, please list the Breakout Sessions, Interactive Discussions, Labs, Demo Stations and Certification Exam that relate to your session. Also indicate when they can find you staffing in the TLC. Related Content • SQL Server Articles (search http://microsoft.com) • SQL Server 2008 Analysis Services Performance Guide • SQL Server 2005 Analysis Services Distinct Count Optimization • SQL Server 2005 Analysis Services Many-to-Many Dimensions: Query Performance Optimization Techniques • Identifying and Resolving MDX Query Performance Bottlenecks in SQL Server 2005 Analysis Services
Required Slide Speakers, please list the Breakout Sessions, Interactive Discussions, Labs, Demo Stations and Certification Exam that relate to your session. Also indicate when they can find you staffing in the TLC. Related Content • Other Resources • BIDS Helper: http://bidshelper.codeplex.com • A Different Approach to Implementing Time Calculations in SSAS: http://www.ssas-info.com/analysis-services-articles/59-time-dimension/347-a-different-approach-to-implementing-time-calculations-in-ssas-by-david-shroyer
What’s Next? • Follow, Tweet and Enter to win an Xbox Kinect Bundle • GAME ON! Join us at the top of every hour at the BI booth to compete in the Crescent Puzzle Challenge and Win Prizes • Sign up to be notified when the next CTP is available at: microsoft.com/sqlserver Join the Conversation @MicrosoftBI /MicrosoftBI
Resources • Connect. Share. Discuss. http://northamerica.msteched.com Learning • Sessions On-Demand & Community • Microsoft Certification & Training Resources www.microsoft.com/teched www.microsoft.com/learning • Resources for IT Professionals • Resources for Developers http://microsoft.com/technet http://microsoft.com/msdn
© 2011 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.