290 likes | 386 Views
Platinum. Learn & Enjoy [Put your phone on Vibrate!]. www.sqlbits.com. Group BY: [Remember get your Tickets for Entry, Coach, Drink]. Gold. Feedback Forms: [Fill these out at the end of each session please]. Silver.
E N D
Platinum Learn & Enjoy [Put your phone on Vibrate!] www.sqlbits.com Group BY: [Remember get your Tickets for Entry, Coach, Drink] Gold Feedback Forms: [Fill these out at the end of each session please] Silver Lunch Time Sessions: [Once you eat, learn some more; Grok Talks in Chicago 1 and 2 Quest are in Menphis, Idera are in Everest] Ask The Experts[Sessions need to finish on time, take questions to the ATE area]
Performance Tuning MDX Queries Chris Webb Crossjoin Consulting Limited / Solid Quality Mentors chris@crossjoin.co.uk
Agenda • Performance tuning methodology • Tuning the cube • Tuning MDX in queries and calculations • Cheating: warming the cache
Performance Tuning Methodology • Before you start your work, ensure that: • Your test environment is comparable with your production environment • Nothing else is happening on the server • No-one will connect to the server and try to run queries • You know what exactly you are trying to tune!
Performance Tuning Methodology • Capture sample queries using Profiler • Look at the Query End events for the MDX used • Look at the Duration for how long the query took to run in ms • Note that for some clients (eg Excel before 2007) queries may be dependent on named sets declared earlier in the session
Performance Tuning Methodology • Clear the cache using an XMLA ClearCache command before running your query for the first time • Then run the query again immediately afterwards, so you have timings on a cold and warm cache • Disk caching may also have a noticeable impact if you are querying a newly-processed cube
Hardware • Throwing hardware at the problem is rarely the answer, unless: • You have problems with performance degredation with large numbers of concurrent users • You have a lot of data • Extra memory will mean you have more room for caching query results • Disk IO should be as fast as you can afford • Extra CPUs can help when you have queries that cross multiple partitions, or many concurrent users
Attribute Relationships • Make sure attribute relationships in your dimension are set up to reflect all 1:M relationships in your data • Failure to do so will mean that you will not get maximum benefit from aggregations and partitioning, and the engine itself will be less efficient
Tuning the Storage Engine • The percentage of time spent in the Storage Engine can be worked out by: • Summing the Duration of all Query Subcube events seen in Profiler • Dividing that by the Duration of the Query End event • If you see a Query Subcube event take more than 0.5 seconds it should probably be tuned
Partitioning • You should partition your measure groups to reflect the slices that users are likely to use in their queries • Partitioning improves query performance in two ways: • It reduces overall IO because AS should only scan the partitions containing the data requested • It increases parallelism because AS can scan more than one partition simultaneously
Partitioning • You should be partitioning so that a single partition contains between 2 million rows and 15 million rows of data • Although AS should auto-detect the slice of a partition, it is good practice to set the Slice property manually, even on MOLAP partitions • In Profiler, Progress Report Begin/End events show which partitions are being read
Aggregations • Building aggregations improves query performance by reducing the amount of calculation done at query time • Aggregations start to show an effect on regular Sum measures where partition size is greater than a few million rows • Can have an effect on much smaller partitions where there are complex calculations
Aggregation Design Methodology • Make your dimension design as ‘clean’ as possible – delete or disable unnecessary attributes • Set the AggregationUsage property appropriately on each Cube dimension • Run the Aggregation Design Wizard • First using the ‘I Click Stop’ option • Then, if this creates too many aggregations, using ‘Performance Gain’ of 30%
Aggregation Design Methodology • Don’t bother with Usage-Based Optimisation: it may overwrite existing useful aggregations • If necessary, design aggregations manually using the version of Aggregation Manager built-in to BIDS Helper • Check aggregations are being used by looking for the Get Data From Aggregation event in Profiler
The Formula Engine • If Storage Engine requests make up a negligible percentage of the time taken by the query, then the Formula Engine is the problem • Another giveaway sign is that when a query runs on a multi-processor server, only one CPU shows activity • The Formula Engine only uses one thread per query • MDX Script Performance Analyser can help to work out which, if any, calculation is the problem
Using Named Sets • Use named sets instead of repeatedly evaluating the same set expression • Named sets can be defined • On the cube and in the session, but since they are static (until AS2008) they are of limited usefulness • In the WITH clause, which is useful only if you have control over the MDX you’re executing • Inline, which can be very powerful but also a bit buggy
Non Empty Filtering • Removing empty values before a set operation, such as a Sum(), is often important • This relies on the fact that in many cases, AS can remove empty values very quickly • Do not use NonEmptyCrossjoin() any more, use the NonEmpty() function instead: it is more reliable, predictable, and just as fast
Non_Empty_Behavior • Setting the Non_Empty_Behavior property on calculated members can provide a massive performance increase • However, knowing how to set it correctly can be very difficult:“I feel that it is the best not to mention NON_EMPTY_BEHAVIOR at all - unless you can explain what it does - and (no offence) I don't think you will be able to do it.”, Mosha
Non_Empty_Behavior • Non_Empty_Behavior has improves query performance in two ways: • When filtering out empty tuples from a set using NON EMPTY or NonEmpty() • When working out how to request the data a calculation needs from the Storage Engine, even when there is no filtering taking place • Increasingly, the Formula Engine can ‘guess’ what it should be set to though
Non_Empty_Behavior • In general, the rule for setting it is if you can say: <subcube> = iif(IsEmpty(<empty expression>) , null, <else expression>); Then you can say: Non_Empty_Behavior(<subcube>)= <empty expression>;
Appropriate Calculation Scope • Use scoped assignments to determine when and where calculations execute, rather than conditional logic • For example, a year-to-date calculation will never be relevant at the year level • Do not use IIF with Hierarchy.Currentmember.Level – scope on the level itself
Aggregation-Aware Calculations • If you know where you have built your aggregations, it can make sense to write your calculations appropriately • For example, if you have an aggregation at the year level, the calculation: Q1 + Q2 + Q3 might be better expressed as: Year – Q4
MDX to Avoid • ORDER – there’s a bug which makes it perform badly! Use TOPCOUNT instead. • LOOKUPCUBE – redesign your cube instead • CASE – it isn’t optimised, use IIF instead • Non-deterministic functions like USERNAME, STRTOX – they stop AS being able to cache calculated values • Late-binding functions like LINKMEMBER – they stop AS being able to generate efficient query plans
Warming the Cache • Analysis Services can cache Storage Engine data in most circumstances • It can also cache calculation results in many cases too • As a result, most queries execute much faster (if not immediately) on a warm cache • So why not run a sample of common queries every night after cube processing?
How to Warm the Cache • The Create Cache MDX statement (available in AS2K and AS2005 post SP2) is useful for loading the Storage Engine cache • Several tools can be used to automate running of queries: • Ascmd, available in the SQL Server samples • SSIS, using the Execute SQL task
Resources • “Analysis Services 2005 Performance Guide” white paper • “Identifying and Resolving MDX Query Performance Bottlenecks” white paper • Mosha’sblog: http://sqlblog.com/blogs/mosha • My blog:http://cwebbbi.spaces.live.com/
Platinum www.SQLBits.com[Conference Web site] www.sqlbits.com www.SQLBlogCasts.com[Becoming the premier Blogging site for SQL professionals] Gold www.SQLServerFAQ.com[UK SQL Server Community Website] Silver UK SQL Bloggers cwebbbi.spaces.live.comsqlblogcasts.com/blogs/simonssqlblogcasts.com/blogs/tonyrogerson Remember :: Feedback Forms!!