410 likes | 584 Views
Implementing Common Business Calculations in DAX. Chris Webb. chris@crossjoin.co.uk www.crossjoin.co.uk. Who am I?. Chris Webb ( chris@crossjoin.co.uk ) Independent consultant specialising in Analysis Services and MDX (and PowerPivot and DAX!): http://www.crossjoin.co.uk
E N D
Implementing Common Business Calculations in DAX Chris Webb chris@crossjoin.co.uk www.crossjoin.co.uk
Who am I? • Chris Webb (chris@crossjoin.co.uk) • Independent consultant specialising in Analysis Services and MDX (and PowerPivot and DAX!): http://www.crossjoin.co.uk • Blogger: http://cwebbbi.spaces.live.com • Author: • MDX Solutions 2nd Edition • Expert Cube Development With Analysis Services 2008
Agenda • What is DAX and why should I learn it? • Calculated columns and calculated measures • Row context and filter context • Calculate() • Values() • Demos, demos, demos!
What is DAX? • DAX is the new calculation language for PowerPivot • MDX is still the query language • It is a multidimensional calculation language like MDX, but the design goals were: • Make it easy to do common calculations (easier than MDX) • Make it easy to use for Excel power users, hence the Excel-based syntax • DAX expressions are limited to a single (though often very long) line of code
Why learn DAX? • Your power users will be learning it, and you’ll need to understand what’s going on in their PowerPivot models • It’s very likely that it will be used in future versions of SSAS – possibly as an alternative to MDX for some calculations • We love both DAX and MDX, and MDX is not going away. The challenge for us is to bring them together. We have some promising directions. That said – I think you’ll see DAX evolving in new important directions that MDX will never cover, and a large and growing portion of the calc work will be done in DAX. So my advice stands: all you guys need to become DAX gurus ASAP.Amir Netz, Microsoft
DAX Syntax and Functions • DAX syntax is based on Excel syntax • Only thing to watch out for is you need to use && and || for AND and OR • Supports around 80 Excel functions • Uses its own data types • In Excel you only have numbers or strings • Uses the concept of BLANK rather than NULL • PowerPivot BLANKs behave in exactly the same way as SSAS nulls • The UI for editing DAX is terrible – can use NotePad++ and Colin Banfield’s DAX language template instead
Referencing Columns and Measures • DAX usually requires fully qualified names, eg: • MyTable[MyColumn] • MyTable[MyMeasure] • You can use unqualified names in calculated column definitions, when referring to other columns in the same table • DAX cannot use standard Excel cell references to data elsewhere in the workbook.
Calculated Columns • DAX can be used to create two types of calculation: calculated columns and calculated measures • Calculated columns are derived columns in a PowerPivot table • Are defined inside the PowerPivot UI • After they are created, they behave like any other column • Their values are calculated immediately after the real data has been loaded into PowerPivot • Can be used to do basic ETL work, eg concatenating first and last names, deriving years from dates etc
Calculated Measures • Calculated measures provide the numeric values you see aggregated in pivot tables • All measures are calculated measures in PowerPivot! • Are defined within the pivot table in Excel • Are calculated at query time • Basic sum, count, min, max and average calculated measures can be created from the right-click menu • More advanced calculated measures can be created by entering your own DAX expression
Row Context and Filter Context • Row context refers to the current row where a calculation is taking place • There is no row context without a DAX aggregation function • Filter context refers to the currently selected item on each column on each table • Very similar to the CurrentMember function in MDX, but handles multiselect elegantly • Filter context follows the one-to-many relationships between tables • Both row and filter context must be taken into account when writing expressions for calculated measures
Calculate() • The Calculate() function is the key to all advanced DAX calculation functionality • Signature: Calculate(Expression, SetFilter1, SetFilter2,...) • It allows any expression to be evaluated in a specific filter context • It works as follows: • Modifies the current filter context according to the SetFilter arguments you pass in • Shifts the row context onto the filter context • Evaluates your expression in the new filter context
Calculate() • SetFilter arguments can either be: • Boolean expressions, egCALCULATE(COUNTROWS(), Consultants[Skill]="MDX") • Table functions, so the filter context for any table is set to the rows returned by the table function, egCALCULATE(COUNTROWS(), FILTER(Consultants, Consultants[Measure1]>3)) • Useful table functions include: • Values(), which returns a list of distinct values in a column in the current filter context • All(), which returns a list of all the values in a column ignoring filter context
Ratio to All and Ratio to Parent • A common example of a ‘ratio to all’ and ‘ratio to parent’ calculations is a market shares • The numerator for this kind of calculation is easy:SUM('Consultants'[Measure1]) • The denominator is the ‘all’ or ‘parent’ total • To get the total value of a measure with all filter context cleared we need to use All(Table), eg:CALCULATE(SUM('Consultants'[Measure1]), ALL(Consultants)) • To get the total value of a measure with filter context for just one column removed, we need All(Column), eg:CALCULATE(SUM('Consultants'[Measure1]), ALL(Consultants[Consultant]))
Ratio to All and Ratio to Parent • This makes the final calculations: =SUM('Consultants'[Measure1])/CALCULATE(SUM('Consultants'[Measure1]), ALL(Consultants)) and =SUM(Consultants[Measure1])/CALCULATE(SUM('Consultants'[Measure1]), ALL(Consultants[Consultant]))
Previous Year Growth • Basic algorithm to calculate year-on-year growth is: ((Sales for Current Time Period) - (Sales for Same Time Period in Previous Year)) / (Sales for Same Time Period in Previous Year) • The big problem is how to calculate sales for the same time period in the previous year in the absence of SSAS-like hierarchies?
Previous Year Growth • The following is the best approach: • =CALCULATE(SUM(FactInternetSales[SalesAmount]), DATEADD(DimDate[FullDateAlternateKey], -1, YEAR), ALL(DimDate)) • Works by: • Finding all the dates in the current year • Shifting each date back one year • Setting this as the filter context • Summing SalesAmount for these dates • BUT will only give correct results if you have a complete set of dates in your year!
Previous Year Growth • Always a good idea in DAX to break complex calculations up into a series of simpler calculated measures • So, if we use the previous formula to define a calculated measure called Previous Year Sales, the growth calculation becomes: • =IF(FactInternetSales[Previous Year Sales]=0, BLANK() • ,(SUM(FactInternetSales[SalesAmount]) - FactInternetSales[Previous Year Sales]) / FactInternetSales[Previous Year Sales]) • Notice, like in MDX, we need to trap division by zero
Rules for Time Intelligence functions • Five rules for using Time Intelligence functions: • Never use the datetime column from the fact table in time functions. • Always create a separate Time table, and make sure it contains complete years • Create relationships between fact tables and the Time table. • Make sure that relationships are based on a datetime column (and NOT based on another artificial key column). • The datetime column in the Time table should be at day granularity (without fractions of a day).
If you don’t follow the rules... • As we’ve already seen, not having complete years makes it hard to do relative time calculations • In my example, since I joined on a surrogate key and not a DateTime key, I needed to add All(DimDate) to my calculation • You can’t use RELATED() to bring the date down to the fact table either, because this causes a circular reference error! • Need to either alter the underlying relational table or view, or import the date dimension table twice
Year to Date • Luckily there are many built-in time intelligence functions for common calculations • Eg for doing year-to-date sums, we have TotalYTD: • =TOTALYTD(SUM(FactInternetSales[SalesAmount]),DimDate[FullDateAlternateKey], ALL(DimDate)) • In this case, TotalYTD is a variant of Calculate with some filters set automatically
Total to Date • A total-to-date gets the running total from the first date we have in our data • No built-in function for it • We need to use Calculate and set the filter to all dates from the first ever date to the last date in the current context • We can use DATESBETWEEN to get this date range • The first date can be got by using BLANK() • The last date with the LASTDATE() function
Total to Date • The final version is: =CALCULATE( SUM(FactInternetSales[SalesAmount]) , DATESBETWEEN('DimDate'[FullDateAlternateKey] , BLANK() , LASTDATE( 'DimDate'[FullDateAlternateKey])) , All('DimDate'))
Values() – like CurrentMember but better! • The Values() function acts like the MDX CurrentMember function • But it is better: Values() returns a table, so it handles multiselect • Although when the table it returns only contains one row we can still do a direct comparison with another value • The Distinct() function works exactly the same way as Values() but: • Values() will return the Unknown Member • Distinct() will not
Distinct Count • To find a distinct count, we need to count the rows in a table containing only the distinct values from a column • The Values() function returns such a table • Because it’s a table, all we need to do is count the number of rows in it: • =COUNTROWS(VALUES(Consultants[Skill]))
Final Thoughts • DAX does the easy stuff very easily • DAX does the medium-hard stuff well too • Not sure if power users will get it, but • It’s easier than MDX • It’s more SQL-like, so more developers will get it • It’s more elegant than MDX in many ways, egmultiselect • DAX does not do the really hard stuff well at all • So for financial apps, for example, MDX still wins • But for how long? • I like DAX a lot
Links • http://blogs.msdn.com/powerpivot/ • http://www.powerpivot.com/ • http://powerpivotgeek.com • http://powerpivottwins.com/ • http://powerpivotpro.com/ • http://www.powerpivot-info.com/ • http://cwebbbi.spaces.live.com • http://social.msdn.microsoft.com/Forums/en-US/sqlkjpowerpivotforexcel/threads • http://www.business-intelligence.kdejonge.net/ • http://sqlblog.com/blogs/marco_russo/
P/X001 How to Get Full Access to a Database Backup in 3 Minutes or Less Idera P/L001 End-to-end database development has arrived Red Gate P/L002 Weird, Deformed, and Grotesque –Horrors Stories from the World of IT Quest P/L005 Expert Query Analysis with SQL Sentry SQLSentry P/T007 Attunity Change Data Capture for SSIS Attunity Coming up… • #SQLBITS