DAX Best Practice Guide
This guide enables you to speed up your Power BI reports by optimizing their back-end code. As the 2021 Microsoft Power BI Partner of the Year, we are recognized for our expertise in implementing business intelligence and analytics solutions.
We have compiled these best practices based on our experience, including:
Before you start:
Tip #1: Always clear your DAX cache before optimizing DAX
Your DAX cache builds up from internal VertiPaq queries. You can clear your cache from within DAX Studio. Resetting your cache enables you to effectively measure performance gains.
Improving DAX Syntax
- Use DAX Formatter to format your code
- Use the DISTINCT() and VALUES() functions consistently
- DISTINCT(): Does not return any blanks added due to an integrity violation. The DISTINCT() function includes a blank only if it is part of the original data.
- VALUES(): Returns both blanks from the original data and blanks that Power BI added due to referential integrity violations.
- Add column and measure references in your DAX expressions
Formatted code is easier to read and maintain. DAX Formatteris a free tool that enables you to transform raw DAX into readable code.
Power BI adds a blank value to a column if it finds a referential integrity violation. When making direct queries, Power BI adds a blank value to columns because it cannot check for violations.
The DISTINCT() and VALUES() functions are different:
Be consistent in your use of the DISTINCT() and VALUES() functions throughout the entire report. Otherwise, you’ll have inconsistent values for blank columns. We recommend using the VALUES() function if blank values are not an issue.
To ensure your DAX can be understood and used by anyone, you need to eliminate ambiguity. By adding column and measure references, you ensure that anyone can easily read your DAX at a glance. We recommend always using fully qualified column references and never using fully qualified measure references. That way, you can quickly differentiate between a column or a measure based on whether it’s fully qualified.
Adding column and measure references also ensures that expressions work when a measure home table is changed.
With col reference:
Profit = Orders[Sales] - Orders[Cost]
Without col reference:
Profit = [Sales] - [Cost]
Optimizing DAX Functions
- Use ISBLANK() instead of =Blank() check
- Use = 0 instead of checking for ISBLANK() || = 0
- Use SELECTEDVALUE() instead of HASONEVALUE()
- Use SELECTEDVALUE() instead of VALUES()
- Use variables instead of repeating measures inside the IF branch
- Use DIVIDE() instead of /
- Use KEEPFILTERS() instead of FILTER(T)
- Use FILTER(all(ColumnName)) instead of FILTER(values()) or FILTER(T)
- Use COUNTROWS instead of COUNT
- It's more efficient, and will perform better
- It doesn't consider BLANKs
- The formula intention is clearer and self-descriptive
- Use SEARCH() with the last parameter
- ALL vs. ALLExcept
Use the built-in function ISBLANK() to check for any blank values instead of using the comparison operator = Blank(). While = Blank() returns ‘True’ value for either blank values or empty strings, IsBlank exclusively checks for blanks.
The BLANK value in Power BI is associated with the base value of a column’s data type. The BLANK value corresponds to “0” for integers, “(empty string)” for string columns, and “1–1–1900” for date fields.
ISBLANK() || = 0 enacts two checks: first it checks if a column is BLANK, then it checks for zeroes. = 0 performs both checks at once, improving calculation speed.
To check exclusively for zero, use the IN operator.
It is common practice to use HASONEVALUE() to check if there is only one value present in a column after applying slicers and filters. However, when you do this, you also have to use the VALUES(ColumnName) DAX function to retrieve that single value.
SELECTEDVALUE() performs the above steps internally. It automatically retrieves the single value if there is one and returns a blank if there are multiple values available.
The VALUES() function returns an error if it encounters multiple values. Often, users address the error using Error functions, which negatively affects performance.
Instead of using VALUES(), use SELECTEDVALUE(). The SELECTEDVALUE() function returns a blank if it encounters multiple values (instead of an error).
Ratio = IF([Total Rows] > 10, SUM(Revenue) /[Total Rows], 0)
Here, measures are calculated continuously, meaning the [Total Rows] expression is calculated twice: first for the condition check, then for the true condition expression.
VAR totalRows = [Total Rows]; Ratio = IF(totalRows > 10, SUM(Revenue) / totalRows,0)
Instead of calculating the same expression multiple times, you can store the resulting measure value in a variable. You can use a variable reference wherever required. The same variable process applies to all instances where you call the same measure. Variables can help you avoid repetitive functions.
Note: Be aware that variables are actually constants.
/ raises an exception if the denominator is zero. The DIVIDE() function internally performs a check to validate whether the denominator is zero. If it is, it returns the value specified in a third (extra) parameter.
For “invalid denominator” cases, you need to use the IF condition when using the “/” operator. The DIVIDE() function performs IF checks internally.
Note: If you are certain the denominator value is not zero, it is better to use the “/” operator without an IF check.
The FILTER function overrides any existing set of filters on a column applied via slicers. The KEEPFILTER function does not override the existing set of filters. Instead, it uses the intersection of values present in both, thus maintaining the current context. Use it when you want to maintain any filters applied by slicers or at a report level while performing calculations.
To calculate measures independent of any filters applied to a column, combine the All(ColumnName) function with the FILTER function instead of using Table or VALUE(). For example:
CALCULATE([Total Sales], FILTER(ALL(Products[Color]), Color = ‘Red’))
Use ALL along with the FILTER function if there is no need to keep the current context. Directly applying filters using expressions instead of the FILTER function behaves in the same way as mentioned above. This method internally translates using the ALL function in the filter. For example:
CALCULATE([Total Sales], FILTER(ALL(Products[Color]), Color = ‘Red’))
It is always better to apply filters to the desired column than to the whole table, as this can easily scale.
In Power BI, you can either use the COUNT function to count column values, or the COUNTROWS function to count table rows. Both functions achieve the same result if the counted column contains no BLANKs.
COUNTROWS is usually the better option for three reasons:
Sales Orders = COUNT(Sales [OrderDate])
Sales Orders = COUNTROWS(Sales)
The SEARCH() DAX function accepts the last parameter as the value that the query must return if the search string is not found. You should always use the SEARCH() function instead of using Error functions along with SEARCH().
ALLEXCEPT() behaves exactly like ALL(), VALUES() as long as the “exempted” columns are columns on the pivot. ALLEXCEPT() does NOT preserve pivot context on columns that are not on the pivot. Use ALL() instead of ALLEXCEPT() when using VALUES()
Common Mistakes To Avoid
- Do not change BLANK values to zeros or other strings
- Use (a-b)/b along with variables instead of a/b — 1 or a/b*100–100
- Stop using IFERROR() and ISERROR()
- Do not use scalar variables in SUMMARIZE()
- Avoid using the AddColumns() function inside measure expressions
- Check if you can convert your column to a Boolean column
- Avoid filtering on string columns
- Work upstream, if possible
It is common practice to replace blanks with zeros or other strings. However, Power BI automatically filters all rows with blank values. When viewing results from tables with large amounts of data, this limits the result set and improves performance.
If you replace blanks, Power BI does not filter the unwanted rows, negatively affecting performance.
It is common practice to use a/b — 1 to calculate a ratio and avoid duplicate measure calculations. However, you can achieve the same performance by using variables and using (a-b)/b to calculate the ratio.
If both a and b are blank values, then (a-b)/b returns a blank value and Power BI will filter the values out. a/b — 1 would return -1 as the result because both a and b are integers.
The IFERROR() and ISERROR() functions were widely used in Excel when applying the FIND() and SEARCH() functions. They were necessary because FIND() and SEARCH() returned errors if the query did not obtain the required result.
The IFERROR() and ISERROR() functions force the Power BI engine to perform a step-by-step execution of each row to check for errors. There is currently no method to directly state which row returned the error.
The FIND() and SEARCH() DAX functions provide an extra parameter that the query can pass. The parameter is returned if the search string is not present. The FIND() and SEARCH() DAX functions check if more than one value is returned. They also ensure nothing is divided by zero.
You can avoid using the FIND() and SEARCH() DAX functions altogether by using situationally appropriate DAX functions such as DIVIDE() and SELECTEDVALUE(). The DIVIDE() and SELECTEDVALUE() functions perform error check internally and return the expected results.
Remember: You can always use DAX expressions in such a way that they never return an error.
The SUMMARIZE() function is traditionally used to group columns and return resulting aggregations. However, the SUMMARIZECOLUMNS() function is newer and more optimized. Use that instead.
Only use SUMMARIZE() for grouped elements of a table that don’t have any associated measures or aggregations. For example:
SUMMARIZE(Table, Column1, Column2)
Measures are calculated iteratively by default. If measure definitions use iterative functions such as AddColumns(), Power BI creates nested iterations, which negatively affect report performance.
If there are only two distinct values in a column, check if the column can be converted to use a Boolean data type (true/false). Boolean data types speed up processing when you have a large number of rows.
Instead, use ID columns for filtering. For example, if you need to filter by sales location, assign each location a numeric ID. This means you filter by integer columns rather than string columns. Now, you can take advantage of the VertiPaq engine, which uses value encoding to reduce the memory of a column.
Note: Value encoding only works on integers.
If certain calculations require complex DAX formulae, or if multiple filters are applied repeatedly in DAX measures and calculations, consider creating calculated columns or flags in the back end.
- Data Analysis Expressions (DAX) Reference – Microsoft Corporation, updated regularly
- DAX function reference – Microsoft Corporation, published July 28, 2021
- Optimization guide for Power BI – Microsoft Corporation, published April 2, 2021
Azure Databricks Best Practices
Learn More →