Best practices in Snowflake for Power BI

Improve performance of your reports

To optimize Snowflake storage for efficient Power BI data extraction, focus on structuring and managing your Snowflake tables in a way that aligns with Power BI’s querying patterns, minimizes computing costs, and ensures fast performance.

Use Snowflake Connector 2.0

  • Why:
    1. The Snowflake Connector 2.0 (based on the ADBC driver) offers significant performance improvements over the legacy ODBC driver, including faster data retrieval and reduced metadata calls.
  • How:
    1. Ensure your Power BI Desktop is updated to the latest version (post-July 2025 GA).
    2. Verify the connection uses Implementation 2.0 in the advanced settings if not selected by default.

Leverage Fabric Mirroring

  • Why:
    1. Mirroring Snowflake in Microsoft Fabric allows you to replicate data into OneLake in near-real-time without building ETL pipelines. This provides Direct Lake performance (faster than DirectQuery) without the latency.
  • How:
    1. Configure Mirroring in your Fabric workspace to replicate Snowflake tables.
    2. Connect Power BI to the mirrored OneLake data using Direct Lake mode for blazing fast performance.

Using a Star Schema Design

  • Why:
    1. Power BI performs best with dimensional models like star schemas, which separate fact tables (containing transactional data) and dimension tables (containing descriptive data). This reduces query complexity and improves performance.
  • How:
    1. Store large, frequently updated transactional data in a central fact table.
    2. Keep smaller, less volatile reference data (e.g., categories, dates, or customer details) in dimension tables.
    3. Ensure relationships are well-defined in Snowflake (even though Power BI will handle joins in its model).

Leverage Clustering Keys

  • Why:
    1. Clustering organizes data in Snowflake’s micro-partitions based on specific columns, reducing the amount of data scanned during Power BI queries (partition pruning).
  • How:
    1. Identify columns commonly used in Power BI filters or joins (e.g., `date`, `customer_id`, or `product_id`).
    2. Set a clustering key on these columns using `ALTER TABLE CLUSTER BY (column1, column2)`.

Use Dynamic Tables & Materialized Views

  • Why:
    1. Pre-aggregating data reduces the compute load on Snowflake during report rendering. Dynamic Tables simplify the data engineering pipeline for continuous transformations.
  • How:
    1. Use Materialized Views for simple aggregations (e.g., SUM(sales)) that need instant consistency.
    2. Use Dynamic Tables for complex multi-table joins and transformations that can tolerate slight staleness (defined by target lag).

Implement Query Tagging

  • Why:
    1. To accurately attribute costs to specific Power BI reports or workspaces in Snowflake.
  • How:
    1. Use the Query Tag feature in the Snowflake connector (supported in V2) to pass context (e.g., Report Name, Workspace ID) with every query.

Align Storage Mode with Power BI Usage

  • Why:
    1. Power BI supports Import, DirectQuery, and Composite modes, each with different storage implications in Snowflake.
  • How:
    1. Import Mode: Best for performance. Pre-aggregate data in Snowflake to reduce the dataset size.
    2. DirectQuery Mode: Use for real-time requirements. Optimize with clustering and materialized views.
    3. Composite Mode: Store large fact tables in DirectQuery and dimensions in Import.

Implement Incremental Data Loading

  • Why:
    1. Power BI’s incremental refresh feature works best when Snowflake tables are structured to support efficient updates.
  • How:
    1. Add a last_updated timestamp column to track changes.
    2. Ensure this column is part of the clustering key to speed up the range queries generated by Power BI.

References

Best practices for creating enterprise-wide knowledge bots

Improve your enterprise-wide knowledge bot's performance, security, and design with our best practices

Read More →