Best practices in Snowflake for Power BI
Improve performance of your reports
To optimize Snowflake storage for efficient Power BI data extraction, focus on structuring and managing your Snowflake tables in a way that aligns with Power BI’s querying patterns, minimizes compute costs, and ensures fast performance. Below are some suggestions.
- Use a Star Schema Design
- Leverage Clustering Keys
- Partition Large Tables
- Optimize Storage with Compression
- Use Materialized Views for Pre-Aggregated Data
- Enable Search Optimization for High-Cardinality Lookups
- Align Storage Mode with Power BI Usage
- Implement Incremental Data Loading
- Minimize Data Redundancy
- Monitor and Tune with Snowflake Features
Use a Star Schema Design
- Why:
- Power BI performs best with dimensional models like star schemas, which separate fact tables (containing transactional data) and dimension tables (containing descriptive data). This reduces query complexity and improves performance.
- How:
- Store large, frequently updated transactional data in a central fact table.
- Keep smaller, less volatile reference data (e.g., categories, dates, or customer details) in dimension tables.
- Ensure relationships are well-defined in Snowflake (even though Power BI will handle joins in its model).
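A minimal star schema might look like the following sketch. All table and column names here are illustrative, not from the article; note that Snowflake records but does not enforce primary/foreign key constraints, so they serve mainly as documentation for the model.

```sql
-- Hypothetical dimension tables: small, descriptive, slowly changing.
CREATE TABLE dim_customer (
    customer_id   NUMBER PRIMARY KEY,
    customer_name VARCHAR(100),
    segment       VARCHAR(20)
);

CREATE TABLE dim_date (
    date_key DATE PRIMARY KEY,
    year     NUMBER(4),
    month    NUMBER(2)
);

-- Hypothetical central fact table: large, transactional, keyed to dimensions.
CREATE TABLE fact_sales (
    sale_id     NUMBER,
    date_key    DATE   REFERENCES dim_date (date_key),
    customer_id NUMBER REFERENCES dim_customer (customer_id),
    quantity    NUMBER,
    sales       NUMBER(12, 2)
);
```

Power BI can then model one-to-many relationships from each dimension table to the fact table.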
Leverage Clustering Keys
- Why:
- Clustering organizes data in Snowflake’s micro-partitions based on specific columns, reducing the amount of data scanned during Power BI queries.
- How:
- Identify columns commonly used in Power BI filters or joins (e.g., `date`, `customer_id`, or `product_id`).
- Set a clustering key on these columns using `ALTER TABLE your_table CLUSTER BY (column1, column2);`
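As a concrete sketch (table and column names are hypothetical), you can set the clustering key and then check how well the table is clustered using Snowflake's built-in `SYSTEM$CLUSTERING_INFORMATION` function:

```sql
-- Cluster a hypothetical fact table on the columns Power BI filters by most.
ALTER TABLE fact_sales CLUSTER BY (transaction_date, customer_id);

-- Inspect clustering depth and overlap for those columns; a shallow average
-- depth means queries filtered on these columns prune partitions well.
SELECT SYSTEM$CLUSTERING_INFORMATION('fact_sales', '(transaction_date, customer_id)');
```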
Partition Large Tables
- Why:
- Partitioning logically organizes data, making it faster to retrieve specific subsets in Power BI, especially with DirectQuery or incremental refresh.
- How:
- Use a date-based column (e.g., `transaction_date`) to partition data naturally in Snowflake.
- Combine this with clustering to align physical storage with query patterns (e.g., partition by year or month if Power BI reports focus on time-based analysis).
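Since Snowflake manages micro-partitions automatically, the practical lever here is a date-based clustering key. Snowflake allows expressions in clustering keys, so a monthly grain can be expressed directly (table and column names are illustrative):

```sql
-- Cluster the hypothetical fact table by month so that time-filtered
-- Power BI queries prune micro-partitions effectively.
ALTER TABLE fact_sales CLUSTER BY (DATE_TRUNC('MONTH', transaction_date));
```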
Optimize Storage with Compression
- Why:
- Snowflake automatically compresses data, but you can enhance this by structuring tables to maximize compression efficiency, reducing storage costs and speeding up data retrieval.
- How:
- Store columns with low cardinality (e.g., status flags, categories) as smaller data types (e.g., `VARCHAR(10)` instead of `VARCHAR(255)`).
- Avoid over-normalization; keep related data together in wider tables where it makes sense for Power BI’s needs.
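The point about tight data types can be sketched as follows (a hypothetical table, not from the article); declared sizes matter less for Snowflake's compressed storage than for downstream tools like Power BI, which may size buffers based on the declared type:

```sql
-- Illustrative: declare low-cardinality columns with tight types so scans
-- and Power BI imports move less data.
CREATE TABLE fact_orders (
    order_id    NUMBER,
    status      VARCHAR(10),   -- e.g. 'OPEN', 'SHIPPED' rather than VARCHAR(255)
    region_code CHAR(2),       -- two-letter region codes
    order_total NUMBER(12, 2)
);
```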
Use Materialized Views for Pre-Aggregated Data
- Why:
- Power BI often requires aggregated data for visualizations. Materialized views store pre-computed results, reducing query time and compute usage.
- How:
- Create materialized views for common aggregations (e.g., `SUM(sales)` grouped by `date` and `region`).
- Example: `CREATE MATERIALIZED VIEW sales_summary AS SELECT date, region, SUM(sales) AS total_sales FROM fact_sales GROUP BY date, region;`
- Use these views as the source for Power BI instead of querying raw tables.
Enable Search Optimization for High-Cardinality Lookups
- Why:
- If Power BI queries involve filtering on high-cardinality columns (e.g., unique IDs), search optimization can speed up point lookups.
- How:
- Enable it on specific columns for point lookups: `ALTER TABLE your_table ADD SEARCH OPTIMIZATION ON EQUALITY(column_name);`
- Best for dimension tables where Power BI performs frequent lookups.
Align Storage Mode with Power BI Usage
- Why:
- Power BI supports Import, DirectQuery, and Composite modes, each with different storage implications in Snowflake.
- How:
- Import Mode: Pre-aggregate data in Snowflake to reduce the dataset size Power BI pulls. Use views or tables with only the necessary columns and rows.
- DirectQuery Mode: Keep raw data in Snowflake and optimize with clustering and materialized views to handle live queries efficiently.
- Composite Mode: Store large fact tables in DirectQuery mode and smaller dimension tables in Import mode within Power BI, leveraging Snowflake’s scalability for the former.
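For Import mode, the pre-aggregation advice above can be captured in a dedicated view. This is a sketch with hypothetical names (`powerbi_sales_import`, `fact_sales`), trimming to the columns and time window a report actually uses:

```sql
-- Lean, pre-aggregated view for Power BI Import mode: fewer columns,
-- bounded history, aggregated grain keeps the imported dataset small.
CREATE OR REPLACE VIEW powerbi_sales_import AS
SELECT
    date_key,
    region,
    SUM(sales) AS total_sales
FROM fact_sales
WHERE date_key >= DATEADD(year, -2, CURRENT_DATE)  -- keep two years of history
GROUP BY date_key, region;
```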
Implement Incremental Data Loading
- Why:
- Power BI’s incremental refresh feature works best when Snowflake tables are structured to support efficient updates.
- How:
- Add a `last_updated` timestamp column to track changes.
- Partition data by this column and use it in Power BI’s incremental refresh policy.
- Example: `WHERE last_updated >= RangeStart AND last_updated < RangeEnd`
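On the Snowflake side, the setup might look like this sketch (table name hypothetical). The `last_updated` column should be populated by your load process so that Power BI's `RangeStart`/`RangeEnd` filters fold down into Snowflake queries:

```sql
-- Add a change-tracking column to a hypothetical fact table.
ALTER TABLE fact_sales ADD COLUMN last_updated TIMESTAMP_NTZ;

-- One-time backfill for existing rows; new rows should get the value
-- stamped by the ELT/load process on insert or update.
UPDATE fact_sales
SET last_updated = CURRENT_TIMESTAMP()
WHERE last_updated IS NULL;
```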
Minimize Data Redundancy
- Why:
- Redundant data increases storage costs and can slow down queries if Power BI pulls unnecessary rows.
- How:
- Use views or dynamic tables in Snowflake to transform data on-the-fly instead of duplicating it in separate tables.
- Remove unused columns from tables or create leaner views specifically for Power BI.
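One way to transform on the fly without duplicating storage is a Snowflake dynamic table. This is a sketch; the table name, `TARGET_LAG`, and warehouse are placeholders:

```sql
-- Illustrative dynamic table: Snowflake keeps the result refreshed within
-- the target lag, instead of you maintaining a duplicated physical copy.
CREATE OR REPLACE DYNAMIC TABLE sales_clean
  TARGET_LAG = '1 hour'          -- acceptable staleness for the report
  WAREHOUSE  = reporting_wh      -- hypothetical warehouse name
AS
SELECT order_id, date_key, region, sales
FROM fact_sales
WHERE sales IS NOT NULL;
```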
Monitor and Tune with Snowflake Features
- Why:
- Snowflake’s built-in tools can help you identify and resolve inefficiencies in storage and querying.
- How:
- Use `SHOW TABLES` and `DESCRIBE TABLE` to review storage usage and column definitions.
- Analyze query performance in Snowflake’s Query History to see which Power BI queries scan the most data, then adjust clustering or add materialized views accordingly.
- Set up automatic suspension for warehouses used by Power BI to minimize idle compute costs.
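The monitoring and auto-suspend steps above can be sketched in SQL. This assumes the standard `SNOWFLAKE.ACCOUNT_USAGE` share and a hypothetical warehouse named `POWERBI_WH` (account usage views can lag live activity by up to roughly 45 minutes):

```sql
-- Find the queries on the Power BI warehouse that scan the most
-- micro-partitions; these are the first candidates for clustering
-- keys or materialized views.
SELECT query_text, partitions_scanned, partitions_total, total_elapsed_time
FROM snowflake.account_usage.query_history
WHERE warehouse_name = 'POWERBI_WH'
ORDER BY partitions_scanned DESC
LIMIT 20;

-- Suspend the warehouse after 60 seconds of inactivity to cut idle costs.
ALTER WAREHOUSE powerbi_wh SET AUTO_SUSPEND = 60;
```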

Last updated: April 17, 2025