Snowflake Costs: 3 Hot Tips To Optimise Your Snowflake Usage Costs

Featured in: /
Published on: March 22, 2022
Written by: Roopak TJ

Snowflake architecture is made of three layers: Storage, Compute and Cloud services. The storage layer consists of compressed and optimised cloud storage, query execution is performed in the Compute layer whereas Cloud services handles managed services like authentication and optimisation.

The credit usage is based on the utilisation of these three layers. Understanding your usage pattern will help you with predicting the monthly bill. Storage cost is straight-forward based on the size of data, whereas compute and cloud service costs can be affected by multiple factors.   

Even if you have complete control and understanding of your credit consumption, there may be some unforeseen scenarios which may burn your credits and incur a steep cost. The following 3 points listed in this blog will help you in such situations.

First the simple stuff:

Tip 1. Virtual Warehouse auto-shutdown and resize

There are 3 main factors contributing to credit consumption using a virtual warehouse:

  • Virtual warehouse runtime
  • Virtual warehouse size 
  • Number of VW clusters

Snowflake recommends enabling auto-shutdown while creating a Virtual Warehouse object. This can also be enabled later using the parameter: 

ALTER WAREHOUSE <WAREHOUSE_NAME> SET
AUTO_SUSPEND = <TIME_IN_SECONDS>;

ALTER WAREHOUSE <WAREHOUSE_NAME> SET
AUTO_RESUME = <TRUE | FALSE>;

Size of the Virtual Machine can be resized manually based on workload. This can be done by the command :

ALTER WAREHOUSE <WAREHOUSE_NAME> SET
WAREHOUSE_SIZE= <SIZE>;

Important Tips

  1. Auto-shut down should not be enabled especially for a reporting or analytics warehouse if queries are executed frequently (for example during business hours) because shutting down the warehouse will mean that the cache has been wiped clean.
  2. ETL tools like Matillion for Snowflake also give the functionality to alter the VM size. This can help with reducing the cost when the pipeline is automated and runs in the background.

Tip 2. Resource Monitor

Compute services usually burn credits faster than storage and it's critical to understand how we can monitor and control the credit usage. Snowflake provides an admin level monitor feature called Resource Monitor which helps the admin, set a budget either at account level or warehouse level for a specific time interval. Resource Monitor can be set using both Snowflake GUI or SQL command by ACCOUNTADMIN only.

Using Snowflake User Interface:

Snowflake Resource Monitor
Snowflake - Create Resource Monitor

Using SQL command:

USE ROLE ACCOUNTADMIN;
CREATE RESOURCE MONITOR
MONTHLY_BUDGET
WITH
CREDIT_QUOTA         =  200
FREQUENCY            =  MONTHLY
START_TIMESTAMP      =  IMMEDIATELY
TRIGGERS
ON 50 PERCENT DO NOTIFY
ON 90 PERCENT DO SUSPEND
ON 99 PERCENT SUSPEND_IMMEDIATE;
ALTER ACCOUNT SET RESOURCE MONITOR = MONTHLY_BUDGET

You can use the below parameter values based on your requirement for the above code

Tip 3. Query timeout parameter

Often during analysis on millions of records, it can be possible that the query is not fine-tuned, and an unwanted aggregation or join can make the virtual warehouse run for an unprecedented amount of time. If the Snowflake SQL script was running in the background from another application, you will only find this out when you receive your month end bill.

Snowflake provides a query parameter ‘STATEMENT_TIMEOUT_IN_SECONDS’ which can be used to avoid this situation. By default, the statement timeout in seconds is 172800 which is approximately 2 days. The parameter can be set for an account, user, session, or individual warehouse as well.

For Warehouse:

ALTER WAREHOUSE SET STATEMENT_TIMEOUT_IN_SECONDS = <num>

For Session:

ALTER SESSION SET STATEMENT_TIMEOUT_IN_SECONDS = <num>

If you feel that the statement execution delay is due to the high traffic or overload on the virtual warehouse, you can use virtual warehouse parameter MAX_CONCURRENCY_LEVEL to increase the number of parallel executions. Snowflake recommends not to use this parameter as increased parallel execution can reduce the performance. ‘STATEMENT_QUEUED_TIMEOUT_IN_SECONDS’ can be used to cancel queries that have been queued for long in a warehouse. By default, there is no timeout value specified and the parameter can be set at both Warehouse and Session level.

Example:

ALTER WAREHOUSE SET STATEMENT_QUEUED_TIMEOUT_IN_SECONDS = <num>

Summary

Snowflake provides complete transparency in its cost through its internal objects like INFORMATION_SCHEMA/ACCOUNT_USAGE views like WAREHOUSE_METERING_HISTORY, STORAGE_USAGE etc. These views can break down the credit consumption based on individual query in real time.

Customised views can be created on top of these views and visualised using tools like PowerBI or Tableau which can help you with monitoring costs.

Copyright © Tridant Pty Ltd.

Privacy Policy
chevron-down