The Extract, Transform, and Load process, whether performed as ETL or ELT, plays a key role in data integration and helps businesses blend data from multiple sources. Matillion is a tool designed specifically for data integration and orchestration into cloud data warehouses such as Snowflake, Redshift, or Google BigQuery.
With its advanced features and large number of data source connectors, the tool has gained wide popularity in the data community, especially among Snowflake users.
This article covers Matillion for Snowflake billing and some best practices for keeping costs in check. As of 11th November 2021, subscription and billing are managed through Matillion Hub. Here are 6 important factors that impact your Matillion bill.
Matillion follows a credit-based billing model, like Snowflake, and the cost per credit is determined by the Matillion edition. Initially, the instance comes with a 14-day trial that includes Enterprise edition features. During the trial or once it expires, the user can opt for any of the 3 Matillion editions below:
Based on the project requirements and budget, the developer can recommend the right edition to the client. Cost per credit can be further reduced by choosing the right payment option.
The yearly upfront payment provides the maximum discount on the bill. If you have a good understanding of the yearly credits required for your workload and data volume, this is the most cost-effective payment approach and can save up to 12.5% of the total cost.
A vCore (v-core) appears to the Virtual Machine's operating system as a single physical CPU core.
Matillion credits are used to pay for the consumption of Virtual Core hours while running Matillion ETL instances. While setting up the instance, the selection of the right number of v-cores is important as it will influence the cost and performance.
If multiple jobs run in parallel on large volumes of data, a Virtual Machine with multiple cores can be used. Matillion recommends 4 v-cores for the Enterprise edition, which can be scaled between 1 and 4 cores.
Matillion supports manual scaling of v-cores to match data load scenarios. In Azure, the Virtual Machine size can be changed through the VM configuration.
In layman's terms, 1 credit corresponds to 1 hour of uptime on an underlying 1 v-core Virtual Machine. If that VM runs for 10 hours, you will consume 10 credits. An important thing to note is that the credits are multiplied by the number of v-cores on the Virtual Machine. If a machine with 4 v-cores runs for 10 hours, it consumes 40 credits instead of 10. The selection of v-cores must therefore be a trade-off between cost and performance.
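The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb described in the article; the function name `credits_consumed` is hypothetical and not part of any Matillion API.

```python
def credits_consumed(vcores: int, uptime_hours: float) -> float:
    """Estimate Matillion credits as v-cores multiplied by VM uptime hours.

    Assumption: 1 credit = 1 hour of uptime on a 1 v-core VM, as described
    in the article. Actual metering rules may differ.
    """
    if vcores < 1:
        raise ValueError("vcores must be at least 1")
    if uptime_hours < 0:
        raise ValueError("uptime_hours cannot be negative")
    return vcores * uptime_hours


# The two cases from the text: a 1 v-core VM and a 4 v-core VM, 10 hours each.
print(credits_consumed(1, 10))  # 10
print(credits_consumed(4, 10))  # 40
```

Running the same estimate for a few candidate VM sizes can help when weighing the cost of extra v-cores against the shorter run time they may deliver.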
The developer needs to shrink the Matillion pipeline run window so that all pipelines run within the given window, including some buffer time. The VM on/off process can then be automated using Azure services.
In Azure, Virtual Machine on/off can be automated by using Logic Apps or VM jobs. This starts the Matillion instance only when a pipeline is scheduled and shuts it down when it is idle.
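As a rough sketch of the decision logic behind such automation, the Python below checks whether the VM should be running at a given time and estimates the daily credits for a run window. It does not call any Azure service; the actual start and stop actions would be issued by the Logic App or job. The function names, the 15-minute buffer, and the 02:00 to 05:30 window are all assumptions made for illustration, and the sketch assumes the window (including buffer) falls within a single calendar day.

```python
from datetime import datetime, time, timedelta


def should_vm_run(now: datetime, window_start: time, window_end: time,
                  buffer_minutes: int = 15) -> bool:
    """Return True if `now` is inside the run window, padded by the buffer."""
    buffer = timedelta(minutes=buffer_minutes)
    start = datetime.combine(now.date(), window_start) - buffer
    end = datetime.combine(now.date(), window_end) + buffer
    return start <= now <= end


def window_hours(window_start: time, window_end: time,
                 buffer_minutes: int = 15) -> float:
    """Length of the run window in hours, with the buffer on both sides."""
    day = datetime(2000, 1, 1).date()  # arbitrary date, used only for subtraction
    span = datetime.combine(day, window_end) - datetime.combine(day, window_start)
    padded = span + timedelta(minutes=2 * buffer_minutes)
    return padded.total_seconds() / 3600


# Hypothetical nightly window of 02:00 to 05:30 on a 4 v-core VM.
start, end = time(2, 0), time(5, 30)
print(should_vm_run(datetime(2024, 1, 1, 3, 0), start, end))   # True
print(should_vm_run(datetime(2024, 1, 1, 12, 0), start, end))  # False
print(4 * window_hours(start, end))  # 16.0 credits/day, versus 96 if always on
```

Under these assumed numbers, restricting uptime to the run window cuts daily consumption from 96 credits to 16, which shows why shrinking the window matters.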
These are some other important aspects to keep in mind before you purchase your Matillion credits.
All Matillion editions allow 5 active users/developers with write access each month. Apart from these 5 developers, you can also have as many read-only users as you need at no additional cost.
Additional 'Active users' are available, starting at 50 credits/month. They are charged only if the user is active within the month. To add or remove a user, the admin user can manage access from Admin -> User configurations in the top-right corner of the screen.
Following the Project Development Lifecycle, development in the Matillion development instance is followed by UAT, Deployment, and PVT. Once the PVT is complete, one way to reduce Matillion usage is to stop the data refresh schedules in Development and keep them running only in Production.
The development Matillion instance can be started on an on-demand basis when further development work is required. An important thing to consider is that the Snowflake development environment will no longer hold the latest data and requires refreshing from the Production data at regular intervals.
The developer needs to come up with an automated approach for this refresh. The Snowflake database clone option cannot be leveraged in this scenario, as it would overwrite any development work in that database.
6. Matillion bill savings based on billing frequency
A user can choose any of the payment options below, each of which provides a different discount percentage.
Finally, it is important you understand your Matillion bill.
If the total credit consumption stays within the annually purchased credits, whether paid monthly or yearly upfront, each credit is charged at the discounted rate. If consumption exceeds the purchased credits, the excess credits are charged at the undiscounted cost per credit.
For example, assume that you have purchased 5000 annual credits for the Enterprise edition and your total consumption is 6000 credits. You will then be charged $2.19/credit for the first 5000 credits and $2.30/credit for the remaining 1000.
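The example above can be worked through with a short calculation. The sketch below uses the rates and credit counts quoted in the article; the function name `annual_bill` is hypothetical, and it bills only consumed credits, since the article does not say how unused purchased credits are treated.

```python
from decimal import Decimal


def annual_bill(purchased: int, consumed: int,
                discounted_rate: Decimal, list_rate: Decimal) -> Decimal:
    """Charge credits up to the purchased amount at the discounted rate
    and any excess at the undiscounted (list) rate."""
    within = min(consumed, purchased)
    overage = max(consumed - purchased, 0)
    return within * discounted_rate + overage * list_rate


# 5000 purchased, 6000 consumed, $2.19 discounted and $2.30 list rate.
total = annual_bill(5000, 6000, Decimal("2.19"), Decimal("2.30"))
print(total)  # 13250.00
```

`Decimal` is used instead of floats so that currency amounts are not affected by binary rounding. The total breaks down as $10,950 for the first 5000 credits plus $2,300 for the 1000 credits of overage.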
Matillion offers high performance at a reasonable cost, provided you first understand and tune your cost parameters. It gives the user flexibility and control over the cost. It is highly recommended to understand the points above during the Analysis & Design phase of the project and plan the implementation accordingly.