Mastering Snowflake
5 min read · May 22, 2024


Thank you for reading my latest article, ‘A simple guide to Cortex ML Functions: Anomaly Detection’.

Here at Medium I regularly write about modern data platforms and technology trends. To read my future articles simply join my network here or click ‘Follow’. Also feel free to connect with me via YouTube.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

In our previous article, we provided a high-level overview of Snowflake Cortex, highlighting its problem-solving capabilities and how it can add value to your organization. This week, we dive deeper into the specifics, focusing on the various functions within the Snowflake Cortex framework, as illustrated in the diagram below.

Snowflake Cortex offers two main types of functions:

  • ML-based functions
  • LLM-based functions

In this newsletter, we’ll concentrate on the ML-based functions, saving the exploration of LLM-based functions for next week.

What are ML Functions?

ML Functions package up ML models for pattern-detection tasks such as forecasting and anomaly detection, so you can use them without specialist skills. You simply call the functions from within Snowsight, using the data stored in Snowflake as the input.

The aim of this approach is to democratize ML for a broader set of data professionals, meaning you don’t need to be an ML developer or dedicated data scientist to generate insights from your data.

What ML functions are available?

There are currently four ML functions available: three are time-series functions and one doesn’t require time-series data. Let’s take a look at these in more detail.

Time Series Functions

These functions leverage ML models which need to be trained on time-series data. The purpose of these models is to evaluate how a target value such as sales varies over time based on historical data.

  • Forecasting predicts future metric values from past trends in time-series data.
  • Anomaly Detection flags metric values that differ from typical expectations.
  • Contribution Explorer helps you find dimensions and values that affect the metric in surprising ways.

Non-Time-Series Function

This is currently the only function that doesn’t require time-series data. The Classification function uses a machine learning model trained to distinguish between different types of entities in your data.

  • Classification sorts rows into two or more classes based on their most predictive features.

Anomaly Detection

Next, we’ll focus on the Anomaly Detection Cortex function in Snowflake. We’ll explore what anomaly detection is, why it’s useful, and delve into the specifics of how Snowflake Cortex can help you identify and address outliers in your time series data. Continue reading to learn how this powerful feature can enhance your data quality and operational insights.

What is Anomaly Detection?

Anomaly detection is the process of identifying outliers in data. These outliers are data points that deviate significantly from the expected range. Spotting these anomalies is crucial because they can heavily impact the accuracy of statistics and machine learning models derived from your data. By identifying and removing these outliers, you can improve the overall quality and reliability of your results.
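To make the idea of "deviating from the expected range" concrete, here is a minimal Python sketch. It is not the model Snowflake Cortex uses; it swaps in a simple median-based rule (the modified z-score) purely for illustration, and the daily credit figures, the function name and the 3.5 threshold are all made-up assumptions.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return one boolean per value: True where it sits far from the median.

    Illustrative rule only (modified z-score); not the Cortex algorithm.
    """
    center = median(values)
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # Every typical value is identical, so anything different stands out.
        return [v != center for v in values]
    # 0.6745 rescales the MAD so the score is comparable to a z-score.
    return [abs(0.6745 * (v - center) / mad) > threshold for v in values]

# Hypothetical daily compute credits; one day is clearly out of line.
daily_credits = [10.2, 9.8, 10.5, 10.1, 42.0, 9.9, 10.3]
flags = flag_outliers(daily_credits)
outliers = [v for v, is_outlier in zip(daily_credits, flags) if is_outlier]
print(outliers)
```

A median-based rule is used here rather than a mean-based one because a single extreme value can drag the mean and standard deviation far enough to hide itself.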

Why is Anomaly Detection Useful?

Detecting anomalies is not just about cleaning data; it’s also about understanding and pinpointing the origin of issues or deviations in processes. For instance, anomaly detection can help you determine when a problem started with your logging pipeline or identify days when your Snowflake compute costs are unexpectedly high. This capability is invaluable for maintaining smooth operations and preemptively addressing potential issues.

Snowflake Cortex and Anomaly Detection

Snowflake Cortex includes a powerful anomaly detection function that allows you to train a model specifically for detecting outliers in your time series data. This function is versatile and works with both single-series and multi-series data. For example, if you have sales data for multiple stores, the model can check each store’s sales independently based on the store identifier.
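The point about checking each store independently can be sketched in a few lines of Python. This is only an illustration of the multi-series idea, not how Cortex works internally: the outlier rule is the same simple median-based stand-in, and the store names and sales figures are hypothetical.

```python
from collections import defaultdict
from statistics import median

def robust_outliers(values, threshold=3.5):
    """Return the values that sit far from the median (illustrative rule only)."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return [v for v in values if v != center]
    return [v for v in values if abs(0.6745 * (v - center) / mad) > threshold]

def outliers_per_series(rows):
    """Group (series_id, value) rows by series_id and check each group on its own."""
    series = defaultdict(list)
    for series_id, value in rows:
        series[series_id].append(value)
    return {sid: robust_outliers(vals) for sid, vals in series.items()}

# Hypothetical sales: 500 is unusual for store_a but ordinary for store_b.
sales = [
    ("store_a", 100), ("store_a", 102), ("store_a", 98),
    ("store_a", 101), ("store_a", 500),
    ("store_b", 500), ("store_b", 510), ("store_b", 495),
    ("store_b", 505), ("store_b", 498),
]
result = outliers_per_series(sales)
print(result)
```

Evaluating each series against its own history is what lets the same value be flagged for one store and accepted for another.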

To use anomaly detection with Snowflake Cortex, your data must include:

  • A timestamp column: This should have a fixed frequency, such as hourly or every 5 minutes.
  • A target column: This represents a quantity of interest at each timestamp.
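Before training, it can help to confirm that the timestamp column really does have a fixed frequency. The Python sketch below is one possible pre-check and makes no claim about how Cortex itself treats gaps, which this article does not cover. The function name and the sample hourly timestamps are assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

def irregular_intervals(timestamps):
    """Return (expected_step, breaks).

    expected_step is the most common gap between consecutive timestamps;
    breaks lists the (earlier, later) pairs whose gap differs from it.
    """
    ordered = sorted(timestamps)
    pairs = list(zip(ordered, ordered[1:]))
    if not pairs:
        return None, []
    deltas = [later - earlier for earlier, later in pairs]
    expected_step = Counter(deltas).most_common(1)[0][0]
    breaks = [(a, b) for a, b in pairs if b - a != expected_step]
    return expected_step, breaks

# Hypothetical hourly readings with the 04:00 reading missing.
start = datetime(2024, 5, 1)
hourly = [start + timedelta(hours=h) for h in (0, 1, 2, 3, 5, 6)]
step, breaks = irregular_intervals(hourly)
print(step, breaks)
```

Any pair returned in `breaks` points at a missing or duplicated reading worth investigating before the data is used for training.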

By leveraging Snowflake Cortex for anomaly detection, you can enhance your data’s integrity and gain deeper insights into the factors affecting your operations. In the next section we’ll dive into a demo to show you how easy it is to use Snowflake and the Cortex Anomaly Detection function!

To stay up to date with the latest business and tech trends in data and analytics, make sure to subscribe to my newsletter and follow me on LinkedIn and YouTube. If you’re interested in taking a deeper dive into Snowflake, check out my books ‘Mastering Snowflake Solutions’ and ‘SnowPro Core Certification Study Guide’.

— — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — — —

About Adam Morton

Adam Morton is an experienced data leader and author in the field of data and analytics with a passion for delivering tangible business value. Over the past two decades Adam has accumulated a wealth of valuable, real-world experiences designing and implementing enterprise-wide data strategies, advanced data and analytics solutions as well as building high-performing data teams across the UK, Europe, and Australia.

Adam’s continued commitment to the data and analytics community has seen him formally recognised as an international leader in his field when he was awarded a Global Talent Visa by the Australian Government in 2019.

Today, Adam is dedicated to helping his clients to overcome challenges with data while extracting the most value from their data and analytics implementations. You can find out more information by visiting his website here.

He has also developed a signature training program that includes an intensive online curriculum, weekly live consulting Q&A calls with Adam, and an exclusive mastermind of supportive data and analytics professionals helping you to become an expert in Snowflake. If you’re interested in finding out more, check out the latest Mastering Snowflake details.

