Snowflake’s Indexing Methods

I have various posts on this site that talk to indexes. But Snowflake’s indexing and search approach is quite different from traditional approaches like B-trees. It doesn’t maintain B-tree indexes like row-based relational databases, like MySQL or PostgreSQL.

What are B-Trees?

No User-Created Indexes

Snowflake doesn’t allow users to create their own indexes, such as B-trees or hash structures. Instead, it automatically manages indexing through metadata, optimising queries without manual intervention.

Micro-Partitioning

Data is stored in immutable micro-partitions (50–500 MB compressed).
Each micro-partition stores columnar data and column-level min/max statistics in metadata.
When you query:
- Snowflake’s metadata service prunes partitions that can’t possibly match your query (min/max elimination).
- This is effectively an automatic range index at the partition level.

Table illustrating the logical and physical structure of data partitions in Snowflake, showing micro-partitions and their contents. — Source: Snowflake documentation

Reference: Snowflake Micro-Partitioning Docs

Automatic Clustering & Pruning

Snowflake automatically maintains clustering metadata for all tables.
For large tables, you can define a clustering key to improve pruning.
This is not the same as a B-tree — it’s more like a zone map combined with sorted segments.

A diagram illustrating the logical and physical structure of a Snowflake table, showing original and newly reclustered micro-partitions with corresponding rows and columns labeled for type, name, country, and date. — Source: Snowflake documentation

Reference: Snowflake Clustering Keys

Search Optimization Service

For highly selective point-lookups (e.g., WHERE user_id = 123), Snowflake offers the Search Optimization Service.
It builds secondary search access paths in metadata to speed up equality and small range queries.
Under the hood, this is closer to an inverted index or skip-list than a B-tree.

Reference: Snowflake Search Optimization

No User-Created Indexes

Micro-Partitioning

Automatic Clustering & Pruning

Search Optimization Service

Share this:

Related Posts

DuckLake

Keys Are The Key

WebAssembly in Data Processing

Discover more from Where Data Engineering Meets Business Strategy