Hierarchical Navigable Small World Graphs

When I first came across this algorithm I was intrigued. What did it mean? Working/playing with vector database like Milvus, Pinecone, Weaviate, or Vespa, I saw HNSW mentioned in the index settings.

I discovered this algorithm, Hierarchical Navigable Small World graphs, is the reason modern AI search can find semantically similar items in billions of vectors within a few milliseconds.

In this post, we’ll break down:

  1. The problem HNSW solves
  2. How it works under the hood
  3. Why it’s so fast
  4. Practical considerations in vector databases

The Problem: Nearest Neighbor Search at Scale

Vector search is about finding the nearest neighbors to a query vector in high-dimensional space.

Naïve approach:

  • Compare the query to all N vectors in your dataset
  • Compute a similarity metric (cosine, dot product, Euclidean etc)
  • Return the top-k
  • Complexity: It’s impossible for billions of vectors

Need an index structure that reduces comparisons while keeping accuracy high → Approximate Nearest Neighbor (ANN).

HNSW: A Graph-Based ANN Structure

HNSW builds a multi-layer graph where:

  • Nodes = data points (vectors)
  • Edges = connections to “neighbor” nodes in vector space

Key idea:
Instead of exhaustively checking every vector, walk the graph from a high-level overview down to fine-grained neighborhoods.

The Hierarchical Structure

HNSW organizes data into layers:

  • Top layers: Sparse graph, long-range connections, few nodes
  • Bottom layer: Dense graph, local neighborhood connections

When searching:

  1. Start at the top layer at a random entry point
  2. Greedily move to the neighbor that is closest to the query
  3. Drop down a layer and repeat until reaching layer 0
  4. At layer 0, do a localized search in a candidate set to refine results

This navigable small-world property means you jump across large areas quickly, then zoom in.

Building the Graph

When inserting a vector:

  1. Assign it a random maximum layer (probabilistic skip-list style)
  2. Insert it into each layer up to its max
  3. Connect it to M nearest existing nodes in that layer (parameter M)
  4. Maintain bidirectional edges for navigability

Parameters:

  • M: number of bi-directional connections per node (graph connectivity)
  • efConstruction: size of candidate list during insertion (higher = more accurate but slower build)

Searching the Graph

Parameters:

  • efSearch: size of candidate list during search (higher = better recall, slower)

Steps:

  1. Start at top layer with entry point
  2. Greedy search for the closest node in that layer
  3. Move down one layer, use previous closest node as new entry
  4. At bottom layer (layer 0), explore neighbors until efSearch candidates are evaluated
  5. Return top-k nearest

Time Complexity:

  • In practice: hops
  • Each hop only checks a small fraction of points

Why It’s Fast

  • Logarithmic traversal: Layers reduce search space exponentially
  • Small-world property: Long links allow quick “teleports” across vector space
  • Local refinement: Only a small set of points are compared in detail
  • Cache efficiency: Neighbor lists fit in CPU cache, minimizing memory latency

Example in Practice

# Milvus index config:
index_params = { 
  "index_type": "HNSW", 
  "metric_type": "cosine", 
  "params": {"M": 16, "efConstruction": 200}
} 
collection.create_index("embedding", index_params)

# Search config
search_params = {
  "metric_type": "cosine", 
  "params": {"ef": 64}
} 
results = collection.search([query_vec], "embedding", search_params, limit=10)
  • M=16: 16 connections per node
  • efConstruction=200: Higher build-time accuracy
  • ef=64: Good balance between recall and latency in search

Trade-Offs

  • Memory usage: HNSW keeps all vectors + graph connections in RAM
  • Build time: Higher accuracy means slower indexing
  • Updates: Insertion is possible but frequent deletes can be costly (graph rebalancing)
  • Dimensionality curse: Still suffers at extreme dimensions (>1,000) unless vectors are compressed (e.g., PQ, OPQ)

Beyond HNSW

While HNSW dominates in open-source vector DBs, alternatives exist:

  • IVF-PQ (Inverted File + Product Quantization) – great for disk-based large-scale
  • ScaNN – optimized for TPU/GPU
  • Annoy – Spotify’s tree-based approach, low memory

Many production systems use hybrid strategies — e.g., IVF for partitioning + HNSW for local search.

Closing Thoughts

HNSW is the workhorse behind modern semantic search — powering recommendations, RAG pipelines, and image search across billions of vectors.
It’s not perfect (memory-heavy, CPU-bound), but for in-memory high-recall search, it’s unmatched.

The next wave will likely be disk-backed hybrid HNSW + compression, making it feasible to handle trillions of vectors without sacrificing latency.

Discover more from Data Lingua. Where Data Engineering Meets Agentic Business Strategy

Subscribe now to keep reading and get access to the full archive.

Continue reading