Understanding HNSW: The Engine Behind Fast Vector Search

To understand HNSW, imagine walking into a library with millions of books, looking for a specific book based on a brief description of the content. If the books were all piled up randomly, you'd have to check every book individually to find the one that matches your description best, that's brute force. While it would eventually work, it would take forever.

But libraries aren't organized like that. They're structured to naturally guide you to the right section. You walk in and at the top level, you choose whether your book is in the fiction or nonfiction section. Then you narrow it down by genre history, science, or biography. After that, you move to more specific subcategories or alphabetical order until you reach the exact shelf where your book is located.

HNSW works the same way and that intuition is worth holding onto as we go deeper.

HNSW Indexing

HNSW (Hierarchical Navigable Small World) is a graph-based indexing algorithm used for Approximate Nearest Neighbor (ANN) search in high-dimensional vector spaces. Instead of comparing a query against every vector, HNSW organizes vectors into a hierarchical graph structure that enables fast navigation to the nearest neighbors.

The Building Blocks of HNSW

HNSW combines ideas from two powerful data structures: Navigable Small World (NSW) graphs and Skip Lists.

Building block 01

Navigable Small World (NSW)

A Navigable Small World is a type of network or graph that allows for efficient navigation and search through its nodes.

Navigable Small World diagram

Structure

How NSW Search Works

Limitation

Pure NSW can become trapped in local optima because it relies on greedy navigation. HNSW reduces this risk by using hierarchical layers and maintaining multiple candidate paths during search.

Local optimum: A node that appears to be the best choice within its immediate neighborhood, even though a better match exists elsewhere in the graph.
Building block 02

Skip Lists

A Skip List is a data structure that allows fast search, insertion, and deletion operations. It is essentially a multi-level linked list where each level allows "skipping" over multiple elements, reducing the time complexity of operations.

Skip List diagram

How Skip Lists Work

How HNSW Combines Both Ideas

HNSW merges these two concepts into a single structure:


How Search Works in HNSW

The search process follows a top-down approach. Instead of starting with the full graph, the algorithm begins at the highest layer and gradually moves toward lower layers, refining its search at each step.

HNSW diagram
Layer 2 · Coarsest

Finding the Right Region

The search starts at the topmost layer, which contains only a small subset of all vectors. Because there are very few nodes and connections, this layer provides a coarse overview of the dataset.

What happens here

  • Start from the entry point
  • Compare the query vector with the current node and its neighbors
  • Move to a neighboring node if it is closer to the query
  • Repeat until no neighboring node is closer than the current node

The closest node found here becomes the entry point for Layer 1.

Layer 1 · Intermediate

Refining the Search

Layer 1 contains more nodes and connections than Layer 2, providing a more detailed view of the dataset.

What happens here

  • Begin from the entry point passed down from Layer 2
  • Continue the greedy search process
  • Move to neighboring nodes whenever a closer one is found

The closest node found here becomes the entry point for Layer 0.

Layer 0 · Finest

Fine-Grained Search

Layer 0 is the bottom layer and contains every vector in the dataset. It has the highest number of nodes and connections, making it the most detailed representation of the data.

What happens here

  • Start from the entry point received from Layer 1
  • Explore nearby nodes within the most promising region
  • Continue moving toward vectors that are increasingly similar to the query

The algorithm reaches the nearest neighbor (or top-k nearest neighbors) and these are the final results.

At lower layers, HNSW keeps multiple promising candidates instead of following only one path. This broader exploration improves recall and helps avoid missing good neighbors.

How Vectors Are Inserted into HNSW

Understanding how vectors are inserted is just as important as understanding how search works. The quality of the graph built during insertion directly determines how accurate and fast your searches will be later.

Step 1: Assigning a Maximum Layer

When a new vector arrives, HNSW first decides how high up in the hierarchy it will live. This is done by randomly drawing a maximum layer number using an exponential probability distribution. The result:

This randomness is intentional because it's what naturally keeps upper layers sparse and lower layers dense, maintaining the hierarchical structure that makes HNSW fast.

Step 2: Finding the Entry Point (Top Layers)

Starting from the topmost layer of the graph, HNSW runs a greedy search, identical to the query search described earlier, moving toward nodes closest to the new vector. This phase has one goal: find the best entry point for the layer where insertion will actually happen. At each layer above the vector's assigned maximum, HNSW tracks only the single closest node and passes it down. No connections are formed yet.

Step 3: Selecting Neighbors (Insertion Layers)

Once HNSW reaches the vector's assigned maximum layer, the process changes. Instead of tracking just one candidate, it now tracks up to ef_construct candidates simultaneously. From these, HNSW selects the best m neighbors and creates bidirectional edges. This same process repeats going down through each layer until Layer 0.

This is the moment where the parameters you configure actually matter:

Why this matters: Think of insertion as the new vector earning its place in the graph. It navigates down through the layers, identifies its neighborhood, and then permanently wires itself to the best neighbors it can find. The quality of those permanent connections is what ef_construct controls.

This is also why ef_construct is a build-time parameter and cannot be adjusted at query time. Once the graph is built, the connections are fixed. If you build with a low ef_construct, you are stuck with a lower quality graph regardless of how high you set hnsw_ef at search time.

Why This Is Efficient

The key idea is that the upper layers act like shortcuts. Instead of searching the entire graph from the beginning, HNSW quickly identifies the correct region at higher levels and then performs increasingly detailed searches as it moves downward. Think of it like finding a book in a library:

By narrowing the search space at each layer, HNSW achieves extremely fast nearest-neighbor search while maintaining high accuracy.


Configuring HNSW

You can fine-tune how the HNSW graph is built to balance search speed, accuracy, memory usage, and indexing time depending on your use case. There are three key parameters: m, hnsw_ef, and ef_construct.

Parameter Controls Typical Range
m Graph connectivity. Maximum number of connections per node. Higher m = denser graph, better accuracy, more memory. Lower m = sparser graph, faster insertion, lower accuracy. 8 – 64
ef_construct Build thoroughness. Candidates evaluated during insertion. Higher = more comprehensive graph, slower indexing. Lower = faster insertion, less optimal connections. 100 – 500
hnsw_ef Search thoroughness. Candidates evaluated during a query. Higher = more accurate results, slower queries. Lower = faster search, potentially less accurate. Tune this at query time. 50 – 200+
HNSW paramters diagram

Why HNSW Is Fast and Scalable

Sublinear Search Scaling

Unlike O(N) brute force search, HNSW search grows roughly logarithmically with the number of vectors. This makes million-scale datasets searchable in milliseconds rather than minutes.

Filter-Aware

Qdrant extends HNSW with filter-aware indexing (Filterable HNSW), allowing fast searches under structured conditions. This avoids costly full scans when filtering by metadata.

Large-Scale Use

In practice, HNSW search scales close to logarithmically with dataset size, making million-scale searches extremely fast.

HNSW is one of the key innovations that made modern vector search practical at scale. Instead of comparing a query against millions of vectors, it intelligently navigates a hierarchical graph to quickly reach the most relevant results. By combining the navigability of small-world graphs with the hierarchical shortcuts of skip lists, HNSW delivers an impressive balance of speed, accuracy, and scalability.

Whether you're building semantic search, recommendation systems, RAG applications, or large-scale AI products, chances are an HNSW graph is working behind the scenes to make those millisecond searches possible. The next time a vector database returns relevant results almost instantly, you'll know it's not magic — it's smart graph navigation.