> ## Documentation Index
> Fetch the complete documentation index at: https://docs.topk.io/llms.txt
> Use this file to discover all available pages before exploring further.

# Vector search

TopK is built for high-performance dense vector search workloads. It is designed to:

* Maintain **>98% recall**, reducing the likelihood of missing relevant results in applications such as recommendation systems, image search, and semantic search.
* Deliver consistent **low latency** (p99 \< 50 ms). See the [benchmarks](https://www.topk.io/benchmarks) for details.
* Support **large-scale single-collection** deployments as well as **multi-tenant** architectures.
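
For context on the recall figure above, recall is commonly measured as recall@k: the fraction of the true k nearest neighbors that a search actually returns. The sketch below is illustrative only and assumes that common definition, since this page does not spell out how TopK measures recall. The `recall_at_k` helper and the sample IDs are hypothetical.

```python
def recall_at_k(returned_ids, exact_ids, k):
    """Fraction of the exact top-k neighbors present in the returned top-k."""
    exact_top = set(exact_ids[:k])
    if not exact_top:
        return 1.0  # nothing to find, so nothing was missed
    found = exact_top.intersection(returned_ids[:k])
    return len(found) / len(exact_top)


# Hypothetical IDs: `exact` comes from a brute-force scan,
# `returned` from the index being evaluated.
exact = ["a", "b", "c", "d", "e"]
returned = ["a", "b", "c", "e", "x"]

print(recall_at_k(returned, exact, k=5))  # 0.8
```

In this toy case four of the five true neighbors were returned, so recall@5 is 0.8.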

## Define a collection schema with a vector field

Define a schema with a vector field and add a [`vector_index()`](/sdk/topk-py/schema#vector_index):

<CodeGroup>
  ```python Python theme={null}
  from topk_sdk.schema import text, f32_vector, vector_index

  client.collections().create(
      "books",
      schema={
          "title": text().required(),
          "title_embedding": f32_vector(dimension=1536).required().index(vector_index(metric="cosine")),
      },
  )
  ```

  ```typescript Javascript theme={null}
  import { text, f32Vector, vectorIndex } from "topk-js/schema";

  await client.collections().create("books", {
    title: text().required(),
    title_embedding: f32Vector({ dimension: 1536 }).required().index(vectorIndex({ metric: "cosine" })),
  });
  ```
</CodeGroup>

Supported vector field types:

* **[`f32_vector()`](/sdk/topk-py/schema#f32_vector)** — Dense float32 embeddings (most common)
* **[`u8_vector()`](/sdk/topk-py/schema#u8_vector)** — Quantized uint8 embeddings
* **[`i8_vector()`](/sdk/topk-py/schema#i8_vector)** — Signed quantized int8 embeddings
* **[`binary_vector()`](/sdk/topk-py/schema#binary_vector)** — Binary embeddings (use with `hamming` metric)
* **[`f32_sparse_vector()`](/sdk/topk-py/schema#f32_sparse_vector)** — Sparse embeddings (see [sparse vector search guide](/concepts/sparse-vector-search))
* **[`u8_sparse_vector()`](/sdk/topk-py/schema#u8_sparse_vector)** — Quantized sparse embeddings
* **[`matrix()`](/sdk/topk-py/schema#matrix)** — Multi-vector embeddings (see [multi-vector search guide](/concepts/multi-vector-search))

See the [schema reference](/sdk/topk-py/schema) for full API details.
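
The `hamming` metric mentioned for binary vectors counts the bit positions in which two vectors differ. As a rough illustration, and not TopK's implementation, the sketch below computes it in plain Python, assuming each binary embedding is packed into bytes. `hamming_distance` is a hypothetical helper.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # XOR leaves a 1 bit wherever the inputs differ; count those bits per byte.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two made-up 16-bit binary embeddings, packed as two bytes each.
a = bytes([0b10110000, 0b00001111])
b = bytes([0b10010000, 0b00001011])

print(hamming_distance(a, b))  # 2
```

A smaller Hamming distance means the two binary embeddings are closer.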

## Perform a k-Nearest Neighbor (kNN) search

To retrieve the top-k nearest neighbors of a query vector, use the [`fn.vector_distance()`](/sdk/topk-py/query#vector_distance) function.

`fn.vector_distance()` computes the distance (or similarity) between a stored vector field and a query vector, based on the distance metric configured in the vector index (e.g., cosine or Euclidean).

You can use the computed value to sort and return the closest matches.

<CodeGroup>
  ```python Python theme={null}
  from topk_sdk.query import select, field, fn

  docs = client.collection("books").query(
      select(
          "title",
          published_year=field("published_year"),
          # Compare the embedding stored in the `title_embedding` field with the query vector,
          # which here stands in for the embedding of the string "epic fantasy adventure".
          title_similarity=fn.vector_distance("title_embedding", [0.1, 0.2, 0.3, ...]),
      )
      # Return the top 10 results, sorted in descending order by default.
      # Larger cosine similarity means closer; smaller Euclidean distance means closer,
      # so sort in ascending order (asc=True) when the index uses Euclidean distance.
      .topk(field("title_similarity"), 10)
  )

  # Example results:
  [
    {
      "_id": "2",
      "title": "Lord of the Rings",
      "title_similarity": 0.8150404095649719
    },
    {
      "_id": "1",
      "title": "The Catcher in the Rye",
      "title_similarity": 0.7825378179550171
    }
  ]
  ```

  ```js Javascript theme={null}
  import { select, field, fn } from "topk-js/query";

  const docs = await client.collection("books").query(
    select({
      title: field("title"),
      published_year: field("published_year"),
      title_similarity: fn.vectorDistance(
        "title_embedding",
        // Query vector, standing in for the embedding of the string "epic fantasy adventure".
        // It is compared with the embedding stored in the `title_embedding` field.
        [0.1, 0.2, 0.3 /* ... */]
      ),
    }).topk(field("title_similarity"), 10)
    // Results are sorted in descending order by default.
    // Larger cosine similarity means closer; smaller Euclidean distance means closer,
    // so sort in ascending order when the index uses Euclidean distance:
    // .topk(field("title_similarity"), 10, true)
  );

  // Example results:
  [
    {
      _id: '2',
      title: 'Lord of the Rings',
      title_similarity: 0.8150404095649719
    },
    {
      _id: '1',
      title: 'The Catcher in the Rye',
      title_similarity: 0.7825378179550171
    }
  ]
  ```
</CodeGroup>

Let's break down the example above:

1. Compute the cosine similarity between the query embedding and the `title_embedding` field using the `vector_distance()` function.
2. Store the computed cosine similarity in the `title_similarity` field.
3. Return the top 10 results sorted by the `title_similarity` field in descending order.
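
To make the three steps concrete, here is a minimal brute-force sketch in plain Python. It only mirrors the logic of the query and says nothing about how TopK executes it internally. The `cosine_similarity` and `knn` helpers and the two-dimensional toy documents are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(docs, query, k):
    # Steps 1 and 2: compute the similarity and store it as `title_similarity`.
    scored = [
        {
            "_id": doc["_id"],
            "title": doc["title"],
            "title_similarity": cosine_similarity(doc["title_embedding"], query),
        }
        for doc in docs
    ]
    # Step 3: sort in descending order and keep the first k results.
    scored.sort(key=lambda doc: doc["title_similarity"], reverse=True)
    return scored[:k]


# Made-up documents with 2-dimensional embeddings, for illustration only.
docs = [
    {"_id": "1", "title": "A", "title_embedding": [1.0, 0.0]},
    {"_id": "2", "title": "B", "title_embedding": [0.0, 1.0]},
    {"_id": "3", "title": "C", "title_embedding": [1.0, 1.0]},
]

results = knn(docs, query=[1.0, 0.1], k=2)
print([doc["_id"] for doc in results])  # ['1', '3']
```

With a Euclidean metric the same sketch would sort in ascending order instead, because a smaller distance means a closer match.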

## Combine vector search with metadata filtering

Vector search can be combined with metadata filtering by adding a [`filter()`](/sdk/topk-py/query#filter) stage to the query:

<CodeGroup>
  ```python Python theme={null}
  from topk_sdk.query import select, field, fn

  docs = client.collection("books").query(
      select(
          "title",
          title_similarity=fn.vector_distance("title_embedding", [0.1, 0.2, 0.3, ...]),
          published_year=field("published_year"),
      )
      .filter(field("published_year") > 2000)
      .topk(field("title_similarity"), 10)
  )
  ```

  ```js Javascript theme={null}
  import { select, field, fn } from "topk-js/query";

  const docs = await client.collection("books").query(
    select({
      title: field("title"),
      title_similarity: fn.vectorDistance(
        "title_embedding",
        [0.1, 0.2, 0.3 /* ... */]
      ),
      published_year: field("published_year"),
    })
      .filter(field("published_year").gt(2000))
      .topk(field("title_similarity"), 10)
  );
  ```
</CodeGroup>
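
The effect of combining the two stages can be sketched in plain Python. The sketch assumes the filter narrows the candidate set before the top-k selection, as the stage order in the query suggests, so every returned document satisfies the filter. The `filtered_topk` helper and the toy documents are hypothetical and do not describe TopK's internals.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def filtered_topk(docs, query, k, min_year):
    # Filter stage: keep only documents published after `min_year`.
    candidates = [doc for doc in docs if doc["published_year"] > min_year]
    # Top-k stage: rank the remaining documents by similarity, highest first.
    return heapq.nlargest(
        k,
        candidates,
        key=lambda doc: cosine_similarity(doc["title_embedding"], query),
    )


# Made-up documents; the best vector match ("1") predates the year cutoff.
docs = [
    {"_id": "1", "published_year": 1995, "title_embedding": [1.0, 0.0]},
    {"_id": "2", "published_year": 2005, "title_embedding": [0.9, 0.1]},
    {"_id": "3", "published_year": 2015, "title_embedding": [0.0, 1.0]},
    {"_id": "4", "published_year": 2020, "title_embedding": [0.5, 0.5]},
]

results = filtered_topk(docs, query=[1.0, 0.0], k=2, min_year=2000)
print([doc["_id"] for doc in results])  # ['2', '4']
```

Document `1` is the closest vector match but is excluded by the year filter, so the next-best matches that pass the filter are returned.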
