This quickstart guide will walk you through setting up and using TopK in less than 5 minutes. By the end, you’ll know how to:

1. Create a collection
   Create a collection that stores your data.

2. Insert initial data
   Add documents to your collection.

3. Search your collection
   Learn how to use various search methods:

  • Semantic search — finding content by semantic meaning
  • Keyword search — finding exact text matches and sorting by keyword score
  • Metadata filtering — narrowing down results by document properties
  • Reranking (optional) — use built-in reranking to boost relevant results

Follow along to build your first functional search experience with TopK!

Before starting, make sure the TopK SDK is installed and your API key is ready; see the Installation docs.

1. Create your first collection

Collections are the core data structure in TopK. They store documents and provide an interface for querying your data.

First, initialize a TopK Client instance with your API key and region. Check out the Regions page for more information.

from topk_sdk import Client

client = Client(api_key="YOUR_TOPK_API_KEY", region="aws-us-east-1-elastica")
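If you would rather not hard-code the API key, one option is to read the connection settings from environment variables. The sketch below is only an illustration: the variable names TOPK_API_KEY and TOPK_REGION and the helper load_topk_settings() are assumptions made for this example, not part of the TopK SDK.

```python
import os


def load_topk_settings(env=os.environ):
    """Read TopK connection settings from a mapping of environment variables.

    TOPK_API_KEY and TOPK_REGION are hypothetical names chosen for this
    example; the SDK itself does not look them up.
    """
    api_key = env.get("TOPK_API_KEY")
    region = env.get("TOPK_REGION")

    # Collect the names of any settings that are unset or empty.
    missing = [
        name
        for name, value in (("TOPK_API_KEY", api_key), ("TOPK_REGION", region))
        if not value
    ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Keys match the keyword arguments passed to Client(...) above.
    return {"api_key": api_key, "region": region}
```

The returned dictionary uses the same keyword names as the Client call above, so it could be passed along as Client(**load_topk_settings()).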

Then, create a collection by specifying a schema. The example below creates a books collection with a semantic index on the title field.

from topk_sdk.schema import text, semantic_index

client.collections().create(
    "books",
    schema={
        "title": text().required().index(semantic_index()),
    },
)

Adding a semantic_index() to the title field enables semantic search as well as keyword search on this field.

Fields that aren’t defined in the schema can still be upserted.

2. Add documents

After creating a collection, you can start adding documents to it.

Documents in TopK are JSON-style dictionaries that must have an _id field:

client.collection("books").upsert(
    [
        {"_id": "gatsby", "title": "The Great Gatsby"},
        {"_id": "1984", "title": "1984"},
        {"_id": "catcher", "title": "The Catcher in the Rye"}
    ],
)
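Because every document needs an _id, it can help to check a batch locally before sending it. The helper below is a minimal sketch, not part of the SDK: check_documents() is a hypothetical name, and it assumes that _id values are non-empty strings (as in the example above) and that a repeated _id within one batch is a mistake.

```python
def check_documents(docs):
    """Raise ValueError if a document lacks a usable `_id` or repeats one.

    Assumptions for this sketch: `_id` is a non-empty string, and a
    duplicate `_id` inside a single batch is unintended.
    """
    seen = set()
    for position, doc in enumerate(docs):
        doc_id = doc.get("_id")
        if not isinstance(doc_id, str) or not doc_id:
            raise ValueError(
                f"document at index {position} needs a non-empty string _id"
            )
        if doc_id in seen:
            raise ValueError(f"duplicate _id: {doc_id!r}")
        seen.add(doc_id)
    # Return the batch unchanged so the call can be chained.
    return docs
```

Since the function returns the list it was given, it can wrap the argument directly, for example client.collection("books").upsert(check_documents(books)).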

3. Query your collection

Now, run your first semantic search query:

from topk_sdk.query import select, fn, match, field

results = client.collection("books").query(
    select(
        "title",
        # fn.semantic_similarity() scores documents by how semantically similar their titles are to "classic American novel"
        title_similarity=fn.semantic_similarity("title", "classic American novel"),
    )
    # Sort results by `title_similarity` (computed in the `select()` function above), and limit to top 10 results
    .topk(field("title_similarity"), 10)
    # Rerank results using the built-in reranking model
    .rerank()
)

Optionally, you can call .rerank() at the end of your query to automatically improve the relevance of the results using a reranking model. See the Reranking guide for more details.
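To inspect what the query returned, a small formatting helper can be handy. The sketch below assumes that results is a list of dictionaries keyed by the selected field names plus _id; that shape is an assumption of this example, so check the actual return value before relying on it. format_results() is a hypothetical helper, and the sample rows are hand-written placeholders, not real query output.

```python
def format_results(results, score_field="title_similarity"):
    """Turn result rows into numbered, human-readable lines.

    Assumes each row is a dict; missing fields are shown as placeholders
    rather than raising an error.
    """
    lines = []
    for rank, row in enumerate(results, start=1):
        title = row.get("title", "<no title>")
        doc_id = row.get("_id")
        score = row.get(score_field)
        # Only format the score when the row actually carries a number.
        score_text = f"{score:.3f}" if isinstance(score, (int, float)) else "n/a"
        lines.append(f"{rank}. {title} (id={doc_id}, {score_field}={score_text})")
    return lines


# Hand-written placeholder rows (not real query output) to show the call.
sample = [
    {"_id": "gatsby", "title": "The Great Gatsby", "title_similarity": 0.5},
    {"_id": "catcher", "title": "The Catcher in the Rye"},
]
for line in format_results(sample):
    print(line)
```

With real data you would pass the results variable from the query above instead of the placeholder list.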

4. (Optional) Clean up

To delete the entire collection:

client.collections().delete("books")