The TopK Python library provides convenient access to the TopK API from any Python 3.9+ application. The library includes type definitions for all request parameters and response fields, and provides both synchronous and asynchronous clients.

Installation

pip install topk-sdk

# or

uv add topk-sdk

Prerequisites

Usage

import os
from topk_sdk import Client

client = Client(
    api_key=os.environ.get("TOPK_API_KEY"),
    region="aws-us-east-1-elastica",
)

# Create a dataset
client.datasets().create("my-docs")

# Upload a file
handle = client.dataset("my-docs").upsert_file(
    "doc-1",
    input="/path/to/document.pdf",
    metadata={"source": "internal"},
)

# Wait for the file to process (optional)
client.dataset("my-docs").wait_for_handle(handle.handle)

# Ask a question
answer = client.ask(
    "What was the total net income of Bank of America in 2024?",
    datasets=["my-docs"],
)
print(answer.facts)
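
The example above reads the key with os.environ.get, which returns None when TOPK_API_KEY is unset, so a missing key may only surface later as a failed request. A minimal sketch of failing fast instead; require_env is a hypothetical helper and not part of topk_sdk:

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes the Client constructor shown above):
# client = Client(api_key=require_env("TOPK_API_KEY"), region="aws-us-east-1-elastica")
```

Raising at startup keeps a configuration mistake separate from errors returned by the API.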

Async usage

Simply import AsyncClient instead of Client and use await with each API call:
import os
import asyncio
from topk_sdk import AsyncClient

client = AsyncClient(
    api_key=os.environ.get("TOPK_API_KEY"),
    region="aws-us-east-1-elastica",
)

async def main() -> None:
    await client.datasets().create("my-docs")

    handle = await client.dataset("my-docs").upsert_file(
        "doc-1",
        input="/path/to/document.pdf",
        metadata={"source": "internal"},
    )
    await client.dataset("my-docs").wait_for_handle(handle.handle)

    answer = await client.ask(
        "What was the total net income of Bank of America in 2024?",
        datasets=["my-docs"],
    )
    print(answer.facts)

asyncio.run(main())
Functionality between the synchronous and asynchronous clients is otherwise identical.
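
Because each AsyncClient call is awaited, several independent calls can in principle run concurrently with asyncio.gather. The sketch below shows only the pattern: fake_ask is a hypothetical stand-in coroutine so that the snippet runs without network access, and it is an assumption that client.ask can be substituted for it directly.

```python
import asyncio


async def fake_ask(question: str) -> str:
    """Hypothetical stand-in for `await client.ask(...)`; simulates I/O latency."""
    await asyncio.sleep(0.01)
    return f"answer to: {question}"


async def ask_all(questions: list[str]) -> list[str]:
    # gather runs the coroutines concurrently and returns the results
    # in the same order as the inputs.
    return await asyncio.gather(*(fake_ask(q) for q in questions))


results = asyncio.run(ask_all(["first question", "second question"]))
print(results)
```

Concurrent requests count against the same rate limits as sequential ones, so it may be worth keeping the batch size modest.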

Handling errors

from topk_sdk.error import (
    DatasetNotFoundError,
    PermissionDeniedError,
    QuotaExceededError,
    SlowDownError,
)

try:
    client.ask("What was the total net income of Bank of America in 2024?", datasets=["my-docs"])
except DatasetNotFoundError:
    print("Dataset does not exist")
except PermissionDeniedError:
    print("Check your API key")
except QuotaExceededError:
    print("Usage quota exceeded")
except SlowDownError:
    print("Rate limited, even after automatic retries")

Error                         Description
CollectionNotFoundError       Collection does not exist
CollectionAlreadyExistsError  Collection with this name already exists
CollectionValidationError     Invalid collection name or schema
DatasetNotFoundError          Dataset does not exist
DatasetAlreadyExistsError     Dataset with this name already exists
DocumentValidationError       Invalid document
SchemaValidationError         Invalid schema
PermissionDeniedError         Invalid or missing API key
QuotaExceededError            Usage quota exceeded
RequestTooLargeError          Request payload too large
SlowDownError                 Rate limited by the server (retried automatically)
QueryLsnTimeoutError          Timed out waiting for write consistency

Retries

The client automatically retries on SlowDownError and on LSN consistency timeouts. Retry behaviour can be configured via RetryConfig:
import os
from topk_sdk import Client, RetryConfig, BackoffConfig

client = Client(
    api_key=os.environ.get("TOPK_API_KEY"),
    region="aws-us-east-1-elastica",
    retry_config=RetryConfig(
        max_retries=5,        # default: 3
        timeout=60_000,       # total retry chain timeout in ms, default: 30,000
        backoff=BackoffConfig(
            init_backoff=200, # default: 100 ms
            max_backoff=5_000, # default: 10,000 ms
        ),
    ),
)
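
To get a feel for how init_backoff and max_backoff interact, the sketch below computes a delay schedule under the assumption that the delay doubles after each attempt and is capped at max_backoff, with no jitter. The SDK's actual multiplier and jitter are not stated here, so backoff_schedule is a hypothetical illustration rather than the library's implementation.

```python
def backoff_schedule(init_backoff: int, max_backoff: int, max_retries: int) -> list[int]:
    """Delays in ms before each retry, assuming doubling capped at max_backoff."""
    delays = []
    delay = init_backoff
    for _ in range(max_retries):
        delays.append(min(delay, max_backoff))
        delay *= 2
    return delays


# Values from the RetryConfig example above.
schedule = backoff_schedule(init_backoff=200, max_backoff=5_000, max_retries=5)
print(schedule)       # [200, 400, 800, 1600, 3200]
print(sum(schedule))  # 6200
```

Under these assumptions the waits alone total 6,200 ms, well inside the 60,000 ms timeout configured above; request time would come on top of that.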

Requirements

Python 3.9 or higher.