Classes
Client
Client for interacting with the TopK API. For available regions, see the regions page.
Methods
Constructor
Client(
api_key: str,
region: str,
host: str = "topk.io",
https: bool = True,
retry_config: Optional[RetryConfig | dict[str, Any]] = None
)
Parameters
| Parameter | Type |
|---|---|
| api_key | str |
| region | str |
| host | str |
| https | bool |
| retry_config | Optional[RetryConfig \| dict[str, Any]] |
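
The constructor arguments can be gathered from the environment rather than hard-coded. The following is a minimal sketch under stated assumptions: the `TOPK_*` variable names and the `client_kwargs_from_env` helper are hypothetical, and the commented usage assumes the SDK exports `Client` from the `topk_sdk` package root.

```python
import os
from typing import Any, Mapping


def client_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Build keyword arguments for Client(...) from environment variables.

    The TOPK_* variable names are hypothetical, chosen for this example.
    """
    return {
        "api_key": env["TOPK_API_KEY"],  # required, raises KeyError if unset
        "region": env["TOPK_REGION"],  # required, raises KeyError if unset
        "host": env.get("TOPK_HOST", "topk.io"),  # documented default
        "https": env.get("TOPK_HTTPS", "true").lower() != "false",
    }


# Usage (assumes the SDK is installed and exports Client at the package root):
#   from topk_sdk import Client
#   client = Client(**client_kwargs_from_env())
```

Keeping the API key out of source code is the main reason to build the arguments this way.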
collection()
collection(self, collection: str) -> CollectionClient
Get a client for managing data operations on a specific collection such as querying, upserting, and deleting documents.
Parameters
| Parameter | Type |
|---|---|
| collection | str |
Returns
CollectionClient
collections()
collections(self) -> CollectionsClient
Get a client for managing collections.
Returns
CollectionsClient
dataset()
dataset(self, dataset: str) -> DatasetClient
Get a client for managing data operations on a specific dataset such as upserting files, managing metadata, and deleting files.
Parameters
| Parameter | Type |
|---|---|
| dataset | str |
Returns
DatasetClient
datasets()
datasets(self) -> DatasetsClient
Get a client for managing datasets.
Returns
DatasetsClient
ask()
ask(
self,
query: str,
datasets: Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]],
filter: Optional[query.LogicalExpr] = None,
mode: Optional[Literal['auto', 'summarize', 'research']] = None,
select_fields: Optional[Sequence[str]] = None
)
Ask a question and get streaming responses as an iterator.
Parameters
| Parameter | Type |
|---|---|
| query | str |
| datasets | Sequence[Source] \| Sequence[str] \| Sequence[dict[str, Any]] |
| filter | Optional[query.LogicalExpr] |
| mode | Optional[Literal['auto', 'summarize', 'research']] |
| select_fields | Optional[Sequence[str]] |
Returns
AskIterator
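A streamed ask response is typically drained in a loop, with the final answer kept and everything else treated as progress. The sketch below assumes that the iterator yields `Progress`-like messages followed by an `Answer`, and it recognizes the answer by its documented `facts` attribute; the reference does not state what the iterator yields, so treat that as an assumption. `final_answer` and the stand-in objects are hypothetical and let the example run without the SDK.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable, Optional


def final_answer(
    messages: Iterable[Any],
    on_progress: Optional[Callable[[Any], None]] = None,
) -> Optional[Any]:
    """Drain an ask() stream and return the last Answer-like message.

    Assumption: a message with a `facts` attribute is an Answer; any other
    message is handed to `on_progress` when a callback is given.
    """
    answer = None
    for message in messages:
        if hasattr(message, "facts"):
            answer = message
        elif on_progress is not None:
            on_progress(message)
    return answer


# Stand-in messages; the `note` attribute is made up for this example.
stream = [
    SimpleNamespace(note="searching"),
    SimpleNamespace(facts=[], refs={}, confidence=0.9),
]
seen: list[Any] = []
answer = final_answer(stream, on_progress=seen.append)
```

With a real client, the first argument would be the result of `client.ask(...)`.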
search()
search(
self,
query: str,
datasets: Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]],
top_k: int,
filter: Optional[query.LogicalExpr] = None,
select_fields: Optional[Sequence[str]] = None
)
Search for documents and get streaming responses as an iterator.
Parameters
| Parameter | Type |
|---|---|
| query | str |
| datasets | Sequence[Source] \| Sequence[str] \| Sequence[dict[str, Any]] |
| top_k | int |
| filter | Optional[query.LogicalExpr] |
| select_fields | Optional[Sequence[str]] |
Returns
SearchIterator
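Because search results arrive as a stream, it can be useful to group them per document as they come in. This is a minimal sketch that assumes the iterator yields `SearchResult` objects with the documented `doc_id` property; `group_by_document` and the stand-in results are hypothetical.

```python
from collections import defaultdict
from types import SimpleNamespace
from typing import Any, Iterable


def group_by_document(results: Iterable[Any]) -> dict[str, list[Any]]:
    """Group streamed search results by doc_id, preserving arrival order."""
    grouped: dict[str, list[Any]] = defaultdict(list)
    for result in results:
        grouped[result.doc_id].append(result)
    return dict(grouped)


# Stand-in results so the sketch runs without a network connection.
results = [
    SimpleNamespace(doc_id="a", content_id="a-1"),
    SimpleNamespace(doc_id="b", content_id="b-1"),
    SimpleNamespace(doc_id="a", content_id="a-2"),
]
grouped = group_by_document(results)
```

With a real client, `results` would be the value returned by `client.search(...)`.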
AsyncClient
Async client for interacting with the TopK API. For available regions, see the regions page.
Methods
Constructor
AsyncClient(
api_key: str,
region: str,
host: str = "topk.io",
https: bool = True,
retry_config: Optional[RetryConfig | dict[str, Any]] = None
)
Parameters
| Parameter | Type |
|---|---|
| api_key | str |
| region | str |
| host | str |
| https | bool |
| retry_config | Optional[RetryConfig \| dict[str, Any]] |
collection()
collection(self, collection: str) -> AsyncCollectionClient
Get an async client for a specific collection.
Parameters
| Parameter | Type |
|---|---|
| collection | str |
Returns
AsyncCollectionClient
collections()
collections(self) -> AsyncCollectionsClient
Get an async client for managing collections.
Returns
AsyncCollectionsClient
dataset()
dataset(self, dataset: str) -> AsyncDatasetClient
Get an async client for managing data operations on a specific dataset.
Parameters
| Parameter | Type |
|---|---|
| dataset | str |
Returns
AsyncDatasetClient
datasets()
datasets(self) -> AsyncDatasetsClient
Get an async client for managing datasets.
Returns
AsyncDatasetsClient
ask()
ask(
self,
query: str,
datasets: Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]],
filter: Optional[query.LogicalExpr] = None,
mode: Optional[Literal['auto', 'summarize', 'research']] = None,
select_fields: Optional[Sequence[str]] = None
)
Ask a question and get streaming responses asynchronously as an async iterator.
Parameters
| Parameter | Type |
|---|---|
| query | str |
| datasets | Sequence[Source] \| Sequence[str] \| Sequence[dict[str, Any]] |
| filter | Optional[query.LogicalExpr] |
| mode | Optional[Literal['auto', 'summarize', 'research']] |
| select_fields | Optional[Sequence[str]] |
Returns
AsyncAskIterator
search()
search(
self,
query: str,
datasets: Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]],
top_k: int,
filter: Optional[query.LogicalExpr] = None,
select_fields: Optional[Sequence[str]] = None
)
Search for documents and get streaming responses asynchronously as an async iterator.
Parameters
| Parameter | Type |
|---|---|
| query | str |
| datasets | Sequence[Source] \| Sequence[str] \| Sequence[dict[str, Any]] |
| top_k | int |
| filter | Optional[query.LogicalExpr] |
| select_fields | Optional[Sequence[str]] |
Returns
AsyncSearchIterator
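Async iterators are consumed with `async for`. The sketch below shows the pattern with a stand-in async generator, `fake_search`, in place of `AsyncClient.search(...)`; whether the real method must be awaited before iterating is not stated in this reference, so the example only demonstrates the iteration itself. `collect` is a hypothetical helper.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, AsyncIterator


async def collect(stream: AsyncIterator[Any], limit: int) -> list[Any]:
    """Collect up to `limit` items from an async stream, then stop reading."""
    items: list[Any] = []
    if limit <= 0:
        return items
    async for item in stream:
        items.append(item)
        if len(items) >= limit:
            break
    return items


async def fake_search() -> AsyncIterator[Any]:
    """Stand-in for an async search stream; yields SearchResult-like objects."""
    for i in range(5):
        yield SimpleNamespace(doc_id=f"doc-{i}")


first_two = asyncio.run(collect(fake_search(), limit=2))
```

Stopping early keeps memory bounded when only the first few results are needed.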
CollectionClient
Synchronous client for collection operations.
Methods
get()
get(
self,
ids: Sequence[str],
fields: Optional[Sequence[str]] = None,
lsn: Optional[str] = None,
consistency: Optional[ConsistencyLevel] = None
)
Get documents by their IDs.
Parameters
| Parameter | Type |
|---|---|
| ids | Sequence[str] |
| fields | Optional[Sequence[str]] |
| lsn | Optional[str] |
| consistency | Optional[ConsistencyLevel] |
Returns
dict[str, dict[str, Any]]
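Since `get()` returns a dictionary keyed by document ID, a caller can compare the requested IDs against the keys to see which documents came back. The sketch below assumes that IDs with no matching document are simply absent from the result, which this reference does not confirm. `split_found_missing` and `FakeCollection` are hypothetical; the fake stands in for a `CollectionClient`.

```python
from typing import Any, Optional, Sequence


def split_found_missing(
    collection: Any,
    ids: Sequence[str],
    fields: Optional[Sequence[str]] = None,
) -> tuple[dict[str, dict[str, Any]], list[str]]:
    """Fetch documents by ID and report which requested IDs were not returned."""
    found = collection.get(ids, fields=fields)
    missing = [doc_id for doc_id in ids if doc_id not in found]
    return found, missing


class FakeCollection:
    """In-memory stand-in that mirrors the documented get() signature."""

    def __init__(self, docs: dict[str, dict[str, Any]]) -> None:
        self._docs = docs

    def get(self, ids, fields=None, lsn=None, consistency=None):
        return {i: self._docs[i] for i in ids if i in self._docs}


found, missing = split_found_missing(
    FakeCollection({"id_1": {"title": "Dune"}}), ["id_1", "id_2"]
)
```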
count()
count(self, lsn: Optional[str] = None, consistency: Optional[ConsistencyLevel] = None) -> int
Get the count of documents in the collection.
Parameters
| Parameter | Type |
|---|---|
| lsn | Optional[str] |
| consistency | Optional[ConsistencyLevel] |
Returns
int
query()
query(
self,
query: query.Query,
lsn: Optional[str] = None,
consistency: Optional[ConsistencyLevel] = None
)
Execute a query against the collection.
Parameters
| Parameter | Type |
|---|---|
| query | query.Query |
| lsn | Optional[str] |
| consistency | Optional[ConsistencyLevel] |
Returns
list[dict[str, Any]]
upsert()
upsert(self, documents: Sequence[Mapping[str, Any]]) -> str
Insert or update documents in the collection.
Parameters
| Parameter | Type |
|---|---|
| documents | Sequence[Mapping[str, Any]] |
Returns
str
update()
update(self, documents: Sequence[Mapping[str, Any]], fail_on_missing: Optional[bool] = None) -> str
Update documents in the collection.
Existing documents will be merged with the provided fields.
Missing documents will be ignored.
Returns the LSN at which the update was applied.
If no updates were applied, this will be empty.
Parameters
| Parameter | Type |
|---|---|
| documents | Sequence[Mapping[str, Any]] |
| fail_on_missing | Optional[bool] |
Returns
str
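The LSN returned by `update()` can be passed to a later read so that the read waits for the write to become visible. The following sketch illustrates that read-your-writes pattern, including the case where the returned LSN is empty because nothing was applied. It assumes documents carry their ID in an `_id` field, which this reference does not state. `update_and_read_back` and `FakeCollection` are hypothetical; the fake stands in for a `CollectionClient`.

```python
from typing import Any


class FakeCollection:
    """In-memory stand-in mirroring the documented update() and get() signatures."""

    def __init__(self, docs: dict[str, dict[str, Any]]) -> None:
        self.docs = docs
        self.lsn = 0

    def update(self, documents, fail_on_missing=None) -> str:
        applied = False
        for doc in documents:
            if doc["_id"] in self.docs:
                self.docs[doc["_id"]].update(doc)  # merge provided fields
                applied = True
        if not applied:
            return ""  # no updates applied, so the LSN is empty
        self.lsn += 1
        return str(self.lsn)

    def get(self, ids, fields=None, lsn=None, consistency=None):
        return {i: dict(self.docs[i]) for i in ids if i in self.docs}


def update_and_read_back(collection: Any, documents: list[dict[str, Any]]):
    """Update documents, then read them back at the LSN of the update."""
    lsn = collection.update(documents)
    ids = [doc["_id"] for doc in documents]
    # An empty LSN means nothing was applied, so there is no write to wait for.
    return lsn, collection.get(ids, lsn=lsn or None)


books = FakeCollection({"id_1": {"_id": "id_1", "title": "Dune", "year": 1965}})
lsn, docs = update_and_read_back(books, [{"_id": "id_1", "year": 1966}])
empty_lsn, no_docs = update_and_read_back(books, [{"_id": "missing", "year": 2000}])
```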
delete()
delete(self, expr: Sequence[str] | query.LogicalExpr) -> str
Delete documents by their IDs or using a filter expression.
Example:
Delete documents by their IDs:
client.collection("books").delete(["id_1", "id_2"])
Delete documents by a filter expression:
from topk_sdk.query import field
client.collection("books").delete(field("published_year").gt(1997))
Parameters
| Parameter | Type |
|---|---|
| expr | Sequence[str] \| query.LogicalExpr |
Returns
str
AsyncCollectionClient
Asynchronous client for collection operations.
Methods
get()
get(
self,
ids: Sequence[str],
fields: Optional[Sequence[str]] = None,
lsn: Optional[str] = None,
consistency: Optional[ConsistencyLevel] = None
)
Get documents by their IDs asynchronously.
Parameters
| Parameter | Type |
|---|---|
| ids | Sequence[str] |
| fields | Optional[Sequence[str]] |
| lsn | Optional[str] |
| consistency | Optional[ConsistencyLevel] |
Returns
Awaitable[dict[str, dict[str, Any]]]
count()
count(
self,
lsn: Optional[str] = None,
consistency: Optional[ConsistencyLevel] = None
)
Get the count of documents in the collection asynchronously.
Parameters
| Parameter | Type |
|---|---|
| lsn | Optional[str] |
| consistency | Optional[ConsistencyLevel] |
Returns
Awaitable[int]
query()
query(
self,
query: query.Query,
lsn: Optional[str] = None,
consistency: Optional[ConsistencyLevel] = None
)
Execute a query against the collection asynchronously.
Parameters
| Parameter | Type |
|---|---|
| query | query.Query |
| lsn | Optional[str] |
| consistency | Optional[ConsistencyLevel] |
Returns
Awaitable[list[dict[str, Any]]]
upsert()
upsert(self, documents: Sequence[Mapping[str, Any]]) -> Awaitable[str]
Insert or update documents in the collection asynchronously.
Parameters
| Parameter | Type |
|---|---|
| documents | Sequence[Mapping[str, Any]] |
Returns
Awaitable[str]
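Large document sets are usually written in batches rather than in one call. This is a minimal sketch of sequential batched upserts with the async client; the batch size of 100 is an arbitrary choice, not a documented limit. `chunks`, `upsert_in_batches`, and `FakeAsyncCollection` are hypothetical, and the fake stands in for an `AsyncCollectionClient`.

```python
import asyncio
from typing import Any, Iterator, Mapping, Sequence


def chunks(items: Sequence[Any], size: int) -> Iterator[Sequence[Any]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


async def upsert_in_batches(
    collection: Any,
    documents: Sequence[Mapping[str, Any]],
    batch_size: int = 100,
) -> str:
    """Upsert documents batch by batch and return the last string returned."""
    last = ""
    for batch in chunks(documents, batch_size):
        last = await collection.upsert(batch)
    return last


class FakeAsyncCollection:
    """Stand-in that records batch sizes instead of calling the API."""

    def __init__(self) -> None:
        self.calls: list[int] = []

    async def upsert(self, documents) -> str:
        self.calls.append(len(documents))
        return str(len(self.calls))


collection = FakeAsyncCollection()
documents = [{"_id": str(i)} for i in range(5)]
last = asyncio.run(upsert_in_batches(collection, documents, batch_size=2))
```

Awaiting each batch before sending the next keeps writes ordered, at the cost of some throughput.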
update()
update(
self,
documents: Sequence[Mapping[str, Any]],
fail_on_missing: Optional[bool] = None
)
Update documents in the collection asynchronously.
Existing documents will be merged with the provided fields.
Missing documents will be ignored.
Returns the LSN at which the update was applied.
If no updates were applied, this will be empty.
Parameters
| Parameter | Type |
|---|---|
| documents | Sequence[Mapping[str, Any]] |
| fail_on_missing | Optional[bool] |
Returns
Awaitable[str]
delete()
delete(self, expr: Sequence[str] | query.LogicalExpr) -> Awaitable[str]
Delete documents by their IDs or using a filter expression asynchronously.
Example:
Delete documents by their IDs:
await client.collection("books").delete(["id_1", "id_2"])
Delete documents by a filter expression:
from topk_sdk.query import field
await client.collection("books").delete(field("published_year").gt(1997))
Parameters
| Parameter | Type |
|---|---|
| expr | Sequence[str] \| query.LogicalExpr |
Returns
Awaitable[str]
Collection
Represents a collection in the TopK system.
Properties
| Property | Type |
|---|---|
| name | str |
| org_id | str |
| project_id | str |
| region | str |
| schema | dict[str, schema.FieldSpec] |
| created_at | str |
Dataset
Represents a dataset in the TopK system.
Properties
| Property | Type |
|---|---|
| name | str |
| description | Optional[str] |
| org_id | str |
| project_id | str |
| region | str |
| created_at | str |
ListEntry
Entry in a dataset.
Properties
| Property | Type |
|---|---|
| id | str |
| name | str |
| size | int |
| mime_type | str |
| status | str |
| status_reason | Optional[str] |
| metadata | dict[str, Any] |
CollectionsClient
Synchronous client for managing collections.
Methods
get()
get(self, collection_name: str) -> Collection
Get information about a specific collection.
Parameters
| Parameter | Type |
|---|---|
| collection_name | str |
Returns
Collection
list()
list(self) -> list[Collection]
List all collections.
Returns
list[Collection]
create()
create(self, collection_name: str, schema: Mapping[str, schema.FieldSpec]) -> Collection
Create a new collection with the specified schema.
Parameters
| Parameter | Type |
|---|---|
| collection_name | str |
| schema | Mapping[str, schema.FieldSpec] |
Returns
Collection
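A common setup step is to create a collection only if it does not already exist. The sketch below combines the documented `list()` and `create()` calls for that purpose. `ensure_collection` and `FakeCollections` are hypothetical, the fake stands in for a `CollectionsClient`, and the empty schema is a placeholder for real `schema.FieldSpec` values.

```python
from types import SimpleNamespace
from typing import Any, Mapping


def ensure_collection(collections: Any, name: str, schema: Mapping[str, Any]) -> Any:
    """Return the collection called `name`, creating it if it is not listed."""
    for existing in collections.list():
        if existing.name == name:
            return existing
    return collections.create(name, schema)


class FakeCollections:
    """In-memory stand-in mirroring the documented list() and create() calls."""

    def __init__(self) -> None:
        self._items: list[Any] = []

    def list(self) -> list[Any]:
        return list(self._items)

    def create(self, collection_name: str, schema: Mapping[str, Any]) -> Any:
        created = SimpleNamespace(name=collection_name, schema=dict(schema))
        self._items.append(created)
        return created


collections = FakeCollections()
first = ensure_collection(collections, "books", {})
second = ensure_collection(collections, "books", {})
```

The check and the create are separate calls, so two processes running this at the same time could both attempt the create.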
delete()
delete(self, collection_name: str) -> None
Delete a collection.
Parameters
| Parameter | Type |
|---|---|
| collection_name | str |
Returns
None
AsyncCollectionsClient
Asynchronous client for managing collections.
Methods
get()
get(self, collection_name: str) -> Awaitable[Collection]
Get information about a specific collection asynchronously.
Parameters
| Parameter | Type |
|---|---|
| collection_name | str |
Returns
Awaitable[Collection]
list()
list(self) -> Awaitable[list[Collection]]
List all collections asynchronously.
Returns
Awaitable[list[Collection]]
create()
create(self, collection_name: str, schema: Mapping[str, schema.FieldSpec]) -> Awaitable[Collection]
Create a new collection with the specified schema asynchronously.
Parameters
| Parameter | Type |
|---|---|
| collection_name | str |
| schema | Mapping[str, schema.FieldSpec] |
Returns
Awaitable[Collection]
delete()
delete(self, collection_name: str) -> Awaitable[None]
Delete a collection asynchronously.
Parameters
| Parameter | Type |
|---|---|
| collection_name | str |
Returns
Awaitable[None]
DatasetsClient
Synchronous client for managing datasets.
Methods
get()
get(self, dataset_name: str) -> Dataset
Get information about a specific dataset.
Parameters
| Parameter | Type |
|---|---|
| dataset_name | str |
Returns
Dataset
list()
list(self) -> list[Dataset]
List all datasets.
Returns
list[Dataset]
create()
create(self, dataset_name: str) -> Dataset
Create a new dataset.
Parameters
| Parameter | Type |
|---|---|
| dataset_name | str |
Returns
Dataset
update()
update(self, dataset_name: str, description: Optional[str] = None) -> Dataset
Update dataset properties.
Parameters
| Parameter | Type |
|---|---|
| dataset_name | str |
| description | Optional[str] |
Returns
Dataset
delete()
delete(self, dataset_name: str) -> None
Delete a dataset.
Parameters
| Parameter | Type |
|---|---|
| dataset_name | str |
Returns
None
DatasetClient
Synchronous client for dataset operations.
Methods
upsert_file()
upsert_file(
self,
doc_id: str,
input: os.PathLike[Any] | Tuple[str, bytes, str],
metadata: Mapping[str, Any]
)
Upsert a file to the dataset. Returns the processing handle.
Parameters
| Parameter | Type |
|---|---|
| doc_id | str |
| input | os.PathLike[Any] \| Tuple[str, bytes, str] |
| metadata | Mapping[str, Any] |
Returns
str
get_metadata()
get_metadata(
self,
ids: Sequence[str],
fields: Optional[Sequence[str]] = None
)
Get metadata for one or more documents.
Parameters
| Parameter | Type |
|---|---|
| ids | Sequence[str] |
| fields | Optional[Sequence[str]] |
Returns
dict[str, dict[str, Any]]
update_metadata()
update_metadata(self, doc_id: str, metadata: Mapping[str, Any]) -> str
Update metadata for a file. Returns the processing handle.
Parameters
| Parameter | Type |
|---|---|
| doc_id | str |
| metadata | Mapping[str, Any] |
Returns
str
delete()
delete(self, doc_id: str) -> str
Delete a file from the dataset. Returns the processing handle.
Parameters
| Parameter | Type |
|---|---|
| doc_id | str |
Returns
str
check_handle()
check_handle(self, handle: str) -> bool
Return whether the handle has been processed.
Parameters
| Parameter | Type |
|---|---|
| handle | str |
Returns
bool
wait_for_handle()
wait_for_handle(self, handle: str, config: Optional[WaitConfig | dict[str, Any]] = None) -> None
Poll until a handle has been processed or the timeout is reached.
Raises an error if the handle is not processed within the configured timeout.
Parameters
| Parameter | Type |
|---|---|
| handle | str |
| config | Optional[WaitConfig \| dict[str, Any]] |
Returns
None
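The polling that `wait_for_handle()` describes can also be written by hand on top of `check_handle()`, for example to log between attempts. The sketch below is not the SDK's implementation; it only illustrates the poll-until-processed-or-timeout loop, using the 5 second frequency and 300 second timeout that `WaitConfig` documents as defaults. `poll_handle` and `FakeDataset` are hypothetical, and the injected `sleep` and `clock` let the example run instantly.

```python
import time
from typing import Any, Callable


def poll_handle(
    dataset: Any,
    handle: str,
    frequency_secs: float = 5,
    timeout_secs: float = 300,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll check_handle() until it returns True or the timeout elapses."""
    deadline = clock() + timeout_secs
    while True:
        if dataset.check_handle(handle):
            return
        if clock() >= deadline:
            raise TimeoutError(f"handle {handle!r} not processed in {timeout_secs}s")
        sleep(frequency_secs)


class FakeDataset:
    """Stand-in whose handle becomes processed after a fixed number of checks."""

    def __init__(self, ready_after: int) -> None:
        self.checks = 0
        self.ready_after = ready_after

    def check_handle(self, handle: str) -> bool:
        self.checks += 1
        return self.checks >= self.ready_after


# Simulated clock so the example does not actually sleep.
now = [0.0]


def fake_sleep(seconds: float) -> None:
    now[0] += seconds


dataset = FakeDataset(ready_after=3)
poll_handle(dataset, "handle-1", sleep=fake_sleep, clock=lambda: now[0])
```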
list()
list(
self,
fields: Optional[Sequence[str]] = None,
filter: Optional[query.LogicalExpr] = None
)
List files in the dataset as a streaming iterator.
Parameters
| Parameter | Type |
|---|---|
| fields | Optional[Sequence[str]] |
| filter | Optional[query.LogicalExpr] |
Returns
DatasetListIterator
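A streamed listing can be summarized in one pass without holding every entry in memory. The sketch below assumes the iterator yields `ListEntry` objects and uses their documented `id`, `status`, and `status_reason` properties. The status strings in the stand-in data are placeholders, since this reference does not list the possible values; `status_report` is a hypothetical helper.

```python
from collections import Counter
from types import SimpleNamespace
from typing import Any, Iterable


def status_report(entries: Iterable[Any]) -> tuple[Counter, dict[str, str]]:
    """Count entries per status and collect the reasons that are set."""
    counts: Counter = Counter()
    reasons: dict[str, str] = {}
    for entry in entries:
        counts[entry.status] += 1
        if entry.status_reason is not None:
            reasons[entry.id] = entry.status_reason
    return counts, reasons


# Stand-in entries; "ok" and "error" are placeholder status values.
entries = [
    SimpleNamespace(id="f1", status="ok", status_reason=None),
    SimpleNamespace(id="f2", status="error", status_reason="unsupported file type"),
    SimpleNamespace(id="f3", status="ok", status_reason=None),
]
counts, reasons = status_report(entries)
```

With a real client, `entries` would be the value returned by `client.dataset("...").list()`.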
AsyncDatasetsClient
Asynchronous client for managing datasets.
Methods
get()
get(self, dataset_name: str) -> Awaitable[Dataset]
Get information about a specific dataset asynchronously.
Parameters
| Parameter | Type |
|---|---|
| dataset_name | str |
Returns
Awaitable[Dataset]
list()
list(self) -> Awaitable[list[Dataset]]
List all datasets asynchronously.
Returns
Awaitable[list[Dataset]]
create()
create(self, dataset_name: str) -> Awaitable[Dataset]
Create a new dataset asynchronously.
Parameters
| Parameter | Type |
|---|---|
| dataset_name | str |
Returns
Awaitable[Dataset]
update()
update(self, dataset_name: str, description: Optional[str] = None) -> Awaitable[Dataset]
Update dataset properties asynchronously.
Parameters
| Parameter | Type |
|---|---|
| dataset_name | str |
| description | Optional[str] |
Returns
Awaitable[Dataset]
delete()
delete(self, dataset_name: str) -> Awaitable[None]
Delete a dataset asynchronously.
Parameters
| Parameter | Type |
|---|---|
| dataset_name | str |
Returns
Awaitable[None]
AsyncDatasetClient
Asynchronous client for dataset operations.
Methods
upsert_file()
upsert_file(
self,
doc_id: str,
input: os.PathLike[Any] | Tuple[str, bytes, str],
metadata: Mapping[str, Any]
)
Upsert a file to the dataset asynchronously. Returns the processing handle.
Parameters
| Parameter | Type |
|---|---|
| doc_id | str |
| input | os.PathLike[Any] \| Tuple[str, bytes, str] |
| metadata | Mapping[str, Any] |
Returns
Awaitable[str]
get_metadata()
get_metadata(
self,
ids: Sequence[str],
fields: Optional[Sequence[str]] = None
)
Get metadata for one or more documents asynchronously.
Parameters
| Parameter | Type |
|---|---|
| ids | Sequence[str] |
| fields | Optional[Sequence[str]] |
Returns
Awaitable[dict[str, dict[str, Any]]]
update_metadata()
update_metadata(self, doc_id: str, metadata: Mapping[str, Any]) -> Awaitable[str]
Update metadata for a file asynchronously. Returns the processing handle.
Parameters
| Parameter | Type |
|---|---|
| doc_id | str |
| metadata | Mapping[str, Any] |
Returns
Awaitable[str]
delete()
delete(self, doc_id: str) -> Awaitable[str]
Delete a file from the dataset asynchronously. Returns the processing handle.
Parameters
| Parameter | Type |
|---|---|
| doc_id | str |
Returns
Awaitable[str]
check_handle()
check_handle(self, handle: str) -> Awaitable[bool]
Return whether the handle has been processed asynchronously.
Parameters
| Parameter | Type |
|---|---|
| handle | str |
Returns
Awaitable[bool]
wait_for_handle()
wait_for_handle(
self,
handle: str,
config: Optional[WaitConfig | dict[str, Any]] = None
)
Poll until a handle has been processed or the timeout is reached asynchronously.
Raises an error if the handle is not processed within the configured timeout.
Parameters
| Parameter | Type |
|---|---|
| handle | str |
| config | Optional[WaitConfig \| dict[str, Any]] |
Returns
Awaitable[None]
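Uploading a file and waiting for its handle are naturally chained, and several such chains can run concurrently. The following is a minimal sketch of that flow with `asyncio.gather`; whether the SDK client is safe for concurrent use is an assumption here. `upload_and_wait`, `upload_many`, and `FakeAsyncDataset` are hypothetical, and the fake stands in for an `AsyncDatasetClient`.

```python
import asyncio
from pathlib import Path
from typing import Any, Mapping, Sequence


async def upload_and_wait(
    dataset: Any, doc_id: str, path: Path, metadata: Mapping[str, Any]
) -> str:
    """Upsert one file, then block until its processing handle completes."""
    handle = await dataset.upsert_file(doc_id, path, metadata)
    await dataset.wait_for_handle(handle)
    return handle


async def upload_many(
    dataset: Any, files: Sequence[tuple[str, Path, Mapping[str, Any]]]
) -> list[str]:
    """Run upload_and_wait for every file concurrently; results keep input order."""
    return await asyncio.gather(
        *(upload_and_wait(dataset, doc_id, path, meta) for doc_id, path, meta in files)
    )


class FakeAsyncDataset:
    """Stand-in mirroring the documented upsert_file() and wait_for_handle()."""

    def __init__(self) -> None:
        self.waited: list[str] = []

    async def upsert_file(self, doc_id, input, metadata) -> str:
        await asyncio.sleep(0)  # yield control, as a network call would
        return f"handle-{doc_id}"

    async def wait_for_handle(self, handle, config=None) -> None:
        await asyncio.sleep(0)
        self.waited.append(handle)


dataset = FakeAsyncDataset()
files = [("a", Path("a.pdf"), {"lang": "en"}), ("b", Path("b.pdf"), {"lang": "de"})]
handles = asyncio.run(upload_many(dataset, files))
```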
list()
list(
self,
fields: Optional[Sequence[str]] = None,
filter: Optional[query.LogicalExpr] = None
)
List files in the dataset as a streaming async iterator.
Parameters
| Parameter | Type |
|---|---|
| fields | Optional[Sequence[str]] |
| filter | Optional[query.LogicalExpr] |
Returns
AsyncDatasetListIterator
Source
Represents a dataset with an optional filter.
Properties
| Property | Type |
|---|---|
| dataset | str |
| filter | Optional[query.LogicalExpr] |
Fact
Represents a fact in an ask response.
Properties
| Property | Type |
|---|---|
| fact | str |
| ref_ids | list[str] |
Chunk
Text chunk content.
Properties
| Property | Type |
|---|---|
| text | str |
| doc_pages | list[int] |
Image
Image content.
Properties
| Property | Type |
|---|---|
| data | bytes |
| mime_type | str |
Page
Page content with optional image.
Properties
| Property | Type |
|---|---|
| page_number | int |
| image | Optional[Image] |
Content
Content in a search result. One of chunk, page, or image.
Properties
| Property | Type |
|---|---|
| type | Literal['chunk', 'page', 'image'] |
| data | Chunk \| Page \| Image |
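
Because `data` holds a different class depending on `type`, callers generally branch on `type` before touching `data`. The sketch below shows one way to do that using only the documented properties of `Chunk`, `Page`, and `Image`; `describe_content` and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace
from typing import Any, Optional


def describe_content(content: Optional[Any]) -> str:
    """Return a one-line description of a Content value, branching on its type."""
    if content is None:
        return "no content"
    data = content.data
    if content.type == "chunk":
        return f"chunk: {len(data.text)} chars from pages {data.doc_pages}"
    if content.type == "page":
        image_note = "with image" if data.image is not None else "without image"
        return f"page {data.page_number} {image_note}"
    if content.type == "image":
        return f"image: {data.mime_type}, {len(data.data)} bytes"
    raise ValueError(f"unexpected content type: {content.type!r}")


# Stand-in Content values built from the documented property names.
chunk = SimpleNamespace(type="chunk", data=SimpleNamespace(text="hello", doc_pages=[1, 2]))
page = SimpleNamespace(type="page", data=SimpleNamespace(page_number=3, image=None))
image = SimpleNamespace(type="image", data=SimpleNamespace(data=b"\x89PNG", mime_type="image/png"))
descriptions = [describe_content(c) for c in (chunk, page, image, None)]
```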
SearchResult
Represents a search result in an ask response.
Properties
| Property | Type |
|---|---|
| doc_id | str |
| doc_type | str |
| dataset | str |
| content_id | str |
| doc_name | str |
| content | Optional[Content] |
| metadata | dict[str, Any] |
Answer
Represents a final answer in an ask response.
Properties
| Property | Type |
|---|---|
| facts | list[Fact] |
| refs | dict[str, SearchResult] |
| confidence | float |
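
An answer can be rendered as a list of facts with their sources attached. The sketch below assumes that each entry in a fact's `ref_ids` is a key into the answer's `refs` dictionary, which the property names suggest but this reference does not state. `format_answer` and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def format_answer(answer: Any) -> str:
    """Render each fact as a bullet followed by the names of its source documents."""
    lines = []
    for fact in answer.facts:
        # Skip reference IDs that are not present in refs rather than failing.
        names = [answer.refs[r].doc_name for r in fact.ref_ids if r in answer.refs]
        suffix = f" [{', '.join(names)}]" if names else ""
        lines.append(f"- {fact.fact}{suffix}")
    return "\n".join(lines)


# Stand-in Answer built from the documented property names.
answer = SimpleNamespace(
    facts=[
        SimpleNamespace(fact="Dune was published in 1965.", ref_ids=["r1"]),
        SimpleNamespace(fact="No source was found for this one.", ref_ids=[]),
    ],
    refs={"r1": SimpleNamespace(doc_name="dune.pdf")},
    confidence=0.8,
)
rendered = format_answer(answer)
```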
Progress
Represents a progress update in an ask response.
Properties
AskIterator
Iterator for synchronous ask responses.
AsyncAskIterator
Iterator for asynchronous ask responses.
SearchIterator
Iterator for synchronous search responses.
AsyncSearchIterator
Iterator for asynchronous search responses.
DatasetListIterator
Iterator for synchronous dataset list responses.
AsyncDatasetListIterator
Iterator for asynchronous dataset list responses.
ConsistencyLevel
Enumeration of consistency levels for operations.
Values
| Value | Description |
|---|---|
| Indexed | indexed |
| Strong | strong |
WaitConfig
Configuration for polling when waiting for a handle to be processed.
Properties
| Property | Type | Description |
|---|---|---|
| frequency_secs | Optional[int] | How often to poll for the handle status in seconds. Default is 5. |
| timeout_secs | Optional[int] | Maximum time to wait before returning a timeout error in seconds. Default is 300. |
Methods
Constructor
WaitConfig(frequency_secs: Optional[int] = None, timeout_secs: Optional[int] = None) -> None
Parameters
| Parameter | Type |
|---|---|
| frequency_secs | Optional[int] |
| timeout_secs | Optional[int] |
RetryConfig
Configuration for retry behavior.
By default, retries occur in two situations:
- When the server requests the client to reduce its request rate, resulting in a SlowDownError.
- When using query(..., lsn=N) to wait for writes to be available.
Properties
| Property | Type | Description |
|---|---|---|
| max_retries | Optional[int] | Maximum number of retries to attempt. Default is 3 retries. |
| timeout | Optional[int] | The total timeout for the retry chain in milliseconds. Default is 30,000 milliseconds (30 seconds). |
| backoff | Optional[BackoffConfig] | The backoff configuration for the client. |
Methods
Constructor
RetryConfig(
max_retries: Optional[int] = None,
timeout: Optional[int] = None,
backoff: Optional[BackoffConfig] = None
)
Parameters
| Parameter | Type |
|---|---|
| max_retries | Optional[int] |
| timeout | Optional[int] |
| backoff | Optional[BackoffConfig] |
BackoffConfig
Configuration for backoff behavior in retries.
Properties
| Property | Type | Description |
|---|---|---|
| base | Optional[int] | The base for the backoff. Default is 2x backoff. |
| init_backoff | Optional[int] | The initial backoff in milliseconds. Default is 100 milliseconds. |
| max_backoff | Optional[int] | The maximum backoff in milliseconds. Default is 10,000 milliseconds (10 seconds). |
Methods
Constructor
BackoffConfig(
base: Optional[int] = None,
init_backoff: Optional[int] = None,
max_backoff: Optional[int] = None
)
Parameters
| Parameter | Type |
|---|---|
| base | Optional[int] |
| init_backoff | Optional[int] |
| max_backoff | Optional[int] |
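
To get a feel for how the backoff settings interact, the delays can be worked out by hand. The sketch below assumes the conventional exponential formula, `init_backoff * base ** attempt` capped at `max_backoff`; the SDK's actual schedule is not specified in this reference and may differ, for example by adding jitter. `backoff_schedule` is a hypothetical helper, and its defaults are the documented ones.

```python
def backoff_schedule(
    max_retries: int = 3,
    base: int = 2,
    init_backoff: int = 100,
    max_backoff: int = 10_000,
) -> list[int]:
    """Return the delay in milliseconds before each retry.

    Assumes delay = init_backoff * base ** attempt, capped at max_backoff.
    """
    return [
        min(init_backoff * base**attempt, max_backoff)
        for attempt in range(max_retries)
    ]


default_delays = backoff_schedule()
long_delays = backoff_schedule(max_retries=10)
```

Under this assumed formula the default delays are 100, 200, and 400 milliseconds, well inside the 30,000 millisecond retry timeout, and the cap first takes effect on the eighth retry.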