Documentation Index

Fetch the complete documentation index at: https://docs.topk.io/llms.txt

Use this file to discover all available pages before exploring further.

Classes

Client

Client for interacting with the TopK API. For available regions, see regions.

Methods

Constructor
Client(
   api_key: str,
   region: str,
   host: str = "topk.io",
   https: bool = True,
   retry_config: Optional[RetryConfig | dict[str, Any]] = None
)
Parameters
Parameter     Type
api_key       str
region        str
host          str
https         bool
retry_config  Optional[RetryConfig | dict[str, Any]]
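
As a minimal sketch of preparing the constructor arguments, the helper below collects them from environment variables. The variable names `TOPK_API_KEY`, `TOPK_REGION`, and `TOPK_HOST` and the helper itself are hypothetical, chosen for this example; only the keyword names `api_key`, `region`, and `host` come from the signature above.

```python
import os
from typing import Any, Mapping


def client_kwargs_from_env(env: Mapping[str, str] = os.environ) -> dict[str, Any]:
    """Collect keyword arguments for Client(...) from environment variables.

    The variable names are hypothetical and not defined by the SDK.
    """
    kwargs: dict[str, Any] = {
        "api_key": env["TOPK_API_KEY"],  # required by the constructor
        "region": env["TOPK_REGION"],    # required by the constructor
    }
    # Override the documented default host ("topk.io") only when requested.
    if "TOPK_HOST" in env:
        kwargs["host"] = env["TOPK_HOST"]
    return kwargs


example = client_kwargs_from_env({"TOPK_API_KEY": "secret", "TOPK_REGION": "my-region"})
```

With the SDK installed, the client would then presumably be built as `Client(**example)`.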

collection()

collection(self, collection: str) -> CollectionClient
Get a client for managing data operations on a specific collection such as querying, upserting, and deleting documents.

Parameters
Parameter   Type
collection  str

Returns CollectionClient

collections()

collections(self) -> CollectionsClient
Get a client for managing collections.

Returns CollectionsClient

dataset()

dataset(self, dataset: str) -> DatasetClient
Get a client for managing data operations on a specific dataset such as upserting files, managing metadata, and deleting files.

Parameters
Parameter  Type
dataset    str

Returns DatasetClient

datasets()

datasets(self) -> DatasetsClient
Get a client for managing datasets.

Returns DatasetsClient

ask()

ask(
   self,
   query: str,
   datasets: Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]],
   filter: Optional[query.LogicalExpr] = None,
   mode: Optional[Literal['auto', 'summarize', 'research']] = None,
   select_fields: Optional[Sequence[str]] = None
)
Ask a question and get streaming responses as an iterator.

Parameters
Parameter      Type
query          str
datasets       Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]]
filter         Optional[query.LogicalExpr]
mode           Optional[Literal['auto', 'summarize', 'research']]
select_fields  Optional[Sequence[str]]

Returns AskIterator

search()

search(
   self,
   query: str,
   datasets: Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]],
   top_k: int,
   filter: Optional[query.LogicalExpr] = None,
   select_fields: Optional[Sequence[str]] = None
)
Search for documents and get streaming responses as an iterator.

Parameters
Parameter      Type
query          str
datasets       Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]]
top_k          int
filter         Optional[query.LogicalExpr]
select_fields  Optional[Sequence[str]]

Returns SearchIterator

AsyncClient

Async client for interacting with the TopK API. For available regions, see regions.

Methods

Constructor
AsyncClient(
   api_key: str,
   region: str,
   host: str = "topk.io",
   https: bool = True,
   retry_config: Optional[RetryConfig | dict[str, Any]] = None
)
Parameters
Parameter     Type
api_key       str
region        str
host          str
https         bool
retry_config  Optional[RetryConfig | dict[str, Any]]

collection()

collection(self, collection: str) -> AsyncCollectionClient
Get an async client for a specific collection.

Parameters
Parameter   Type
collection  str

Returns AsyncCollectionClient

collections()

collections(self) -> AsyncCollectionsClient
Get an async client for managing collections.

Returns AsyncCollectionsClient

dataset()

dataset(self, dataset: str) -> AsyncDatasetClient
Get an async client for managing data operations on a specific dataset.

Parameters
Parameter  Type
dataset    str

Returns AsyncDatasetClient

datasets()

datasets(self) -> AsyncDatasetsClient
Get an async client for managing datasets.

Returns AsyncDatasetsClient

ask()

ask(
   self,
   query: str,
   datasets: Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]],
   filter: Optional[query.LogicalExpr] = None,
   mode: Optional[Literal['auto', 'summarize', 'research']] = None,
   select_fields: Optional[Sequence[str]] = None
)
Ask a question and get streaming responses asynchronously as an async iterator.

Parameters
Parameter      Type
query          str
datasets       Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]]
filter         Optional[query.LogicalExpr]
mode           Optional[Literal['auto', 'summarize', 'research']]
select_fields  Optional[Sequence[str]]

Returns AsyncAskIterator
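
The sketch below shows one way an async ask stream might be consumed with `async for`. It assumes the iterator yields `Progress` updates followed by a final `Answer`, which is inferred from the class descriptions in this reference rather than stated. `Progress`, `Answer`, and `fake_ask` here are local stand-ins, not the SDK's objects.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator


@dataclass
class Progress:  # stand-in for the SDK's Progress message
    update: str


@dataclass
class Answer:  # stand-in for the SDK's Answer message (fields trimmed)
    confidence: float


async def fake_ask() -> AsyncIterator[Any]:
    """Stand-in for an ask stream: progress updates, then a final answer."""
    yield Progress("searching")
    yield Progress("summarizing")
    yield Answer(confidence=0.9)


async def collect(stream: AsyncIterator[Any]) -> tuple[list[str], Any]:
    """Drain the stream, separating progress text from the final answer."""
    updates: list[str] = []
    answer = None
    async for message in stream:
        if hasattr(message, "update"):  # Progress-like message
            updates.append(message.update)
        else:  # anything else is treated as the answer in this sketch
            answer = message
    return updates, answer


updates, answer = asyncio.run(collect(fake_ask()))
```

In real code, the stream would come from `AsyncClient.ask(...)` instead of `fake_ask()`.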

search()

search(
   self,
   query: str,
   datasets: Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]],
   top_k: int,
   filter: Optional[query.LogicalExpr] = None,
   select_fields: Optional[Sequence[str]] = None
)
Search for documents and get streaming responses asynchronously as an async iterator.

Parameters
Parameter      Type
query          str
datasets       Sequence[Source] | Sequence[str] | Sequence[dict[str, Any]]
top_k          int
filter         Optional[query.LogicalExpr]
select_fields  Optional[Sequence[str]]

Returns AsyncSearchIterator

CollectionClient

Synchronous client for collection operations.

Methods

get()

get(
   self,
   ids: Sequence[str],
   fields: Optional[Sequence[str]] = None,
   lsn: Optional[str] = None,
   consistency: Optional[ConsistencyLevel] = None
)
Get documents by their IDs.

Parameters
Parameter    Type
ids          Sequence[str]
fields       Optional[Sequence[str]]
lsn          Optional[str]
consistency  Optional[ConsistencyLevel]

Returns dict[str, dict[str, Any]]

count()

count(self, lsn: Optional[str] = None, consistency: Optional[ConsistencyLevel] = None) -> int
Get the count of documents in the collection.

Parameters
Parameter    Type
lsn          Optional[str]
consistency  Optional[ConsistencyLevel]

Returns int

query()

query(
   self,
   query: query.Query,
   lsn: Optional[str] = None,
   consistency: Optional[ConsistencyLevel] = None
)
Execute a query against the collection.

Parameters
Parameter    Type
query        query.Query
lsn          Optional[str]
consistency  Optional[ConsistencyLevel]

Returns list[dict[str, Any]]

upsert()

upsert(self, documents: Sequence[Mapping[str, Any]]) -> str
Insert or update documents in the collection.

Parameters
Parameter  Type
documents  Sequence[Mapping[str, Any]]

Returns str

update()

update(self, documents: Sequence[Mapping[str, Any]], fail_on_missing: Optional[bool] = None) -> str
Update documents in the collection. Existing documents are merged with the provided fields. Missing documents are ignored. Returns the LSN at which the update was applied. If no updates were applied, the returned LSN is empty.

Parameters
Parameter        Type
documents        Sequence[Mapping[str, Any]]
fail_on_missing  Optional[bool]

Returns str
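
The returned LSN can be passed to a later read so that the read observes the write. The sketch below illustrates that flow, including the empty-LSN case, against `FakeCollection`, an in-memory stand-in written for this example. The `_id` key used to identify documents is an assumption, and `update_and_read` is a hypothetical helper.

```python
from typing import Any, Mapping, Optional, Sequence


class FakeCollection:
    """In-memory stand-in mimicking the documented update()/get() shapes."""

    def __init__(self, docs: dict[str, dict[str, Any]]) -> None:
        self._docs = docs
        self._lsn = 0

    def update(self, documents: Sequence[Mapping[str, Any]]) -> str:
        applied = False
        for doc in documents:
            if doc["_id"] in self._docs:  # missing documents are ignored
                self._docs[doc["_id"]].update(doc)  # merge provided fields
                applied = True
        if not applied:
            return ""  # empty LSN: no update was applied
        self._lsn += 1
        return str(self._lsn)

    def get(self, ids: Sequence[str], lsn: Optional[str] = None) -> dict[str, dict[str, Any]]:
        return {doc_id: self._docs[doc_id] for doc_id in ids if doc_id in self._docs}


def update_and_read(collection: Any, documents: Sequence[Mapping[str, Any]]) -> dict[str, dict[str, Any]]:
    """Apply an update, then read the same documents back at the returned LSN."""
    lsn = collection.update(documents)
    if not lsn:  # nothing was updated, so there is nothing to wait for
        return {}
    ids = [doc["_id"] for doc in documents]
    return collection.get(ids, lsn=lsn)


books = FakeCollection({"a": {"_id": "a", "title": "Old"}})
result = update_and_read(books, [{"_id": "a", "title": "New"}, {"_id": "zzz", "title": "x"}])
```

A real `CollectionClient` could be passed to `update_and_read` in place of the stand-in, since it only relies on the documented `update()` and `get()` parameters.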

delete()

delete(self, expr: Sequence[str] | query.LogicalExpr) -> str
Delete documents by their IDs or using a filter expression.

Example:

Delete documents by their IDs:
client.collection("books").delete(["id_1", "id_2"])

Delete documents by a filter expression:
from topk_sdk.query import field

client.collection("books").delete(field("published_year").gt(1997))

Parameters
Parameter  Type
expr       Sequence[str] | query.LogicalExpr

Returns str

AsyncCollectionClient

Asynchronous client for collection operations.

Methods

get()

get(
   self,
   ids: Sequence[str],
   fields: Optional[Sequence[str]] = None,
   lsn: Optional[str] = None,
   consistency: Optional[ConsistencyLevel] = None
)
Get documents by their IDs asynchronously.

Parameters
Parameter    Type
ids          Sequence[str]
fields       Optional[Sequence[str]]
lsn          Optional[str]
consistency  Optional[ConsistencyLevel]

Returns Awaitable[dict[str, dict[str, Any]]]

count()

count(
   self,
   lsn: Optional[str] = None,
   consistency: Optional[ConsistencyLevel] = None
)
Get the count of documents in the collection asynchronously.

Parameters
Parameter    Type
lsn          Optional[str]
consistency  Optional[ConsistencyLevel]

Returns Awaitable[int]
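
Because the async methods return awaitables, several calls can run concurrently. The sketch below counts documents in multiple collections with `asyncio.gather`. `FakeAsyncCollection` is a stand-in written for this example and `count_all` is a hypothetical helper.

```python
import asyncio
from typing import Any, Mapping


class FakeAsyncCollection:
    """Stand-in exposing an awaitable count(), like AsyncCollectionClient."""

    def __init__(self, size: int) -> None:
        self._size = size

    async def count(self, lsn: Any = None, consistency: Any = None) -> int:
        await asyncio.sleep(0)  # yield control, as a network call would
        return self._size


async def count_all(collections: Mapping[str, Any]) -> dict[str, int]:
    """Run count() on every collection concurrently and key results by name."""
    names = list(collections)
    counts = await asyncio.gather(*(collections[name].count() for name in names))
    return dict(zip(names, counts))


totals = asyncio.run(
    count_all({"books": FakeAsyncCollection(3), "films": FakeAsyncCollection(5)})
)
```

`asyncio.gather` returns results in the order the awaitables were passed, which is why zipping with `names` is safe here.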

query()

query(
   self,
   query: query.Query,
   lsn: Optional[str] = None,
   consistency: Optional[ConsistencyLevel] = None
)
Execute a query against the collection asynchronously.

Parameters
Parameter    Type
query        query.Query
lsn          Optional[str]
consistency  Optional[ConsistencyLevel]

Returns Awaitable[list[dict[str, Any]]]

upsert()

upsert(self, documents: Sequence[Mapping[str, Any]]) -> Awaitable[str]
Insert or update documents in the collection asynchronously.

Parameters
Parameter  Type
documents  Sequence[Mapping[str, Any]]

Returns Awaitable[str]

update()

update(
   self,
   documents: Sequence[Mapping[str, Any]],
   fail_on_missing: Optional[bool] = None
)
Update documents in the collection asynchronously. Existing documents are merged with the provided fields. Missing documents are ignored. Returns the LSN at which the update was applied. If no updates were applied, the returned LSN is empty.

Parameters
Parameter        Type
documents        Sequence[Mapping[str, Any]]
fail_on_missing  Optional[bool]

Returns Awaitable[str]

delete()

delete(self, expr: Sequence[str] | query.LogicalExpr) -> Awaitable[str]
Delete documents by their IDs or using a filter expression asynchronously.

Example:

Delete documents by their IDs:
await client.collection("books").delete(["id_1", "id_2"])

Delete documents by a filter expression:
from topk_sdk.query import field

await client.collection("books").delete(field("published_year").gt(1997))

Parameters
Parameter  Type
expr       Sequence[str] | query.LogicalExpr

Returns Awaitable[str]

Collection

Represents a collection in the TopK system.

Properties
Property    Type
name        str
org_id      str
project_id  str
region      str
schema      dict[str, schema.FieldSpec]
created_at  str

Dataset

Represents a dataset in the TopK system.

Properties
Property     Type
name         str
description  Optional[str]
org_id       str
project_id   str
region       str
created_at   str

ListEntry

Entry in a dataset.

Properties
Property       Type
id             str
name           str
size           int
mime_type      str
status         str
status_reason  Optional[str]
metadata       dict[str, Any]

CollectionsClient

Synchronous client for managing collections.

Methods

get()

get(self, collection_name: str) -> Collection
Get information about a specific collection.

Parameters
Parameter        Type
collection_name  str

Returns Collection

list()

list(self) -> list[Collection]
List all collections.

Returns list[Collection]

create()

create(self, collection_name: str, schema: Mapping[str, schema.FieldSpec]) -> Collection
Create a new collection with the specified schema.

Parameters
Parameter        Type
collection_name  str
schema           Mapping[str, schema.FieldSpec]

Returns Collection
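
A common pattern is to create a collection only if it does not already exist. The sketch below builds that on top of `list()` and `create()`, using `FakeCollectionsClient` as an in-memory stand-in. `ensure_collection` is a hypothetical helper, not part of the SDK, and it does not compare schemas or guard against concurrent creation.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass
class FakeCollection:  # stand-in for the Collection object (name only)
    name: str


class FakeCollectionsClient:
    """In-memory stand-in with the documented list()/create() shapes."""

    def __init__(self) -> None:
        self._by_name: dict[str, FakeCollection] = {}

    def create(self, collection_name: str, schema: Mapping[str, Any]) -> FakeCollection:
        created = FakeCollection(collection_name)
        self._by_name[collection_name] = created
        return created

    def list(self):
        return [*self._by_name.values()]


def ensure_collection(collections: Any, name: str, schema: Mapping[str, Any]) -> Any:
    """Return the existing collection called `name`, creating it if absent."""
    for existing in collections.list():
        if existing.name == name:
            return existing
    return collections.create(name, schema)


client = FakeCollectionsClient()
first = ensure_collection(client, "books", {})
second = ensure_collection(client, "books", {})  # found, not recreated
```

The empty schema is a placeholder for this sketch. A real call would pass a mapping of field names to `schema.FieldSpec` values.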

delete()

delete(self, collection_name: str) -> None
Delete a collection.

Parameters
Parameter        Type
collection_name  str

Returns None

AsyncCollectionsClient

Asynchronous client for managing collections.

Methods

get()

get(self, collection_name: str) -> Awaitable[Collection]
Get information about a specific collection asynchronously.

Parameters
Parameter        Type
collection_name  str

Returns Awaitable[Collection]

list()

list(self) -> Awaitable[list[Collection]]
List all collections asynchronously.

Returns Awaitable[list[Collection]]

create()

create(self, collection_name: str, schema: Mapping[str, schema.FieldSpec]) -> Awaitable[Collection]
Create a new collection with the specified schema asynchronously.

Parameters
Parameter        Type
collection_name  str
schema           Mapping[str, schema.FieldSpec]

Returns Awaitable[Collection]

delete()

delete(self, collection_name: str) -> Awaitable[None]
Delete a collection asynchronously.

Parameters
Parameter        Type
collection_name  str

Returns Awaitable[None]

DatasetsClient

Synchronous client for managing datasets.

Methods

get()

get(self, dataset_name: str) -> Dataset
Get information about a specific dataset.

Parameters
Parameter     Type
dataset_name  str

Returns Dataset

list()

list(self) -> list[Dataset]
List all datasets.

Returns list[Dataset]

create()

create(self, dataset_name: str) -> Dataset
Create a new dataset.

Parameters
Parameter     Type
dataset_name  str

Returns Dataset

update()

update(self, dataset_name: str, description: Optional[str] = None) -> Dataset
Update dataset properties.

Parameters
Parameter     Type
dataset_name  str
description   Optional[str]

Returns Dataset

delete()

delete(self, dataset_name: str) -> None
Delete a dataset.

Parameters
Parameter     Type
dataset_name  str

Returns None

DatasetClient

Synchronous client for dataset operations.

Methods

upsert_file()

upsert_file(
   self,
   doc_id: str,
   input: os.PathLike[Any] | Tuple[str, bytes, str],
   metadata: Mapping[str, Any]
)
Upsert a file to the dataset. Returns the processing handle.

Parameters
Parameter  Type
doc_id     str
input      os.PathLike[Any] | Tuple[str, bytes, str]
metadata   Mapping[str, Any]

Returns str
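
When the file content is already in memory, the tuple form of `input` can be used instead of a path. The sketch below builds such a tuple, assuming its elements are (file name, raw bytes, MIME type). The reference gives only the types `Tuple[str, bytes, str]`, so that ordering is an assumption, and `file_tuple` is a hypothetical helper.

```python
import mimetypes
from typing import Tuple


def file_tuple(
    name: str,
    data: bytes,
    default_mime: str = "application/octet-stream",
) -> Tuple[str, bytes, str]:
    """Build a (file name, bytes, MIME type) tuple, guessing the MIME type."""
    guessed, _encoding = mimetypes.guess_type(name)
    # Fall back to a generic binary type when the extension is unknown.
    return (name, data, guessed or default_mime)


payload = file_tuple("report.pdf", b"%PDF-1.7 example bytes")
```

The resulting `payload` would then be passed as the `input` argument of `upsert_file()`, together with a `doc_id` and a `metadata` mapping.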

get_metadata()

get_metadata(
   self,
   ids: Sequence[str],
   fields: Optional[Sequence[str]] = None
)
Get metadata for one or more documents.

Parameters
Parameter  Type
ids        Sequence[str]
fields     Optional[Sequence[str]]

Returns dict[str, dict[str, Any]]

update_metadata()

update_metadata(self, doc_id: str, metadata: Mapping[str, Any]) -> str
Update metadata for a file. Returns the processing handle.

Parameters
Parameter  Type
doc_id     str
metadata   Mapping[str, Any]

Returns str

delete()

delete(self, doc_id: str) -> str
Delete a file from the dataset. Returns the processing handle.

Parameters
Parameter  Type
doc_id     str

Returns str

check_handle()

check_handle(self, handle: str) -> bool
Return whether the handle has been processed.

Parameters
Parameter  Type
handle     str

Returns bool

wait_for_handle()

wait_for_handle(self, handle: str, config: Optional[WaitConfig | dict[str, Any]] = None) -> None
Poll until a handle has been processed or the timeout is reached. Raises an error if the handle is not processed within the configured timeout.

Parameters
Parameter  Type
handle     str
config     Optional[WaitConfig | dict[str, Any]]

Returns None

list()

list(
   self,
   fields: Optional[Sequence[str]] = None,
   filter: Optional[query.LogicalExpr] = None
)
List files in the dataset as a streaming iterator.

Parameters
Parameter  Type
fields     Optional[Sequence[str]]
filter     Optional[query.LogicalExpr]

Returns DatasetListIterator

AsyncDatasetsClient

Asynchronous client for managing datasets.

Methods

get()

get(self, dataset_name: str) -> Awaitable[Dataset]
Get information about a specific dataset asynchronously.

Parameters
Parameter     Type
dataset_name  str

Returns Awaitable[Dataset]

list()

list(self) -> Awaitable[list[Dataset]]
List all datasets asynchronously.

Returns Awaitable[list[Dataset]]

create()

create(self, dataset_name: str) -> Awaitable[Dataset]
Create a new dataset asynchronously.

Parameters
Parameter     Type
dataset_name  str

Returns Awaitable[Dataset]

update()

update(self, dataset_name: str, description: Optional[str] = None) -> Awaitable[Dataset]
Update dataset properties asynchronously.

Parameters
Parameter     Type
dataset_name  str
description   Optional[str]

Returns Awaitable[Dataset]

delete()

delete(self, dataset_name: str) -> Awaitable[None]
Delete a dataset asynchronously.

Parameters
Parameter     Type
dataset_name  str

Returns Awaitable[None]

AsyncDatasetClient

Asynchronous client for dataset operations.

Methods

upsert_file()

upsert_file(
   self,
   doc_id: str,
   input: os.PathLike[Any] | Tuple[str, bytes, str],
   metadata: Mapping[str, Any]
)
Upsert a file to the dataset asynchronously. Returns the processing handle.

Parameters
Parameter  Type
doc_id     str
input      os.PathLike[Any] | Tuple[str, bytes, str]
metadata   Mapping[str, Any]

Returns Awaitable[str]

get_metadata()

get_metadata(
   self,
   ids: Sequence[str],
   fields: Optional[Sequence[str]] = None
)
Get metadata for one or more documents asynchronously.

Parameters
Parameter  Type
ids        Sequence[str]
fields     Optional[Sequence[str]]

Returns Awaitable[dict[str, dict[str, Any]]]

update_metadata()

update_metadata(self, doc_id: str, metadata: Mapping[str, Any]) -> Awaitable[str]
Update metadata for a file asynchronously. Returns the processing handle.

Parameters
Parameter  Type
doc_id     str
metadata   Mapping[str, Any]

Returns Awaitable[str]

delete()

delete(self, doc_id: str) -> Awaitable[str]
Delete a file from the dataset asynchronously. Returns the processing handle.

Parameters
Parameter  Type
doc_id     str

Returns Awaitable[str]

check_handle()

check_handle(self, handle: str) -> Awaitable[bool]
Asynchronously check whether the handle has been processed.

Parameters
Parameter  Type
handle     str

Returns Awaitable[bool]

wait_for_handle()

wait_for_handle(
   self,
   handle: str,
   config: Optional[WaitConfig | dict[str, Any]] = None
)
Asynchronously poll until a handle has been processed or the timeout is reached. Raises an error if the handle is not processed within the configured timeout.

Parameters
Parameter  Type
handle     str
config     Optional[WaitConfig | dict[str, Any]]

Returns Awaitable[None]

list()

list(
   self,
   fields: Optional[Sequence[str]] = None,
   filter: Optional[query.LogicalExpr] = None
)
List files in the dataset as a streaming async iterator.

Parameters
Parameter  Type
fields     Optional[Sequence[str]]
filter     Optional[query.LogicalExpr]

Returns AsyncDatasetListIterator

Source

Represents a dataset with an optional filter.

Properties
Property  Type
dataset   str
filter    Optional[query.LogicalExpr]

Fact

Represents a fact in an ask response.

Properties
Property  Type
fact      str
ref_ids   list[str]

Chunk

Text chunk content.

Properties
Property   Type
text       str
doc_pages  list[int]

Image

Image content.

Properties
Property   Type
data       bytes
mime_type  str

Page

Page content with optional image.

Properties
Property     Type
page_number  int
image        Optional[Image]

Content

Content in a search result. One of chunk, page, or image.

Properties
Property  Type
type      Literal['chunk', 'page', 'image']
data      Chunk | Page | Image
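
Since `type` tags which variant `data` holds, code that handles content can dispatch on it. The sketch below does so with local dataclasses that mirror the documented properties. They are stand-ins for this example, not the SDK's classes, and `describe` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Chunk:
    text: str
    doc_pages: list[int]


@dataclass
class Image:
    data: bytes
    mime_type: str


@dataclass
class Page:
    page_number: int
    image: Optional[Image] = None


@dataclass
class Content:
    type: str  # 'chunk', 'page', or 'image'
    data: Union[Chunk, Page, Image]


def describe(content: Content) -> str:
    """Return a one-line summary, branching on the content type tag."""
    if content.type == "chunk":
        return f"chunk from pages {content.data.doc_pages}: {content.data.text}"
    if content.type == "page":
        suffix = " (with image)" if content.data.image else ""
        return f"page {content.data.page_number}{suffix}"
    if content.type == "image":
        return f"{content.data.mime_type} image, {len(content.data.data)} bytes"
    raise ValueError(f"unknown content type: {content.type!r}")


summary = describe(Content("chunk", Chunk("hello", [1, 2])))
```

Raising on an unknown tag keeps the function from silently mishandling a variant added later.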

SearchResult

Represents a search result in an ask response.

Properties
Property    Type
doc_id      str
doc_type    str
dataset     str
content_id  str
doc_name    str
content     Optional[Content]
metadata    dict[str, Any]

Answer

Represents a final answer in an ask response.

Properties
Property    Type
facts       list[Fact]
refs        dict[str, SearchResult]
confidence  float
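
The sketch below renders an answer's facts with their sources. It assumes that each entry in a fact's `ref_ids` is a key into the answer's `refs` mapping, which the types suggest but this reference does not state. The dataclasses are trimmed local stand-ins and `cite` is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    fact: str
    ref_ids: list[str]


@dataclass
class SearchResult:  # trimmed to the fields this sketch uses
    doc_id: str
    doc_name: str


@dataclass
class Answer:
    facts: list[Fact]
    refs: dict[str, SearchResult]
    confidence: float


def cite(answer: Answer) -> list[str]:
    """Format each fact followed by the names of the documents backing it."""
    lines = []
    for fact in answer.facts:
        # Skip reference IDs that have no entry in refs instead of failing.
        names = [answer.refs[ref_id].doc_name for ref_id in fact.ref_ids if ref_id in answer.refs]
        lines.append(f"{fact.fact} [{', '.join(names)}]")
    return lines


answer = Answer(
    facts=[Fact("The warranty lasts two years.", ["r1"])],
    refs={"r1": SearchResult(doc_id="doc-7", doc_name="warranty.pdf")},
    confidence=0.8,
)
citations = cite(answer)
```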

Progress

Represents a progress update in an ask response.

Properties
Property  Type
update    str

AskIterator

Iterator for synchronous ask responses.

AsyncAskIterator

Iterator for asynchronous ask responses.

SearchIterator

Iterator for synchronous search responses.

AsyncSearchIterator

Iterator for asynchronous search responses.

DatasetListIterator

Iterator for synchronous dataset list responses.

AsyncDatasetListIterator

Iterator for asynchronous dataset list responses.

ConsistencyLevel

Enumeration of consistency levels for operations.

Values
Value    Description
Indexed  indexed
Strong   strong

WaitConfig

Configuration for polling when waiting for a handle to be processed.

Properties
Property        Type           Description
frequency_secs  Optional[int]  How often to poll for the handle status in seconds. Default is 5.
timeout_secs    Optional[int]  Maximum time to wait before returning a timeout error in seconds. Default is 300.

Methods

Constructor
WaitConfig(frequency_secs: Optional[int] = None, timeout_secs: Optional[int] = None) -> None
Parameters
Parameter       Type
frequency_secs  Optional[int]
timeout_secs    Optional[int]
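
To make the two settings concrete, the sketch below implements a generic poll-until-done loop with the documented defaults (5 seconds between polls, 300 seconds total). It illustrates the idea only and is not the SDK's implementation. `wait_until_processed` and its `check` callback are hypothetical names.

```python
import time
from typing import Callable, Optional


def wait_until_processed(
    check: Callable[[], bool],
    frequency_secs: Optional[float] = None,
    timeout_secs: Optional[float] = None,
) -> None:
    """Call `check` repeatedly until it returns True or the timeout expires."""
    frequency = 5 if frequency_secs is None else frequency_secs
    timeout = 300 if timeout_secs is None else timeout_secs
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"not processed within {timeout} seconds")
        time.sleep(frequency)


# Simulated status checks: not ready twice, then processed.
polls = iter([False, False, True])
wait_until_processed(lambda: next(polls), frequency_secs=0, timeout_secs=1)
```

With a real dataset client, `check` could wrap `check_handle(handle)`, although `wait_for_handle()` already provides this behavior.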

RetryConfig

Configuration for retry behavior. By default, retries occur in two situations:
  1. When the server asks the client to reduce its request rate, resulting in a SlowDownError.
  2. When query(..., lsn=N) is used to wait for writes to become available.

Properties
Property     Type                     Description
max_retries  Optional[int]            Maximum number of retries to attempt. Default is 3.
timeout      Optional[int]            Total timeout for the retry chain, in milliseconds. Default is 30,000 milliseconds (30 seconds).
backoff      Optional[BackoffConfig]  The backoff configuration for the client.

Methods

Constructor
RetryConfig(
   max_retries: Optional[int] = None,
   timeout: Optional[int] = None,
   backoff: Optional[BackoffConfig] = None
)
Parameters
Parameter    Type
max_retries  Optional[int]
timeout      Optional[int]
backoff      Optional[BackoffConfig]
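
The sketch below shows how a retry count and a total time budget can interact in a retry loop. It is a generic illustration under the documented defaults (3 retries, 30,000 ms), not the SDK's retry logic. `SlowDownError` here is a local stand-in class, and `call_with_retries`, `delay_ms`, and `flaky` are hypothetical names.

```python
import time
from typing import Any, Callable


class SlowDownError(Exception):
    """Local stand-in for the error raised when the server asks to slow down."""


def call_with_retries(
    operation: Callable[[], Any],
    max_retries: int = 3,
    timeout_ms: int = 30_000,
    delay_ms: int = 0,
) -> Any:
    """Retry `operation` on SlowDownError until retries or time run out."""
    deadline = time.monotonic() + timeout_ms / 1000
    retries = 0
    while True:
        try:
            return operation()
        except SlowDownError:
            retries += 1
            # Give up once the retry count or the total time budget is spent.
            if retries > max_retries or time.monotonic() >= deadline:
                raise
            time.sleep(delay_ms / 1000)


calls = {"n": 0}


def flaky() -> str:
    """Fail twice with SlowDownError, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise SlowDownError("slow down")
    return "ok"


result = call_with_retries(flaky)
```

Note that `max_retries=3` allows up to four attempts in this sketch: the initial call plus three retries.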

BackoffConfig

Configuration for backoff behavior in retries.

Properties
Property      Type           Description
base          Optional[int]  The base for the backoff. Default is 2x backoff.
init_backoff  Optional[int]  The initial backoff in milliseconds. Default is 100 milliseconds.
max_backoff   Optional[int]  The maximum backoff in milliseconds. Default is 10,000 milliseconds (10 seconds).

Methods

Constructor
BackoffConfig(
   base: Optional[int] = None,
   init_backoff: Optional[int] = None,
   max_backoff: Optional[int] = None
)
Parameters
Parameter     Type
base          Optional[int]
init_backoff  Optional[int]
max_backoff   Optional[int]
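
To show how the three settings could combine, the sketch below computes a capped exponential schedule, assuming the delay before retry n is `init_backoff * base ** n`, limited to `max_backoff`. That formula is an assumption: the SDK's exact calculation, including any jitter, is not given in this reference. `backoff_schedule` is a hypothetical helper.

```python
def backoff_schedule(
    retries: int,
    base: int = 2,
    init_backoff: int = 100,
    max_backoff: int = 10_000,
) -> list[int]:
    """Return the delay in milliseconds before each retry, capped at max_backoff."""
    return [min(init_backoff * base**attempt, max_backoff) for attempt in range(retries)]


delays = backoff_schedule(8)
# delays == [100, 200, 400, 800, 1600, 3200, 6400, 10000]
```

Under these assumptions and the documented defaults, the cap takes effect at the eighth retry, where the uncapped delay would be 12,800 ms.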