Dataset
Dataset is a high-level abstraction for storing, searching, and getting grounded answers from unstructured files (pdf, markdown, html, …). Document files uploaded into a dataset are parsed into context-aware chunks with relevant metadata and stored inside a collection. This gives our query sub-agent the ability to retrieve the most relevant context and generate a highly accurate, grounded answer.
Collection
Collection is a low-level abstraction for storing JSON-like documents with indexed fields. Collections have an opt-in schema that defines required and optional fields, field data types, and field indexes. Every document stored in a collection must have an _id field as a unique primary key.
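To make the schema rules above concrete, here is a minimal sketch in plain Python. It does not use the TopK SDK; `FieldSpec`, `SCHEMA`, and `validate` are hypothetical names, and treating `_id` as a non-empty string is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    type_: type
    required: bool = False


# Hypothetical schema (field name -> spec). This is not TopK's schema syntax.
SCHEMA = {
    "title": FieldSpec(str, required=True),
    "published_year": FieldSpec(int),
}


def validate(doc, schema=SCHEMA):
    """Return a list of problems; an empty list means the document is valid."""
    errors = []
    # Every document needs a primary key in the _id field
    # (assumed here to be a non-empty string).
    if not isinstance(doc.get("_id"), str) or not doc["_id"]:
        errors.append("missing _id")
    for name, spec in schema.items():
        if name not in doc:
            if spec.required:
                errors.append(f"missing required field: {name}")
        elif not isinstance(doc[name], spec.type_):
            errors.append(f"wrong type for field: {name}")
    # Fields not listed in the schema are accepted as-is (opt-in schema).
    return errors
```

For example, `validate({"_id": "a", "title": "Dune", "extra": 1})` returns an empty list, because fields outside the schema are not checked.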
Indexes
Documents inside a collection can have multiple indexed fields defined in the collection schema. Field indexes enable efficient retrieval of documents based on dense and sparse vector embeddings, multi-vector embeddings (late interaction), keywords (BM25), semantic similarity, and their combinations.
The ability to store and search multiple indexed fields per document minimizes storage overhead, makes filtering more efficient, and gives users flexibility at query time without having to re-ingest their data or maintain multiple indexes.
Filtering
Filtering allows queries to select only documents that match a specific condition. Filter expressions can be simple (comparison) or complex (AND/OR operators, ANY/ALL operators, regex patterns, and more). Additionally, you can use computed fields (for example, similarity score) to filter documents in the result set.
Filters are always applied before sorting and aggregation (top-k) to guarantee that the final results contain every document matching the filter, even if there is just one. We also guarantee that recall stays the same (or improves) when using highly selective filters.
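The difference between filtering before and after top-k can be seen in a small sketch. This is plain Python over a toy list, not TopK's query engine; the documents, the `score` field, and the function names are made up for illustration.

```python
import heapq

# Toy documents with a precomputed similarity score.
DOCS = [
    {"_id": "a", "lang": "en", "score": 0.95},
    {"_id": "b", "lang": "en", "score": 0.90},
    {"_id": "c", "lang": "en", "score": 0.85},
    {"_id": "d", "lang": "de", "score": 0.40},
]


def is_german(doc):
    return doc["lang"] == "de"


def filter_then_top_k(docs, predicate, k):
    # Pre-filtering: the filter runs first, so top-k only ranks matching documents.
    matching = [d for d in docs if predicate(d)]
    return heapq.nlargest(k, matching, key=lambda d: d["score"])


def top_k_then_filter(docs, predicate, k):
    # Post-filtering, shown for contrast: matches ranked below k are lost.
    top = heapq.nlargest(k, docs, key=lambda d: d["score"])
    return [d for d in top if predicate(d)]


pre = [d["_id"] for d in filter_then_top_k(DOCS, is_german, k=2)]
post = [d["_id"] for d in top_k_then_filter(DOCS, is_german, k=2)]
```

Here `pre` contains the single German document, while `post` is empty because that document never made it into the unfiltered top 2.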
Custom Scoring
Similarity is not the same as relevance. Custom scoring expressions enable relevance tuning (for example, boosting more recent documents) inside the query without having to over-fetch results and re-score them in the application layer. You can combine computed fields (similarity score, recency score, etc.) with any metadata fields (for example, source quality) inside your scoring expression to define your ranking.
Scoring expressions are always computed before sorting and aggregation to guarantee that the final results contain the most relevant documents according to your ranking logic.
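As a rough illustration of a scoring expression that blends similarity, recency, and a metadata field, consider the sketch below. It is plain Python rather than TopK query syntax; the weights, the 30-day half-life, and all field values are assumptions chosen only to make the effect visible.

```python
import heapq

# Toy candidates; similarity, age_days, and source_quality are made-up values.
DOCS = [
    {"_id": "old-exact", "similarity": 0.90, "age_days": 400, "source_quality": 0.5},
    {"_id": "fresh-close", "similarity": 0.80, "age_days": 2, "source_quality": 0.9},
    {"_id": "fresh-weak", "similarity": 0.30, "age_days": 1, "source_quality": 0.9},
]


def recency(age_days, half_life_days=30.0):
    # Exponential decay: 1.0 for a brand-new document, 0.5 after one half-life.
    return 0.5 ** (age_days / half_life_days)


def relevance(doc):
    # Hypothetical weights; a real ranking would be tuned for the application.
    return (
        0.6 * doc["similarity"]
        + 0.3 * recency(doc["age_days"])
        + 0.1 * doc["source_quality"]
    )


def top_k(docs, k):
    # Score every candidate first, then keep the k best.
    return heapq.nlargest(k, docs, key=relevance)


ranked = [d["_id"] for d in top_k(DOCS, k=2)]
```

Ranking by similarity alone would put `old-exact` first; with the combined expression, `fresh-close` moves ahead of it, and no over-fetching or re-scoring in the application is needed.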
Compute-Storage Separation
All data in TopK is durably stored in object storage. Read/write compute nodes are stateless, which means that any node can immediately take over serving requests in case of a node failure. This decoupled architecture enables cost-effective scaling and high availability without having to run consensus-based replication (Raft or Paxos) inside the cluster.
Read-Write Separation
Different applications have different read/write patterns and latency requirements. We designed our system with decoupled read and write paths to minimize the impact of write, indexing, and compaction operations on query performance. This enables sustained high-throughput writes without query latency spikes caused by background compaction or indexing contending for resources on the same node.
Read Consistency
TopK supports three different consistency levels, allowing you to choose the right trade-off between consistency, performance, and cost. By default, we provide a Balanced Consistency Mode, which balances data freshness (~750ms p99 write-to-queryable) and query efficiency for most applications. Below, we explain how TopK handles data writes and reads and how each consistency mode impacts behavior.
Balanced Consistency (Default)
Reads in this mode consider both indexed files and the most recent writes. While there may be a small delay of less than a second for some recent writes to appear, this mode offers lower cost compared to strong consistency. It is ideal for most real-world applications where near-real-time updates are sufficient.
- The Router checks both compacted files and a cached view of the WAL (the cache is refreshed at intervals of less than 1s)
- This introduces a chance of delay: if a write has just been added to the WAL but hasn’t been cached yet, it may not show up in a read
- However, this delay is minimal (less than 1s in most cases), making it a practical and efficient default
Indexed Consistency
Reads in this mode only consider fully compacted files and ignore recent WAL writes to deliver consistently low query latency. This is best suited for workloads with an asynchronous write path that do not need recent writes to become visible in queries quickly.
- The Router forwards queries only to the Executor, which reads from compacted files
- The WAL is ignored, meaning queries are always served from stable, processed data
- This reduces query latency and load, making it the most cost-efficient option for high-throughput reads
Strong Consistency
Reads in this mode always reflect the latest writes. While this ensures that all queries see the most recent updates, it comes with higher latency and cost due to additional WAL reads. This mode is recommended for cases where clients always need to see the most recent writes.
- Before serving a read, the Router explicitly checks the WAL to ensure the latest writes are reflected
- This guarantees that all queries see the most recent updates but adds overhead because it requires an additional lookup
- Because of this extra computation, strong consistency is more expensive than the other modes
Choosing the Right Mode
| Consistency Mode | Freshness | Cost | Query performance |
|---|---|---|---|
| Balanced (Default) | Near real-time (less than 1s) | Low | Good |
| Indexed | Only compacted data | Low | Fastest |
| Strong | All writes are visible | Higher | Slower |
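The three modes can be contrasted with a small in-memory model. This is a sketch under simplifying assumptions, not TopK's implementation: `ToyStore` and its methods are hypothetical, the Router and Executor are collapsed into one `read` method, and the periodic cache refresh is triggered by hand.

```python
from enum import Enum


class Consistency(Enum):
    INDEXED = "indexed"
    BALANCED = "balanced"
    STRONG = "strong"


class ToyStore:
    """In-memory model: compacted data, a WAL, and a cached view of the WAL."""

    def __init__(self):
        self.compacted = {}   # documents that have already been compacted
        self.wal = []         # acknowledged writes that are not yet compacted
        self.wal_cache = []   # periodically refreshed snapshot of the WAL

    def write(self, doc_id, doc):
        self.wal.append((doc_id, doc))

    def refresh_cache(self):
        # Stands in for the periodic (sub-second) cache refresh.
        self.wal_cache = list(self.wal)

    def compact(self):
        # Fold WAL entries into the compacted data and clear the log.
        self.compacted.update(self.wal)
        self.wal.clear()
        self.wal_cache.clear()

    def read(self, doc_id, mode=Consistency.BALANCED):
        if mode is Consistency.INDEXED:
            recent = []                # ignore the WAL entirely
        elif mode is Consistency.BALANCED:
            recent = self.wal_cache    # may lag slightly behind the WAL
        else:
            recent = self.wal          # extra lookup against the live WAL
        view = dict(self.compacted)
        view.update(recent)            # later writes override earlier ones
        return view.get(doc_id)
```

In this model, a fresh write is visible immediately under `STRONG`, after the next `refresh_cache()` under `BALANCED`, and only after `compact()` under `INDEXED`.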
LSN-based Consistency
For even more precise control over consistency, TopK also supports LSN (Log Sequence Number) based consistency. This approach allows you to ensure read-after-write consistency by specifying the exact sequence number of a write operation in your queries.
For detailed information about using LSNs in queries, see our LSN-based Consistency guide in the Query documentation.
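The idea behind LSN-based read-after-write can be sketched with a toy log. Everything here is hypothetical (`ToyLog`, `apply_up_to`, the `min_lsn` parameter) and does not reflect how TopK exposes LSNs; the sketch also raises an error where a real system would more likely wait for the write to become visible.

```python
class ToyLog:
    """Toy write log that assigns an increasing LSN to every write."""

    def __init__(self):
        self.entries = []      # entries[i] holds the write with LSN i + 1
        self.applied_lsn = 0   # highest LSN currently visible to readers

    def write(self, doc_id, doc):
        self.entries.append((doc_id, doc))
        return len(self.entries)   # LSN assigned to this write

    def apply_up_to(self, lsn):
        # Stands in for background processing catching up with the log.
        self.applied_lsn = min(lsn, len(self.entries))

    def read(self, doc_id, min_lsn=0):
        # Read-after-write: refuse to answer from a view older than min_lsn.
        if self.applied_lsn < min_lsn:
            raise TimeoutError(f"LSN {min_lsn} is not visible yet")
        visible = dict(self.entries[: self.applied_lsn])
        return visible.get(doc_id)
```

A client that keeps the LSN returned by `write` and passes it as `min_lsn` on its next read can never observe a view that predates its own write.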
Multi-tenancy
In TopK, multi-tenancy is achieved by prefixing each document's _id value with the tenant ID.
This design enables TopK to scale a single collection efficiently without performance degradation by leveraging smart sharding on document ID.
Storing documents for a specific tenant
To store documents for a specific tenant, prepend the tenant ID to the document's _id value.
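A minimal sketch of building tenant-prefixed IDs is shown below. It is plain Python and makes no SDK calls; the helper name `tenant_doc_id` and the `/` separator are assumptions, since the text above does not prescribe a particular ID format.

```python
def tenant_doc_id(tenant_id, doc_id, sep="/"):
    """Build an _id of the form '<tenant_id><sep><doc_id>'."""
    # Reject separators inside the tenant ID so that one tenant's prefix
    # can never also be a prefix of another tenant's IDs.
    if not tenant_id or sep in tenant_id:
        raise ValueError("tenant_id must be non-empty and must not contain the separator")
    return f"{tenant_id}{sep}{doc_id}"


# Documents for two tenants that share one collection.
docs = [
    {"_id": tenant_doc_id("acme", "invoice-1"), "total": 120},
    {"_id": tenant_doc_id("globex", "invoice-1"), "total": 75},
]
```

Both tenants can reuse the same local ID (`invoice-1`) because the prefix keeps the resulting `_id` values unique.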
Querying documents for a specific tenant
To query documents for a specific tenant, filter on the tenant ID prefix with the startsWith() filter.
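The effect of a prefix filter can be sketched in plain Python with `str.startswith`, which plays the role of the startsWith() filter here. The data, the `/` separator, and the `docs_for_tenant` helper are made up for illustration and are not part of the TopK SDK.

```python
# Toy collection contents with tenant-prefixed _id values.
DOCS = [
    {"_id": "acme/invoice-1", "total": 120},
    {"_id": "acme/invoice-2", "total": 80},
    {"_id": "acme-labs/invoice-1", "total": 30},
    {"_id": "globex/invoice-1", "total": 75},
]


def docs_for_tenant(docs, tenant_id, sep="/"):
    # Include the separator in the prefix so "acme" does not match "acme-labs".
    prefix = f"{tenant_id}{sep}"
    return [d for d in docs if d["_id"].startswith(prefix)]


acme_ids = [d["_id"] for d in docs_for_tenant(DOCS, "acme")]
all_ids = [d["_id"] for d in DOCS]  # no prefix filter: every tenant
```

Including the separator in the prefix is a small design choice that avoids accidental matches between tenants whose IDs share a common beginning.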
This approach preserves the ability to query across all tenants when required.