Query documents
Use the query
method to search for documents.
TopK allows you to query documents using a SQL-inspired syntax. It comes with semantic search, text search, vector search, and filtering capabilities out of the box.
TopK’s declarative query builder allows you to select fields, chain filters, apply vector/text search in a composable way.
Semantic search
To perform a semantic search on your documents, use the semantic_similarity
function. Query you want to search for is automatically embedded using the model
specified in the semantic_index
definition.
Text search
To perform a text search on your documents, use the match
function. The match
function will by default execute against all
fields with a keyword_index
defined on them. You can provide field
parameter to specify a field to match against.
To use the match
predicate, the field must have a keyword_index
defined on it.
The above example takes a collection of books and filters to documents that have the Great
keyword in the title
field.
Vector search
To perform a vector search on your documents, use the vector_distance
method. This computes the distance between the provided query vector and vectors
stored inside the database according to the metric specified in the vector_index
definition.
The above example computes the cosine similarity between the vector and the title_embedding
field. The title_embedding
field is a vector field that represents the title of the book.
Filtering
You can filter documents based on metadata, keywords, values computed inside select()
(e.g. vector similarity), and more. Filter expressions
support all comparison operators: ==
, !=
, >
, >=
, <
, <=
, arithmetic operations: +
, -
, *
, /
, and boolean operators: |
and &
.
Metadata filtering
The example below shows filtering by metadata.
Keyword filtering
The example below shows filtering documents by the keywords they contain. It’ll match all documents that
contain either gatsby
or both catcher
and rye
.
Filter expressions
You can combine multiple filter expressions using boolean operators. For example, the query below will match all documents that were published in 1997 or between 1920 and 1980.
starts_with
The starts_with
operator can be used on string fields to match documents that start with a given prefix. This is especially
useful in multi-tenant applications where document IDs can be structured as {tenant_id}/{document_id}
and starts_with
can
then be used to scope the query to a specific tenant.
Results collection
All queries are required to have a collection stage. Currently, we only support top_k
and count
collectors.
top_k
You can use the top_k
method to return the top k
results. The top_k
method takes a field, the number of results to return, and an optional asc
parameter to sort the results in ascending order.
count
You can use the count
method to get the total number of documents matching the query. If there are no filters then count
will return the total number of documents in the collection.
Reranking
You can rerank the results of a query using the rerank
method. The rerank
method takes:
model
(optional): The model to use for reranking. If not specified, uses the default model.query
(optional): The query text to rerank against. Uses arguments fromsemantic_similarity
if not specified.fields
(optional): List of fields to use for reranking. Uses fields fromsemantic_similarity
if not specified.topk_multiple
(optional): Multiple of top-k to rerank. For example, iftop_k=10
andtopk_multiple=2
, reranker takes 20 results from the original query and returns the top 10 results.
Supported models:
cohere/rerank-v3.5
Putting it all together
You can combine text search, vector search, and filtering in a single query. Some people refer to this as a hybrid search and we believe it can provide you with the best and most relevant search results.
Notes on writing queries
When writing queries, remember that they all require the top_k
or count
method at the end.
Next steps
Need support or want to give some feedback? You can join our community or drop us an email at support@topk.io.
Was this page helpful?