Search your documents and return the most relevant passages based on your natural-language query.
How it works
When you run Search, TopK:

- Searches across your documents: searches one or more datasets for the passages most relevant to the query.
Usage
Once your documents are processed, you can start retrieving relevant passages immediately. Search is available from the CLI, the Python SDK, and the JavaScript SDK.
Query: What was the total net income of Bank of America in 2024?

Search results:

1. Condensed Statement of Cash Flows showing net income of $27,132m (2024) vs $26,515m (2023) (bank_of_america_2024.pdf, p. 170)
2. Consolidated Statement of Comprehensive Income: net income line item for 2024–2022 (bank_of_america_2024.pdf, p. 92)
3. Supporting figure from the filing, a tabular financial excerpt (boa-ask-ref-3-figure.jpg)
4. Key performance indicators: selected annual financial data, including net income (bank_of_america_2024.pdf, pp. 33–36)
5. Segment results tying to total-corporation net income (bank_of_america_2024.pdf, pp. 166–168)
6. Executive summary: summary income statement and balance sheet excerpts (bank_of_america_2024.pdf, pp. 29–30)
Understanding search results
Search returns a list of the most relevant Search Results for the provided query. Each Search Result is an object referencing a specific passage extracted from the original documents.

Search Result
Each Search Result has the following fields:

| Field | Type | Description |
|---|---|---|
| `doc_id` | string | The ID of the source document, assigned at upload time |
| `doc_name` | string | The file name of the source document, assigned at upload time |
| `doc_type` | string | The MIME type of the source document (e.g. `application/pdf`) |
| `dataset` | string | The dataset the document belongs to |
| `content_id` | string | A unique identifier for the content of this Search Result |
| `content` | object | The matched content (see Content types) |
| `metadata` | object | Metadata fields attached to the source document at upload time; must be requested (see Retrieving metadata) |
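
To make the field list concrete, here is a minimal sketch of how a client might model a Search Result locally after decoding JSON output. The `SearchResult` class and `parse_search_result` helper are hypothetical names for illustration, not part of any TopK SDK, and the shape of `content` is left as an untyped dictionary because it is not specified here.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SearchResult:
    """Hypothetical local model of a Search Result; not an SDK class."""

    doc_id: str
    doc_name: str
    doc_type: str
    dataset: str
    content_id: str
    content: dict[str, Any]
    # Stays empty unless metadata fields were requested.
    metadata: dict[str, Any] = field(default_factory=dict)


REQUIRED_FIELDS = ("doc_id", "doc_name", "doc_type", "dataset", "content_id", "content")


def parse_search_result(raw: dict[str, Any]) -> SearchResult:
    """Build a SearchResult from a decoded JSON object, checking required fields."""
    missing = [name for name in REQUIRED_FIELDS if name not in raw]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return SearchResult(
        **{name: raw[name] for name in REQUIRED_FIELDS},
        metadata=raw.get("metadata") or {},
    )


# Illustrative values only; the IDs and dataset name are made up.
example = parse_search_result(
    {
        "doc_id": "doc-1",
        "doc_name": "bank_of_america_2024.pdf",
        "doc_type": "application/pdf",
        "dataset": "filings",
        "content_id": "content-1",
        "content": {},
    }
)
print(example.doc_name)  # bank_of_america_2024.pdf
```

Defaulting `metadata` to an empty dictionary mirrors the rule that metadata is only present when requested, so downstream code can read it without a separate presence check.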
Content types
There are three content types a search result can contain:

- Chunk: a text passage extracted from a document, with optional source page number(s)
- Image: an image extracted from a document
- Page: an image of a page from a document, with source page number
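
The three content types can be thought of as a tagged union. The sketch below models them with plain Python dataclasses and branches on the type. All class and field names (`Chunk.pages`, `Image.ref`, `Page.ref`, `Page.page`) are assumptions made for illustration; the actual structure of the `content` object may differ.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Chunk:
    text: str
    # Source page number(s) are optional for chunks.
    pages: list[int] = field(default_factory=list)


@dataclass
class Image:
    ref: str  # hypothetical reference to the extracted image


@dataclass
class Page:
    ref: str  # hypothetical reference to the page image
    page: int  # source page number


Content = Union[Chunk, Image, Page]


def describe(content: Content) -> str:
    """Return a short human-readable label for a piece of content."""
    if isinstance(content, Chunk):
        if content.pages:
            pages = ", ".join(str(p) for p in content.pages)
            return f"text chunk (pages {pages})"
        return "text chunk"
    if isinstance(content, Image):
        return "extracted image"
    if isinstance(content, Page):
        return f"page image (page {content.page})"
    raise TypeError(f"unsupported content type: {type(content).__name__}")


print(describe(Chunk("Net income ...", [33, 34])))  # text chunk (pages 33, 34)
print(describe(Page("page-92.png", 92)))  # page image (page 92)
```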
Scoping the search
Query across multiple datasets or apply document filters to narrow the scope of the query.

Scoping to specific datasets

Scoping a query to specific datasets is useful when you want:

- More targeted results
- Less ambiguity across unrelated document sets
- Tighter control over what content an agent is allowed to see
Dataset scoping is available from the CLI, the Python SDK, and the JavaScript SDK. With the CLI, use `-d` / `--dataset` (repeatable).

Filter documents
A dataset may contain documents that should not be considered for a query. To exclude them, provide a filter expression. Filter expressions operate on document metadata fields, so you can use any metadata attached to your documents to narrow the search scope. This is useful when you want to query:

- Documents within a specific time range
- Documents matching a particular category or type
- Documents associated with a specific group or owner
- Documents the end user is permitted to access

Filter expressions are available from the Python SDK and the JavaScript SDK.
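
The filter syntax itself is not reproduced here. As a purely conceptual sketch, a metadata filter behaves like a predicate evaluated against each document's metadata before retrieval. The plain-Python example below makes no TopK SDK calls; the helper names (`field_equals`, `date_between`, `all_of`) and the metadata keys (`category`, `published`) are hypothetical.

```python
from datetime import date
from typing import Any, Callable

Metadata = dict[str, Any]
Predicate = Callable[[Metadata], bool]


def field_equals(name: str, value: Any) -> Predicate:
    """Match documents whose metadata field equals the given value."""
    return lambda md: md.get(name) == value


def date_between(name: str, start: date, end: date) -> Predicate:
    """Match documents whose date field falls within [start, end]."""

    def check(md: Metadata) -> bool:
        value = md.get(name)
        return isinstance(value, date) and start <= value <= end

    return check


def all_of(*predicates: Predicate) -> Predicate:
    """Combine predicates so that every one of them must match."""
    return lambda md: all(p(md) for p in predicates)


# Made-up documents for illustration.
documents = [
    {"doc_id": "a", "metadata": {"category": "annual-report", "published": date(2024, 2, 20)}},
    {"doc_id": "b", "metadata": {"category": "annual-report", "published": date(2022, 2, 22)}},
    {"doc_id": "c", "metadata": {"category": "memo", "published": date(2024, 5, 1)}},
]

in_scope = all_of(
    field_equals("category", "annual-report"),
    date_between("published", date(2024, 1, 1), date(2024, 12, 31)),
)

selected = [doc["doc_id"] for doc in documents if in_scope(doc["metadata"])]
print(selected)  # ['a']
```

Documents without the referenced field simply fail the predicate, which is one reasonable convention for missing metadata; the service's actual behavior for missing fields is not specified here.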
Retrieving metadata
By default, metadata fields are not included in search results. Pass the names of the fields you want returned and they will appear on each Search Result. This is available from the CLI, the Python SDK, and the JavaScript SDK.

With the CLI, use `--field` (repeatable) and `--output json` to include metadata field(s) in the output.

Example output