TopK documents are JSON-like objects containing key-value pairs.

Upsert function

To upsert documents, pass a list of documents to the upsert() function:
client.collection("books").upsert(
    [
        {
            "_id": "book-1",
            "title": "The Great Gatsby",
            "published_year": 1925,
            "title_embedding": [0.12, 0.67, 0.82, 0.53, ...]
        },
        {
            "_id": "book-2",
            "title": "To Kill a Mockingbird",
            "published_year": 1960,
            "title_embedding": [0.42, 0.53, 0.65, 0.33, ...]
        },
        {
            "_id": "book-3",
            "title": "1984",
            "published_year": 1949,
            "title_embedding": [0.59, 0.33, 0.71, 0.61, ...]
        }
    ]
)
  • Every document must have a string _id field.
  • If a document with the specified _id doesn’t exist, a new document will be inserted.
  • If a document with the same _id already exists, the existing document will be replaced with the new one.
The upsert() function does not perform a partial update or merge; the entire document is replaced.
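The replace-not-merge behavior can be illustrated without a TopK connection. The following is a minimal in-memory sketch, not the TopK implementation: upsert_into and the store dict are hypothetical names used only to model the three rules above.

```python
def upsert_into(store: dict, docs: list) -> None:
    """Model of the upsert rules above using a plain dict keyed by _id."""
    for doc in docs:
        doc_id = doc.get("_id")
        # Rule 1: every document must have a string _id.
        if not isinstance(doc_id, str):
            raise ValueError("every document must have a string _id field")
        # Rules 2 and 3: insert if absent, otherwise replace the whole document.
        store[doc_id] = dict(doc)


store = {}
upsert_into(store, [{"_id": "book-1", "title": "The Great Gatsby", "published_year": 1925}])
# Upserting again with fewer fields drops the fields that are not resent.
upsert_into(store, [{"_id": "book-1", "title": "The Great Gatsby (revised)"}])
print(store["book-1"])  # {'_id': 'book-1', 'title': 'The Great Gatsby (revised)'}
```

Under this model, keeping published_year on the second call would require resending it alongside the changed title.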

Supported types

TopK documents are a flat structure of key-value pairs. The following value types are supported:
| Type              | Python Type | JavaScript Type | Helper Function     |
|-------------------|-------------|-----------------|---------------------|
| String            | str         | string          | -                   |
| Integer           | int         | number          | -                   |
| Float             | float       | number          | -                   |
| Boolean           | bool        | boolean         | -                   |
| String list       | list[str]   | string[]        | -                   |
| F32 list          | list[float] | number[]        | -                   |
| F64 list          | use helper  | use helper      | f64_list()          |
| I32 list          | use helper  | use helper      | i32_list()          |
| I64 list          | use helper  | use helper      | i64_list()          |
| U32 list          | use helper  | use helper      | u32_list()          |
| F32 vector        | list[float] | number[]        | f32_vector()        |
| U8 vector         | use helper  | use helper      | u8_vector()         |
| Binary vector     | use helper  | use helper      | binary_vector()     |
| F32 sparse vector | use helper  | use helper      | f32_sparse_vector() |
| U8 sparse vector  | use helper  | use helper      | u8_sparse_vector()  |
| Bytes             | use helper  | use helper      | bytes()             |
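As a rough illustration of the table, the sketch below flags fields whose plain Python values fall outside the helper-free rows (str, int, float, bool, list[str], list[float]). The functions is_native_value and non_native_fields are hypothetical and not part of the SDK, which performs its own validation. Values built with helpers such as f64_list() would also be flagged, because this sketch only recognizes built-in Python types.

```python
def is_native_value(value) -> bool:
    """Return True if value matches one of the helper-free rows of the table."""
    if isinstance(value, (str, bool, int, float)):
        return True
    if isinstance(value, list):
        # list[str] corresponds to a string list, list[float] to an F32 list.
        return all(isinstance(v, str) for v in value) or all(
            isinstance(v, float) for v in value
        )
    return False  # nested dicts and other objects are not flat values


def non_native_fields(doc: dict) -> list:
    """Names of fields whose values need a helper or are not flat."""
    return [key for key, value in doc.items() if not is_native_value(value)]


doc = {
    "_id": "book-3",
    "title": "1984",
    "published_year": 1949,
    "tags": ["dystopia", "politics"],
    "ratings": [5, 4, 5],                 # integer list: needs a helper per the table
    "author": {"name": "George Orwell"},  # nested object: not a flat value
}
print(non_native_fields(doc))  # ['ratings', 'author']
```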
Here’s an example of creating a collection with all supported types and inserting a document:

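# Note: several of these names shadow Python builtins (int, float, bool, bytes, list).
from topk_sdk.schema import (
    int,
    text,
    float,
    bool,
    f32_vector,
    u8_vector,
    binary_vector,
    f32_sparse_vector,
    u8_sparse_vector,
    bytes,
    list,
    vector_index,   # used below to index the vector fields
    keyword_index,  # used below to index the "tags" field
)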

client.collections().create(
    "books",
    schema={
        "title": text(),
        "published_year": int(),
        "price": float(),
        "is_published": bool(),
        "f32_embedding": f32_vector(dimension=1536).index(vector_index(metric="cosine")),
        "u8_embedding": u8_vector(dimension=1536).index(vector_index(metric="euclidean")),
        "binary_embedding": binary_vector(dimension=1536).index(vector_index(metric="hamming")),
        "f32_sparse_embedding": f32_sparse_vector().index(vector_index(metric="dot_product")),
        "u8_sparse_embedding": u8_sparse_vector().index(vector_index(metric="dot_product")),
        "bytes": bytes(),
        "tags": list(value_type="text").index(keyword_index()),
    },
)
Insert a document with all supported types:
from topk_sdk.data import f32_vector, u8_vector, binary_vector, f32_sparse_vector, u8_sparse_vector, bytes

client.collection("books").upsert([
  {
    "_id": "1",
    "title": "The Great Gatsby",
    "published_year": 1925,
    "price": 10.99,
    "is_published": True,
    "f32_embedding": f32_vector([0.12, 0.67, 0.82, 0.53]),
    "u8_embedding": u8_vector([0, 1, 2, 3]),
    "binary_embedding": binary_vector([0, 1, 1, 0]),
    "f32_sparse_embedding": f32_sparse_vector({0: 0.12, 6: 0.67, 17: 0.82, 97: 0.53}),
    "u8_sparse_embedding": u8_sparse_vector({0: 12, 6: 67, 17: 82, 97: 53}),
    "bytes": bytes([0, 1, 1, 0]),
    "tags": ["dream", "illusion", "desire"],
  }
])
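The sparse vector helpers in the example above take a mapping from index to value. When an embedding is produced as a dense list, a mapping of that shape can be built in plain Python. The function dense_to_sparse below is a hypothetical helper for illustration, not part of the SDK.

```python
def dense_to_sparse(dense: list) -> dict:
    """Map each non-zero position to its value, giving an {index: value} dict."""
    return {i: v for i, v in enumerate(dense) if v != 0}


dense = [0.12, 0.0, 0.0, 0.67, 0.0]
sparse = dense_to_sparse(dense)
print(sparse)  # {0: 0.12, 3: 0.67}
```

The resulting dict has the same shape as the argument passed to f32_sparse_vector() in the upsert example.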

Data constructors

See the Data constructors page for details on how to use complex data types in TopK.