TopK documents are JSON-like objects containing key-value pairs.

Upsert function

To upsert documents, pass a list of documents to the upsert() function:

client.collection("books").upsert(
    [
        {
            "_id": "book-1",
            "title": "The Great Gatsby",
            "published_year": 1925,
            "title_embedding": [0.12, 0.67, 0.82, 0.53, ...]
        },
        {
            "_id": "book-2",
            "title": "To Kill a Mockingbird",
            "published_year": 1960,
            "title_embedding": [0.42, 0.53, 0.65, 0.33, ...]
        },
        {
            "_id": "book-3",
            "title": "1984",
            "published_year": 1949,
            "title_embedding": [0.59, 0.33, 0.71, 0.61, ...]
        }
    ]
)
  • Every document must have a string _id field.
  • If a document with the specified _id doesn’t exist, a new document will be inserted.
  • If a document with the same _id already exists, the existing document will be replaced with the new one.

The upsert() function does not perform a partial update or merge: the entire document is replaced, so any field omitted from the new document is removed.
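The replace-not-merge behavior described above can be illustrated with a small in-memory model. This is only a sketch: InMemoryCollection is a hypothetical stand-in written for this example, not part of the TopK SDK, and it models nothing beyond the rule that a document with an existing _id is overwritten whole.

```python
from typing import Any, Dict, List

Document = Dict[str, Any]


class InMemoryCollection:
    """Hypothetical stand-in for a collection; it only models replace-on-upsert."""

    def __init__(self) -> None:
        self._docs: Dict[str, Document] = {}

    def upsert(self, docs: List[Document]) -> None:
        for doc in docs:
            # Store a copy keyed by _id; any previous document is overwritten whole.
            self._docs[doc["_id"]] = dict(doc)

    def get(self, doc_id: str) -> Document:
        return self._docs[doc_id]


books = InMemoryCollection()
books.upsert([{"_id": "book-1", "title": "The Great Gatsby", "published_year": 1925}])
# The second upsert omits published_year, so that field is gone afterwards.
books.upsert([{"_id": "book-1", "title": "The Great Gatsby (Annotated)"}])
stored = books.get("book-1")
```

To change a single field while keeping the others, the practical consequence is that the complete document has to be sent again with the changed value included.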

Supported types

TopK documents are a flat structure of key-value pairs.
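A client-side check before calling upsert() can catch documents that break the two rules stated so far: a string _id and a flat structure. The sketch below is a hypothetical helper, not part of the TopK SDK, and it assumes that a raw dict used as a field value counts as nesting; how the service itself reports such documents is not covered here.

```python
from typing import Any, Dict, List


def check_documents(docs: List[Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for position, doc in enumerate(docs):
        doc_id = doc.get("_id")
        if not isinstance(doc_id, str):
            problems.append(f"document {position}: _id must be a string")
        for key, value in doc.items():
            # Assumption: a raw dict value counts as nesting and breaks the flat structure.
            if isinstance(value, dict):
                problems.append(f"document {position}: field '{key}' is a nested object")
    return problems


ok = check_documents([{"_id": "book-1", "title": "1984"}])
bad = check_documents([{"_id": 7, "author": {"name": "George Orwell"}}])
```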

The following value types are supported:

Type                Python type    JavaScript type    Helper function
String              str            string             -
Integer             int            number             -
Float               float          number             -
Boolean             bool           boolean            -
F32 vector          List[float]    number[]           f32_vector()
U8 vector           use helper     use helper         u8_vector()
Binary vector       use helper     use helper         binary_vector()
F32 sparse vector   use helper     use helper         f32_sparse_vector()
U8 sparse vector    use helper     use helper         u8_sparse_vector()
Bytes               use helper     use helper         bytes()
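The table above can be read as a decision procedure for plain Python values. The following sketch is a hypothetical helper, not an SDK function, that returns the matching row name for values that need no helper and flags everything else. It assumes that only a non-empty list made up entirely of float values qualifies as a plain F32 vector.

```python
from typing import Any


def describe_value(value: Any) -> str:
    """Map a plain Python value to a row name from the supported-types table."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "Boolean"
    if isinstance(value, int):
        return "Integer"
    if isinstance(value, float):
        return "Float"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list) and value and all(isinstance(x, float) for x in value):
        return "F32 vector"
    return "needs a helper function"
```

For example, describe_value([0.12, 0.67]) falls into the F32 vector row, while describe_value([0, 1, 2, 3]) is flagged because a list of integers would have to go through a helper such as u8_vector().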

Here’s an example of creating a collection that uses all supported types:


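# Note: int, float, bool and bytes imported here shadow the Python builtins in this module.
from topk_sdk.schema import (
    int,
    text,
    float,
    bool,
    f32_vector,
    u8_vector,
    binary_vector,
    f32_sparse_vector,
    u8_sparse_vector,
    bytes,
    vector_index,  # used by the .index(...) calls below
)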

client.collections().create(
    "books",
    schema={
        "title": text(),
        "published_year": int(),
        "price": float(),
        "is_published": bool(),
        "f32_embedding": f32_vector(dimension=1536).index(vector_index(metric="cosine")),
        "u8_embedding": u8_vector(dimension=1536).index(vector_index(metric="euclidean")),
        "binary_embedding": binary_vector(dimension=1536).index(vector_index(metric="hamming")),
        "f32_sparse_embedding": f32_sparse_vector().index(vector_index(metric="dot_product")),
        "u8_sparse_embedding": u8_sparse_vector().index(vector_index(metric="dot_product")),
        "bytes": bytes(),
    },
)

Insert a document with all supported types:

from topk_sdk.data import f32_vector, u8_vector, binary_vector, f32_sparse_vector, u8_sparse_vector, bytes

client.collection("books").upsert([
  {
    "_id": "1",
    "title": "The Great Gatsby",
    "published_year": 1925,
    "price": 10.99,
    "is_published": True,
    "f32_embedding": f32_vector([0.12, 0.67, 0.82, 0.53]),
    "u8_embedding": u8_vector([0, 1, 2, 3]),
    "binary_embedding": binary_vector([0, 1, 1, 0]),
    "f32_sparse_embedding": f32_sparse_vector({0: 0.12, 6: 0.67, 17: 0.82, 97: 0.53}),
    "u8_sparse_embedding": u8_sparse_vector({0: 12, 6: 67, 17: 82, 97: 53}),
    "bytes": bytes([0, 1, 1, 0]),
  }
])
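In the example above, the sparse vector helpers receive a dict rather than a list. Assuming the keys are dimension indices and the values are the weights at those indices (this section does not state that explicitly), a dense list can be converted into that shape with a short helper. dense_to_sparse is a hypothetical name used only for illustration.

```python
from typing import Dict, List


def dense_to_sparse(dense: List[float]) -> Dict[int, float]:
    """Keep only non-zero entries, keyed by their position in the dense list."""
    return {index: value for index, value in enumerate(dense) if value != 0.0}


sparse = dense_to_sparse([0.12, 0.0, 0.0, 0.67, 0.0])
```

The resulting dict has the same shape as the argument passed to f32_sparse_vector() in the document above.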

Helper functions

See the Helper functions page for details on how to use vector and bytes helper functions in TopK.
