Skip to main content

POST /v1/search

Performs a basic vector similarity search across one or more datasets.

Request Body

  • datasets (array of strings, Optional): Names of the dataset components to perform the similarity search on. Each dataset must have exactly one column augmented with an embedding. If None, all available datasets are used.
  • text (string): Query plaintext used to retrieve similar rows from the underlying datasets listed in the from request key.
  • limit (integer): The number of rows to return. Default: 3.
  • where (string): An SQL filter predicate to apply within the search.
  • additional_columns (array of strings): Additional columns, from the datasets, to return in the response (under .matches[*].metadata).

Example​

Spicepod

embeddings:
- name: embedding_maker
from: openai

datasets:
- name: app_messages
from: file://my.csv
columns:
- name: document_text
embeddings:
- from: embedding_maker

Request

curl -XPOST http://localhost:8090/v1/search \
-d '{
"datasets": ["app_messages"],
"text": "Tokyo plane tickets",
"where": 'user=1234321',
"additional_columns": ["timestamp"],
"limit": 3
}'

Response

{
"matches": [
{
"value": "I booked use some tickets",
"dataset": "app_messages",
"primary_key": { "id": "6fd5a215-0881-421d-ace0-b293b83452b5" },
"metadata": { "timestamp": 1724716542 }
},
{
"value": "direct to Narata",
"dataset": "app_messages",
"primary_key": { "id": "8a25595f-99fb-4404-8c82-e1046d8f4c4b" },
"metadata": { "timestamp": 1724715881 }
},
{
"value": "Yes, we're sitting together",
"dataset": "app_messages",
"primary_key": { "id": "8421ed84-b86d-4b10-b4da-7a432e8912c0" },
"metadata": { "timestamp": 1724716123 }
}
],
"duration_ms": 42
}

The v1/search endpoint supports chunked embedding columns.