Embedding Models
Embedding models convert raw text into numerical representations that can be used by machine learning models. Spice supports running embedding models locally or using remote services such as OpenAI or la Plateforme.
Embedding models are defined in the spicepod.yaml
file as top-level components.
Example configuration in spicepod.yaml
:
embeddings:
- from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2
name: all_minilm_l6_v2
- from: openai:text-embedding-3-large
name: xl_embed
params:
openai_api_key: ${ secrets:SPICE_OPENAI_API_KEY }
- name: my_model
from: file:model.safetensors
files:
- path: config.json
- path: models/embed/tokenizer.json
Embedding models can be used either by:
- An OpenAI-compatible endpoint
- By augmenting a dataset with column-level embeddings, to provide vector-based search functionality.