Embeddings
Embeddings allow you to convert text or other data into vector representations, which can be used for various machine learning and natural language processing tasks.
embeddings
​
The embeddings
section in your configuration allows you to specify one or more embedding models to be used with your datasets.
Example:
embeddings:
- from: huggingface:huggingface.co/sentence-transformers/all-MiniLM-L6-v2:latest
name: text_embedder
params:
max_length: "128"
datasets:
- my_text_dataset
from
​
The from
field specifies the source of the embedding model. It supports the following prefixes:
huggingface:huggingface.co
- Models from Hugging Facefile:
- Local file pathsopenai
- OpenAI models
Follows the same convention as models.from
.
name
​
A unique identifier for this embedding component.
files
​
Optional. A list of files associated with this model. Each file has:
path
: The path to the filename
: Optional. A name for the filetype
: Optional. The type of the file (automatically determined if not specified)
Follows the same convention as models.files
.
params
​
Optional. A map of key-value pairs for additional parameters specific to the embedding model.
dependsOn
​
Optional. A list of dependencies that must be loaded and available before this embedding model.