A Spicepod is a package that encapsulates application-centric datasets and machine learning (ML) models.

Spicepods are analogous to code packaging systems such as NPM, but they extend the concept to data and ML models.


A Spicepod is described by a YAML manifest file, typically named spicepod.yaml, which includes the following key sections:

  • Metadata: Basic information about the Spicepod, such as its name and version.
  • Datasets: Definitions of datasets that are used or produced within the Spicepod.
  • Catalogs: Definitions of catalogs that are used within the Spicepod.
  • Models: Definitions of ML models that the Spicepod manages, including their sources and associated datasets.
  • Secrets: Configuration for any secret stores used within the Spicepod.

Example Manifest

version: v1beta1
kind: Spicepod
name: my_spicepod

datasets:
  - from:
    name: qs
    acceleration:
      enabled: true
      refresh_mode: append

models:
  - from: file://model_path.onnx
    name: my_model
    datasets:
      - qs

secrets:
  store: env

Key Components


Datasets in a Spicepod can be sourced from various locations, including local files or remote databases. They can be materialized and accelerated using different engines such as DuckDB, SQLite, or PostgreSQL to optimize performance.
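
As a rough illustration of the paragraph above, a dataset entry that opts into acceleration with a specific engine might look like the sketch below. The source path and dataset name are hypothetical, and the `engine` field and its `duckdb` value are assumptions about the manifest schema rather than something stated on this page; check the Datasets reference for the exact field names.

```yaml
datasets:
  # Hypothetical local file source and dataset name, for illustration only.
  - from: file://sales.parquet
    name: sales
    acceleration:
      enabled: true
      # Assumed field: selects which engine materializes the data locally.
      engine: duckdb
```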

Learn more at Datasets.


Catalogs in a Spicepod can contain multiple schemas. Each schema, in turn, contains multiple tables where the actual data is stored.
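
The catalog, schema, and table hierarchy described above could be sketched as follows. The connector and catalog names are hypothetical placeholders, and the field names are assumed to mirror those used for datasets; the Catalogs reference has the authoritative schema.

```yaml
catalogs:
  # Hypothetical connector and remote catalog; "analytics" is the local name.
  - from: my_connector:my_remote_catalog
    name: analytics

# Tables would then be addressed by a three-part name of the form
# catalog.schema.table, for example: analytics.public.orders
```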

Learn more at Catalogs.


ML models are integrated into the Spicepod similarly to datasets. The models can be specified using paths to local files or remote locations. ML inference can be performed using the models and datasets defined within the Spicepod.

Learn more at Models.

Spicepods support various secret stores for managing sensitive information such as API keys or database credentials. Supported secret store types include environment variables, files, AWS Secrets Manager, Kubernetes secrets, and keyrings.
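
Switching stores would presumably be a matter of changing the `store` value shown in the example manifest. The sketch below assumes `keyring` is the identifier for the keyring store; that identifier is an assumption, so confirm it against the Secret Stores reference.

```yaml
secrets:
  # Assumed identifier for the keyring secret store;
  # the example manifest above uses "env" for environment variables.
  store: keyring
```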

Learn more at Secret Stores.