ML Model Serving

Spice supports loading and serving ONNX models and GGUF LLMs from various sources for embeddings and inference, including local filesystems, Hugging Face, and the Spice Cloud platform.

Example: Loading an LLM from Hugging Face

models:
  - name: llama_3.2_1B
    from: huggingface:huggingface.co/meta-llama/Llama-3.2-1B
    params:
      hf_token: ${ secrets:HF_TOKEN }

Example: Loading an ONNX model from Hugging Face in spicepod.yml

models:
  - from: huggingface:huggingface.co/spiceai/darts:latest
    name: hf_model
    files:
      - path: model.onnx
    datasets:
      - taxi_trips

Filesystem

Models can be hosted on a local filesystem and referenced directly in the configuration. For more details, see the Filesystem Model Component.
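
A local reference might look like the following minimal sketch. It assumes the `file:` source prefix described in the Filesystem Model Component; the path, the model name `local_fs_model`, and the `taxi_trips` dataset are placeholders rather than values taken from a real deployment.

```yaml
models:
  # Assumed: a `file:` prefix pointing at a model file on local disk.
  # The path below is hypothetical; replace it with your own.
  - from: file://absolute/path/to/my/model.onnx
    name: local_fs_model
    # Hypothetical dataset made available to the model.
    datasets:
      - taxi_trips
```

Check the Filesystem Model Component reference for the exact path forms and parameters supported by your Spice version.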

Hugging Face

Spice integrates with Hugging Face, enabling you to use a wide range of pre-trained models. For more information, see the Hugging Face Model Component.

Spice Cloud Platform

The Spice Cloud platform provides a scalable environment for training, hosting, and managing your models. For further details, see the Spice Cloud Platform Model Component.
