DuckDB Data Accelerator
To use DuckDB as a Data Accelerator, specify duckdb as the engine for acceleration.
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
Configuration
The DuckDB accelerator can be configured by providing the following params:
- duckdb_file: The name of the file that backs the DuckDB database. If the file does not exist, it will be created. Only applies if mode is file.
Configuration params are provided in the acceleration section for a data store. Other common acceleration fields can also be configured for DuckDB; see datasets.
datasets:
  - from: spice.ai:path.to.my_dataset
    name: my_dataset
    acceleration:
      engine: duckdb
      mode: file
      params:
        duckdb_file: /my/chosen/location/duckdb.db
Limitations
- The DuckDB accelerator does not support nested lists, or structs that contain nested struct or list fields. For example:
  - Supported:
    SELECT {'x': 1, 'y': 2, 'z': 3}
  - Unsupported:
    SELECT [['duck', 'goose', 'heron'], ['frog', 'toad']]
    SELECT {'x': [1, 2, 3]}
- The DuckDB accelerator does not support enum, dictionary, or map field types. For example:
  - Unsupported:
    SELECT MAP(['key1', 'key2', 'key3'], [10, 20, 30])
- The DuckDB accelerator does not support Decimal256 (76 digits), as it exceeds DuckDB's maximum Decimal width of 38 digits.
- Updating a dataset with DuckDB acceleration while the Spice Runtime is running (hot-reload) disables query federation for the DuckDB accelerator until the Runtime is restarted.
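The type restrictions listed above can be expressed as a small pre-flight check. The following is a minimal Python sketch, not part of Spice or DuckDB: the tuple-based type descriptions and the unsupported_reason helper are hypothetical names invented for this illustration, and any case the list above does not mention (for example, a list of structs) is assumed here to be checked by its element type only.

```python
from typing import Optional

# DuckDB's maximum Decimal width, as stated in the limitations above.
MAX_DECIMAL_PRECISION = 38
# Field kinds the limitations list as unsupported outright.
UNSUPPORTED_KINDS = {"enum", "dictionary", "map"}

# Hypothetical type descriptions used only for this sketch:
#   "int32", "utf8", ...          primitive types (plain strings)
#   ("decimal", precision)        decimal with a given number of digits
#   ("list", item_type)           list of item_type
#   ("struct", {name: type})      struct with named fields
#   ("map", key_type, value_type) map type


def kind(t) -> str:
    """Return the kind tag of a type description (a string or a tuple)."""
    return t if isinstance(t, str) else t[0]


def unsupported_reason(t) -> Optional[str]:
    """Return why a type is unsupported, or None if no listed rule applies."""
    k = kind(t)
    if k in UNSUPPORTED_KINDS:
        return f"{k} field types are not supported"
    if k == "decimal" and isinstance(t, tuple) and t[1] > MAX_DECIMAL_PRECISION:
        return f"decimal precision {t[1]} exceeds {MAX_DECIMAL_PRECISION}"
    if k == "list" and isinstance(t, tuple):
        if kind(t[1]) == "list":
            return "nested lists are not supported"
        # Assumption: otherwise a list is judged by its element type.
        return unsupported_reason(t[1])
    if k == "struct" and isinstance(t, tuple):
        for name, field_type in t[1].items():
            if kind(field_type) in ("list", "struct"):
                return f"struct field '{name}' is a nested {kind(field_type)}"
            reason = unsupported_reason(field_type)
            if reason:
                return reason
    return None


if __name__ == "__main__":
    examples = {
        "flat struct": ("struct", {"x": "int32", "y": "int32", "z": "int32"}),
        "nested list": ("list", ("list", "utf8")),
        "struct with list": ("struct", {"x": ("list", "int32")}),
        "map": ("map", "utf8", "int32"),
        "Decimal256": ("decimal", 76),
    }
    for label, type_desc in examples.items():
        print(label, "->", unsupported_reason(type_desc) or "supported")
```

The examples mirror the SQL snippets in the list: the flat struct passes, while the nested list, the struct with a list field, the map, and the 76-digit decimal are each reported with a reason.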
Memory Considerations
When accelerating a dataset using mode: memory (the default), some or all of the dataset is loaded into memory. Ensure sufficient memory is available, including overhead for queries and the runtime, especially with concurrent queries.
In-memory limitations can be mitigated by storing acceleration data on disk, which the duckdb and sqlite accelerators support by specifying mode: file.