
Spice.ai OSS and the Spice.ai Cloud Platform

You can use any number of predefined datasets available from the Spice.ai Cloud Platform in the Spice runtime.

A list of publicly available datasets from Spice.ai can be found here: https://docs.spice.ai/building-blocks/datasets.

To access public datasets from Spice, you first need to create a Spice.ai account on the free tier.

Navigate to spice.ai and create a new account by clicking on Try for Free.


After creating an account, you will need to create an app in order to create an API key.


You will now be able to access datasets from the Spice.ai Platform. For this quickstart, we will be using the eth.recent_blocks dataset.

Step 1. Log in and authenticate from the command line using the spice login command. A pop-up browser window will prompt you to authenticate:

spice login

Step 2. Initialize a new project and start the runtime:

# Initialize a new Spice app
spice init spice_app

# Change to app directory
cd spice_app

# Start the runtime
spice run

Step 3. Configure the dataset:

In a new terminal window, configure a new dataset using the spice dataset configure command:

spice dataset configure

Enter a dataset name that will be used to reference the dataset in queries. This name does not need to match the name in the dataset source.

dataset name: (spice_app) eth_recent_blocks

Enter the description of the dataset:

description: Recent Ethereum blocks

Enter the location of the dataset:

from: spice.ai/eth.recent_blocks

Select y when prompted whether to accelerate the data:

Locally accelerate (y/n)? y

You should see the following output from your runtime terminal:

2024-02-21T22:49:10.038461Z  INFO runtime: Loaded dataset: eth_recent_blocks

Step 4. In a new terminal window, use the Spice SQL REPL to query the dataset:

spice sql
SELECT number, size, gas_used FROM eth_recent_blocks LIMIT 10;

The output displays the results of the query along with the query execution time:

+----------+--------+----------+
| number   | size   | gas_used |
+----------+--------+----------+
| 19281345 | 400378 | 16150051 |
| 19281344 | 200501 | 16480224 |
| 19281343 | 97758  | 12605531 |
| 19281342 | 89629  | 12035385 |
| 19281341 | 133649 | 13335719 |
| 19281340 | 307584 | 18389159 |
| 19281339 | 89233  | 13391332 |
| 19281338 | 75250  | 12806684 |
| 19281337 | 100721 | 11823522 |
| 19281336 | 150137 | 13418403 |
+----------+--------+----------+

Query took: 0.004057791 seconds
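
If you want to post-process REPL output that you have copied or captured as text, the bordered table and the timing line can be parsed with a few lines of Python. This is a minimal sketch, not part of the Spice tooling: the helper names parse_ascii_table and parse_query_seconds are hypothetical, and it assumes the output has the same shape as the sample shown in this quickstart.

```python
import re


def parse_ascii_table(text):
    """Parse a '+---+' bordered table into a list of dicts keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines, blank lines, and the timing line
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first '|' line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def parse_query_seconds(text):
    """Extract the value from a 'Query took: N seconds' line, or None if absent."""
    match = re.search(r"Query took:\s*([0-9.]+)\s*seconds", text)
    return float(match.group(1)) if match else None


# Sample text copied from the output in this quickstart (first two rows only).
sample = """\
+----------+--------+----------+
| number   | size   | gas_used |
+----------+--------+----------+
| 19281345 | 400378 | 16150051 |
| 19281344 | 200501 | 16480224 |
+----------+--------+----------+

Query took: 0.004057791 seconds
"""

rows = parse_ascii_table(sample)
elapsed = parse_query_seconds(sample)
print(rows[0]["number"], elapsed)
```

All values are kept as strings; convert them with int() where you need arithmetic.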

You can experiment with the time it takes to execute queries against non-accelerated datasets. To do so, change the acceleration setting from true to false in the datasets.yaml file.
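
A single timing can be noisy, so one way to compare accelerated and non-accelerated runs is to repeat the query several times and take the median. The sketch below is a generic timing harness, not a Spice API: time_query and fake_run_query are hypothetical names, and run_query stands in for whatever function you use to send SQL to the runtime.

```python
import statistics
import time


def time_query(run_query, sql, repeats=5):
    """Return the median wall-clock seconds over `repeats` calls of run_query(sql)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(sql)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def fake_run_query(sql):
    # Hypothetical stand-in for a real client call; it only sleeps briefly.
    time.sleep(0.01)


sql = "SELECT number, size, gas_used FROM eth_recent_blocks LIMIT 10;"
median_s = time_query(fake_run_query, sql, repeats=3)
print(f"median: {median_s:.4f} s")
```

Run the harness once with acceleration enabled and once with it disabled, then compare the two medians.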