
Apache Spark Connector

The Apache Spark Connector enables federated SQL queries against a Spark cluster using Spark Connect.

datasets:
  - from: spark:spiceai.datasets.my_awesome_table
    name: my_table
    params:
      spark_remote: sc://localhost:15002

Configuration

Auth Examples

Spark clusters configured to accept authenticated requests should not set spark_remote as an inline dataset param, because the value will contain sensitive data. Instead, use the secret replacement syntax to load the value from a secret store, e.g. ${secrets:my_spark_remote}.

Check Secrets Stores for more details.
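To illustrate why spark_remote can be sensitive, here is a minimal sketch of a Spark Connect connection string that carries an authentication token. It assumes the Spark Connect client connection string format sc://host:port/;key=value with the token and use_ssl parameters. The host, port, and token values are hypothetical placeholders, and your cluster may require different parameters.

```shell
# Hypothetical host, port, and token; substitute your cluster's values.
SPARK_HOST="spark.example.com"
SPARK_PORT="15002"
SPARK_TOKEN="example-token"

# Assumed format: sc://host:port/;key=value;key=value
# The token parameter is what makes the whole value sensitive.
# Keep the value quoted: an unquoted ';' would end the shell command.
export SPICE_SPARK_REMOTE="sc://${SPARK_HOST}:${SPARK_PORT}/;use_ssl=true;token=${SPARK_TOKEN}"
```

Because the token is embedded in the string itself, the whole value should be treated as a secret rather than written inline in the Spicepod.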

SPICE_SPARK_REMOTE=<spark-remote> \
spice run
# Or using the CLI to configure the secrets into an `.env` file
spice login spark --spark_remote <spark-remote>

.env

SPICE_SPARK_REMOTE=<spark-remote>

spicepod.yaml

version: v1beta1
kind: Spicepod
name: spice-app

secrets:
  - from: env
    name: env

datasets:
  - from: spark:spiceai.datasets.my_awesome_table
    name: my_table
    params:
      spark_remote: ${env:SPICE_SPARK_REMOTE}

Learn more about the Env Secret Store.