Databricks Data Connector
The Databricks connector supports federated SQL queries against Databricks, either through Spark Connect or directly from Delta Tables in S3.
Configuration
Use `spice login databricks` to configure the Databricks access token for the Spice runtime.
Parameters
- `endpoint`: The endpoint of the Databricks instance.
- `mode`: The execution mode for querying against Databricks. The default is `spark_connect`. Possible values:
  - `spark_connect`: Use Spark Connect to query against Databricks.
  - `s3`: Query directly from Delta Tables in S3.
- `format`: The format of the data to query. The default is `deltalake`. Only valid when `mode` is `s3`. Possible values:
  - `deltalake`: Query Delta Tables.
- `databricks-cluster-id`: The ID of the compute cluster in Databricks to use for the query. Only valid when `mode` is `spark_connect`.
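As a minimal sketch of how these parameters combine, the dataset below selects `s3` mode with the `deltalake` format. The table reference, dataset name, and endpoint are reused from this page purely for illustration. It is an assumption that the same `from` reference applies in `s3` mode, and any S3 credentials the runtime may need are not covered here. `databricks-cluster-id` is left out because it is only valid in `spark_connect` mode.

```yaml
datasets:
  # Illustrative table reference; assumed (not confirmed) to be valid in s3 mode
  - from: databricks:spiceai.datasets.my_awesome_table
    name: my_delta_lake_table
    params:
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      mode: s3          # query Delta Tables in S3 directly instead of via Spark Connect
      format: deltalake # the default, and the only format listed for s3 mode
```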
Auth
An active personal access token for the Databricks instance is required (equivalent to `DATABRICKS_TOKEN`).
By default, the Databricks connector looks for a secret named `databricks` with the key `token`.
See Secrets Stores for more details. The token can be provided through any of the following secret stores:
- Local
- Env
- Kubernetes
- Keyring
Local:
spice login databricks --token <access-token>
Learn more about File Secret Store.
Env:
SPICE_SECRET_DATABRICKS_TOKEN=<access-token> \
  spice run
spicepod.yaml:
version: v1beta1
kind: Spicepod
name: spice-app
secrets:
  store: env
# <...>
Learn more about Env Secret Store.
Kubernetes:
kubectl create secret generic databricks \
  --from-literal=token='<access-token>'
spicepod.yaml:
version: v1beta1
kind: Spicepod
name: spice-app
secrets:
  store: kubernetes
# <...>
Learn more about Kubernetes Secret Store.
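The same secret can also be declared as a manifest rather than created imperatively. The sketch below is the standard Kubernetes `Secret` equivalent of the `kubectl create secret generic` command above. It keeps the `<access-token>` placeholder, and the file name used in the note after it is hypothetical.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: databricks   # must match the secret name the connector looks for
type: Opaque         # the type that `kubectl create secret generic` produces
stringData:
  # stringData accepts the plain value; Kubernetes base64-encodes it on write
  token: <access-token>
```

Saved as, for example, `databricks-secret.yaml`, it could be applied with `kubectl apply -f databricks-secret.yaml` in the namespace where the Spice runtime runs.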
Keyring:
Add a new keychain entry (macOS), with the token in a JSON string:
security add-generic-password -l "Databricks Secret" \
  -a spiced -s spice_secret_databricks \
  -w '{"token": "<access-token>"}'
spicepod.yaml:
version: v1beta1
kind: Spicepod
name: spice-app
secrets:
  store: keyring
# <...>
Learn more about Keyring Secret Store.
Example
datasets:
  # A reference to a table in the Databricks Unity Catalog
  - from: databricks:spiceai.datasets.my_awesome_table
    name: my_delta_lake_table
    params:
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      databricks-cluster-id: 1234-567890-abcde123