Databricks Data Connector

The Databricks data connector enables federated SQL queries against Databricks, either through Spark Connect or by reading directly from Delta Tables in S3.

Configuration

Use spice login databricks to configure the Databricks access token for the Spice runtime.

Parameters

  • endpoint: The endpoint of the Databricks instance.
  • mode: The execution mode for querying against Databricks. The default is spark_connect. Possible values:
    • spark_connect: Use Spark Connect to query against Databricks.
    • s3: Query directly from Delta Tables in S3.
  • format: The format of the data to query. The default is deltalake. Only valid when mode is s3. Possible values:
    • deltalake: Query Delta Tables.
  • databricks-cluster-id: The ID of the compute cluster in Databricks to use for the query. Only valid when mode is spark_connect.

Auth

An active personal access token for the Databricks instance is required (equivalent to DATABRICKS_TOKEN).

By default, the Databricks connector looks for a secret named databricks with the key token.

Check Secrets Stores for more details.

spice login databricks --token <access-token>

Learn more about File Secret Store.

Example

datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks Unity Catalog
    name: my_delta_lake_table
    params:
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      databricks-cluster-id: 1234-567890-abcde123
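The mode and format parameters described under Parameters can be set in the same params block. The following is a minimal sketch of an s3 mode dataset, assuming the endpoint is still required in this mode; the table reference, dataset name, and endpoint are the placeholder values used elsewhere on this page, not a tested configuration. databricks-cluster-id is left out because it is only valid when mode is spark_connect.

```yaml
datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # Placeholder table reference
    name: my_delta_lake_table
    params:
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com # Assumed to still be required in s3 mode
      mode: s3          # Query directly from Delta Tables in S3 instead of Spark Connect
      format: deltalake # The default; shown only to be explicit
```

Because format defaults to deltalake, setting mode: s3 alone should be enough to switch execution modes.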