Databricks Data Connector

The Databricks connector enables federated SQL queries against Databricks, either through Spark Connect or directly from Delta Tables in S3.

Configuration

spice login databricks can be used to configure the Databricks access token for the Spice runtime.

Parameters

  • endpoint: The endpoint of the Databricks instance.
  • mode: The execution mode for querying against Databricks. The default is spark_connect. Possible values:
    • spark_connect: Use Spark Connect to query against Databricks.
    • s3: Query directly from Delta Tables in S3.
  • format: The format of the data to query. The default is deltalake. Only valid when mode is s3. Possible values:
    • deltalake: Query Delta Tables.
  • databricks-cluster-id: The ID of the compute cluster in Databricks to use for the query. Only valid when mode is spark_connect.
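
As a rough illustration of how the parameters above might combine when querying Delta Tables directly, the following sketch sets mode to s3 and format to deltalake. It assumes the dataset reference and endpoint are written the same way as in spark_connect mode, which this page does not confirm; the table name and endpoint are placeholders, and any S3 credentials the runtime may need are not shown.

```yaml
datasets:
  # Assumption: the table reference uses the same form in s3 mode as in spark_connect mode.
  - from: databricks:spiceai.datasets.my_awesome_table
    name: my_delta_lake_table
    params:
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com # placeholder endpoint
      mode: s3          # query directly from Delta Tables in S3
      format: deltalake # only valid when mode is s3
```

Because databricks-cluster-id is only valid when mode is spark_connect, it is left out of this sketch.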

Auth

An active personal access token for the Databricks instance is required (equivalent to DATABRICKS_TOKEN).

By default, the Databricks connector looks for a secret named databricks with the key token.

Check Secrets Stores for more details.

spice login databricks --token <access-token>

Learn more about File Secret Store.

Example

datasets:
  - from: databricks:spiceai.datasets.my_awesome_table # A reference to a table in the Databricks Unity Catalog
    name: my_delta_lake_table
    params:
      endpoint: dbc-a1b2345c-d6e7.cloud.databricks.com
      databricks-cluster-id: 1234-567890-abcde123