Databricks Data Plane

Overview

🚧

Submit an environment request before continuing

Before creating a Databricks data plane, submit an environment request indicating which data plane you would like enabled. Chat with us through Intercom within Ascend, or file a support ticket by emailing [email protected]. Once your data plane is enabled, you'll receive confirmation your data plane is ready for set up.

Enterprise Security Users:
For AWS S3, please send us a VPC endpoint within your environment request.
For Azure, please include relevant subnet IDs.

Setting up a Databricks data plane is a four-step process. Within Databricks, first, create an all-purpose compute cluster. Second, create a SQL warehouse. Finally generate an access token. Once set up is complete within Databricks, create a new data service within Ascend to run on Databricks.

You’ll need advanced permissions in Databricks Data Science & Engineering and SQL warehouse. For more information on these permissions, see the Databricks requirements for SQL warehouses.

You'll also want a text editor open to temporarily record credentials, endpoints, and server host names when setting up Databricks. This information will be used to configure Ascend to run within Databricks.

Step 1: Create a Databricks all-purpose compute cluster

When creating an all-purpose compute cluster, follow the steps for using the Create button.

Use the following configuration settings:

Parameter

Value

Required

Multi-node

Required

Access mode

No isolation shared

Required

Databricks runtime version

10.4 LTS (Scala 2.12, Spark 3.2.1)

Recommended

Worker type

i3.Xlarge

Recommended

Driver tyle

Same as worker

Recommended

Enable Auto Scaling

True

Required

Enable Auto Scaling local storage

False

Recommended

Terminate after __ minutes of inactivity

30

Recommended

Under Advanced options, click Spark and copy/paste the following into the Spark config:

spark.databricks.delta.schema.autoMerge.enabled true

📘

Record the following for later use:

Once the cluster is created, select the compute cluster name. Under the Configuration tab, select Advanced Options. Select the JDBC/ODBC tab. Record the HTTP Path endpoint.

765765

Step 2: Create a SQL warehouse

When creating an SQL warehouse, follow the steps provided by Databricks. Utilize your best practices standards for Name, Cluster size, and Scaling. We recommend setting Auto Stop to “30 minutes.” Under Advanced options, we recommend setting Spot instance policy to “Cost optimized”.

📘

Record the following for later use:

After the SQL warehouse is created, record the following information from Connection details for use within Ascend (see Figure 3):

  • Server host name
  • HTTP path endpoint
10701070

Step 3: Generate an access token for Ascend

Next, generate a personal access token. When setting the Lifetime, please remember this token will need to be renewed.

📘

Record the following for later use:

After the personal access token is generated, record immediately.

Step 4: Create a Databricks data service in Ascend

From your Ascend dashboard, select +New Data Service.
Enter a name and description, and select CREATE.

Data service settings

After creating the data service, select it from dashboard.
Next, select Connections. Use the following settings:

Parameter

Value

Connection Name

DBX_NAME

Connection Type

Databricks

Access Type

Read-Write

Server Host Name

Paste recorded Databricks SQL Warehouse server host name

Execution Context for SQL Work

Use a Databricks SQL Endpoint

Endpoint ID

Paste recorded Databricks SQL Warehouse HTTP path endpoint

Execution Context for Non-SQL Work (e.g. PySpark transforms)

Use an all-purpose Databricks Cluster

Cluster ID

Paste recorded Databricks cluster JDBC/ODBC endpoint

Check Required Credentials.

  • Select Create a new credential from the drop-down menu.
  • Assign a credential name.
  • Paste the recorded Access Token from Step 3 and click CREATE.

Data plane configuration

  • In the left menu, select Data Plane Configuration.
  • Select the newly create Databricks Connection from the drop-down menu.
  • Select Update.

Creating dataflows and connectors

Once the data plane is configured, you can begin creating data flows. Create Read Connectors for Databricks utilizing the previously configured connection.


Did this page help you?