Setup Zuora Connector for Databricks

Provides detailed instructions for setting up the Zuora Connector for Databricks.

Configure Your Databricks Destination

To configure your Databricks destination within the Zuora Connector, you must first set up service account credentials, manage permissions, and configure dataset access so that data can transfer seamlessly from Zuora to Databricks.

Prerequisites

This Databricks integration uses Unity Catalog data governance features by default. You will need to enable Unity Catalog on your Databricks Workspace.
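
If you are not sure whether Unity Catalog is already enabled for your workspace, one quick check (a sketch, assuming you have the workspace hostname and a valid access token) is to request the workspace's metastore summary; a populated response means a Unity Catalog metastore is attached:

    curl --request GET "https://<server-hostname>/api/2.1/unity-catalog/metastore_summary" \
      --header "Authorization: Bearer <access-token>"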

Step 1: Create a SQL warehouse

Create a new SQL warehouse for writing data.

  1. Log in to your Databricks account.
  2. In the navigation pane, click into the workspace dropdown and select SQL.
  3. In the SQL console, in the SQL navigation pane, click Create > SQL warehouse.
  4. In the New SQL Warehouse menu, choose a name and configure the options for the new SQL warehouse. Under Advanced options, turn Unity Catalog to On. Select the Preview channel, and click Create.
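
If you prefer to script this step, a warehouse with equivalent settings can also be created through the Databricks SQL Warehouses REST API, as in the sketch below. The warehouse name and sizing here are illustrative placeholders, not required values:

    curl --request POST "https://<server-hostname>/api/2.0/sql/warehouses" \
      --header "Authorization: Bearer <admin-access-token>" \
      --header "Content-Type: application/json" \
      --data '{
        "name": "zuora-connector-warehouse",
        "cluster_size": "2X-Small",
        "max_num_clusters": 1,
        "auto_stop_mins": 10,
        "channel": {"name": "CHANNEL_NAME_PREVIEW"}
      }'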

Step 2: Configure Access

Collect connection information and create an access token for the data transfer service.

  1. In the SQL Warehouses console, select the SQL warehouse you created in Step 1.
  2. Click the Connection Details tab, and make a note of the Server hostname, Port, and HTTP path.
  3. Click the link to Create a personal access token.
  4. Click Generate New Token.
  5. Name the token with a descriptive comment and assign the token lifetime. A longer lifetime means you will not have to update the token as often. Click Generate.
  6. In the pop-up that follows, copy the token and save it securely.

      Using a Service Principal & Token instead of your Personal Access Token

    You may prefer to create a Service Principal for authentication instead of using a Personal Access Token. To do so, use the following steps to create a Service Principal and generate an access token.

    1. In your Databricks workspace, click your username in the top right, click Admin Settings > Identity and access, and click Manage next to the Service principals option.
    2. Click the Add service principal button, click Add new in the modal, enter a display name, and click Add.
    3. Click the newly created Service Principal, and under Entitlements select Databricks SQL access and Workspace access. Click Update and make a note of the Application ID of your newly created Service Principal.
    4. Back in the Admin Settings menu, click the Advanced section (under the Workspace admin menu). In the Access Control section, next to the Personal Access Tokens row, click Permission Settings. Search for and select the Service Principal you created, select the Can use permission, click Add, and then Save.
    5. Navigate back to the SQL Warehouses section of your workspace, click the SQL Warehouses tab, and select the SQL warehouse you created in Step 1. Click Permissions in the top right, search for and select the Service Principal you created, select the Can use permission, and click Add.
    6. Use your terminal to generate a Service Principal access token with the Personal Access Token generated above, as shown below. Record the token value. This token can now be used as the access token for the connection.

    curl --request POST "https://<databricks-instance>.cloud.databricks.com/api/2.0/token-management/on-behalf-of/tokens" \
      --header "Authorization: Bearer <personal-access-token>" \
      --header "Content-Type: application/json" \
      --data '{
        "application_id": "<application-id-of-service-principal>",
        "lifetime_seconds": <token-lifetime-in-seconds-eg-31536000>,
        "comment": "<some-description-of-this-token>"
      }'

  7. In the Databricks UI, select the Catalog tab, and select the target catalog. On the catalog's Permissions tab, click Grant. In the modal that follows, select the principal for which you generated the access token, select USE CATALOG, and click Grant.
  8. Under the target catalog, select the target schema (e.g., main.default, or create a new target schema). On the schema's Permissions tab, click Grant. In the modal that follows, select the principal for which you generated the access token, select either ALL PRIVILEGES or the following 9 privileges, and click Grant:
  • USE SCHEMA
  • APPLY TAG
  • MODIFY
  • READ VOLUME
  • SELECT
  • WRITE VOLUME
  • CREATE MATERIALIZED VIEW
  • CREATE TABLE
  • CREATE VOLUME
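
To confirm that the hostname, token, and grants all work before handing anything to Zuora, you can run a quick query through the warehouse. The sketch below uses the Databricks SQL Statement Execution API; the warehouse ID is the final segment of the HTTP path you noted earlier, and the catalog and schema placeholders are whatever you granted above:

    curl --request POST "https://<server-hostname>/api/2.0/sql/statements" \
      --header "Authorization: Bearer <access-token>" \
      --header "Content-Type: application/json" \
      --data '{
        "warehouse_id": "<warehouse-id>",
        "statement": "SHOW TABLES IN <catalog>.<schema>",
        "wait_timeout": "30s"
      }'

A SUCCEEDED state in the response indicates the principal can reach the target schema.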

Step 3: Add Your Destination

  1. After completing the initial setup, share your Databricks host address, bucket vendor, and bucket name with a Zuora representative, who will create a connection link for you.
  2. Using the connection link shared with you by Zuora, securely enter your Databricks connection details, including the server hostname, port, catalog, schema, HTTP path of the SQL warehouse, and personal access token, to finalize the connection setup.
  3. After you fill in all the required Databricks details through the provided link and test the connection, saving the destination will start the onboarding process and begin transferring data.

Verification and Data Transfer

For Databricks, your data is loaded into the catalog, schema, and tables you configured during the setup process. You can access and query this data directly within your Databricks environment using SQL queries or through integrated analytics tools.

Format of Transferred Data

  • Data transferred to Databricks is loaded as properly typed tables within the specified schema. Each table corresponds to a distinct dataset or entity from your Zuora data.
  • In addition to the primary tables, a special _transfer_status table is created within the designated schema to capture transfer metadata. This table includes a transfer_last_updated_at timestamp for each dataset, providing insight into when each dataset was last updated (see the example query after this list).
  • The exact structure and organization of your transferred data within Databricks will be determined by the configurations that you have specified during the setup process. This ensures that your data is seamlessly integrated into your existing Databricks environment and ready for analysis and reporting.
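
Once transfers are running, the _transfer_status table gives a quick freshness check. A minimal sketch, again using the Statement Execution API with placeholder host, token, warehouse ID, catalog, and schema; the table and column names come from the description above:

    curl --request POST "https://<server-hostname>/api/2.0/sql/statements" \
      --header "Authorization: Bearer <access-token>" \
      --header "Content-Type: application/json" \
      --data '{
        "warehouse_id": "<warehouse-id>",
        "statement": "SELECT * FROM <catalog>.<schema>._transfer_status ORDER BY transfer_last_updated_at DESC",
        "wait_timeout": "30s"
      }'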