Skip to main content

General best practices

Zuora

General best practices

To query data using Data Query, Zuora recommends the following best practices.

Familiarize yourself with the business objects

Zuora recommends that you use the following statement to get familiar with Zuora objects and object schemas before you start using Data Query to retrieve data.

SHOW TABLES
SHOW COLUMNS FROM|IN {table}
DESCRIBE {table}
DESC {table}

SQL Dialect

Zuora’s Data Query engine is powered by Trino. For advanced syntax and knowledge, refer to the Trino documentation here.  

Query help on other sites, such as StackOverflow, might not be compatible with Trino. For example, postgres offers nuanced syntax compared to Trino. 

Reference all new variables with a table and variable declaration

Zuora recommends that you add a table and variable declaration when you reference new variables in the SELECT statement. For example:

Recommended Not recommended
SELECT sub.id, a.accountnumber
FROM Subscription sub, Account a
WHERE sub.initialterm > 12 AND a.balance > 0
SELECT sub.id, a.accountnumber
FROM sub, a
WHERE sub.initialterm > 12 AND a.balance > 0

Use JOINs effectively

Table order

For better query performance, Zuora recommends that you keep the smaller table on the left and the bigger one on the right when using JOIN, including INNER JOIN and Index JOIN. See the syntax below:

SELECT *
FROM smaller_left_table
INNER JOIN bigger_right_table ON smaller_left_table.bigger_right_id = bigger_right_table.id

In the following example, Rateplan is the smaller table and Rateplancharge is the larger table. You should place Rateplan to the left of the INNER JOIN and Rateplancharge to the right.

Recommended Not recommended
SELECT c.*
FROM Rateplan rp
INNER JOIN Rateplancharge c on c.rateplanid = rp.id
WHERE rp.subscriptionid = 'xxxxx000001'
SELECT c.*
FROM Rateplancharge c
INNER JOIN Rateplan rp on c.rateplanid = rp.id
WHERE rp.subscriptionid = 'xxxxx000001'

Index JOIN

Enable Index Join for data sets that you can narrow down with unique identifiers.

Index Join is an addition to LEFT, RIGHT, OUTER, and/or INNER JOINs and can be enabled whenever a JOIN is used. Index join is a useful addition to other joins when you have a specific reference value in your WHERE clause to index another large table by.

For example, you want to get Account and Subscription information tied to a specific ProductRatePlanCharge.  The ProductRatePlanCharge table is one of the largest tables in your query. However, you only need the ProductRatePlanCharge that matches the specific ID in your WHERE clause. You can use the Index Join in your query as follows:

SELECT Account.AccountNumber, Subscription.Name 
FROM ProductRatePlanCharge PRPC 
JOIN RatePlanCharge RPC ON PRPC.id = RPC.productrateplanchargeid 
JOIN RatePlan RP ON RP.id = RPC.rateplanid 
JOIN Subscription ON Sub.id = RP.subscriptionid 
JOIN Account ON Account.id = Subscription.accountid 
WHERE PRPC.id = '1375'

Note that when using Index Join, the field used in your WHERE clause must be found in the table that you are indexing (the table in your FROM statement). This field should narrow down the set of records returned from the indexed-table to be less than the query processing limit. In this example, ProductRatePlanCharge table will be indexed by theid, which is 1375, to narrow down the ProductRatePlanCharges that the query JOINs with RatePlanCharges.

Index Join does not apply to non-indexed fields. When utilizing non-indexed fields in the WHERE clause, all table rows are considered input records, which can raise the chances of reaching the query processing limit. For more information about indexed fields of Zuora objects, see Indexed fields.

You can use Index Join through the following ways:

In general, it is safe to use and helpful when you have a lot of data to scan, although sometimes you might experience problems for certain object types, usually in the case where data does not fit the format expected for the join.

Use LIMIT to sample your query results

Before your first time running a full query, Zuora recommends that you add LIMIT statement to get a quick snapshot of the first few entries. See the syntax below.

SELECT balance, accountId, invoiceDate
FROM Invoice
WHERE accountId = '123-456-7890' AND balance > 0
ORDER BY invoiceDate
DESC LIMIT 1

Once you ensure the result is desirable and meaningful, you can run the full query.

Note that the LIMIT function only affects the output records limitation.

Filter efficiently 

A SQL filter allows you to narrow down the query result set to only the rows that you are interested in. Zuora recommends the following rules to make your filter efficient.

Define clear date ranges

Define clear date ranges with updateDate, createDate, effectiveStartDate, and so on.

Recommended Not recommended
SELECT a.accountnumber
FROM Rateplancharge rpc
...
WHERE rpc.effectivestartdate >= DATE('2019-01-01')
AND rpc.effectivestartdate <= DATE('2019-08-01')
SELECT a.accountnumber
FROM Rateplancharge rpc
...
WHERE rpc.effectivestartdate <= CURRENT_DATE

Avoid using six digits of second precision for timestamp fields

To avoid input row limits in Data Query when filtering on indexed timestamp fields, make sure to define your timestamp filter values as TIMESTAMP(3) WITH TIME ZONE (millisecond precision) values and do not use TIMESTAMP(6) WITH TIME ZONE (six digits of second precision).

Recommended Not recommended
SELECT count(*)
FROM payment
WHERE updateddate >= TIMESTAMP '2023-01-02 00:00:00.000 +00:00' and
updateddate < TIMESTAMP '2023-01-03 00:00:00.000 +00:00'
SELECT count(*)
FROM payment
WHERE updateddate >= TIMESTAMP '2023-01-02 00:00:00.000000 +00:00' and
updateddate < TIMESTAMP '2023-01-03 00:00:00.000000 +00:00'

Separate a large query into smaller ones

See the following example that separates a big query into smaller ones by date ranges. Data Query can simultaneously process a maximum of 5 queries per tenant.

Date range Separated query

January - March

SELECT a.accountnumber
FROM Rateplancharge rpc
...
WHERE rpc.effectivestartdate >= DATE('2019-01-01')
AND rpc.effectivestartdate <= DATE('2019-03-31')

April - June

SELECT a.accountnumber
FROM Rateplancharge rpc
...
WHERE rpc.effectivestartdate >= DATE('2019-04-01')
AND rpc.effectivestartdate <= DATE('2019-06-30')

July - August

SELECT a.accountnumber
FROM Rateplancharge rpc
...
WHERE rpc.effectivestartdate >= DATE('2019-07-01')
AND rpc.effectivestartdate <= DATE('2019-08-01')

Use exact strings instead of strings containing a wildcard

Recommended Not recommended
SELECT a.name
FROM Account a
WHERE a.accountnumber = 'A00000977'
SELECT a.name
FROM Account a
WHERE a.accountnumber LIKE 'A%'
SELECT a.name
FROM Account a
WHERE a.accountnumber IN ('A00000977',’A00000978’,’A00000979’)
SELECT a.name
FROM Account a
WHERE a.accountnumber LIKE 'A0000097%'

A custom picklist field can be created with the exact values and then leveraged by the query in addition to the free text field.  For example:

Original field ProductName__c New additional custom field ProductCategory__c
Annual SiliDog Basic SiliDog
Annual SiliDog Premium SiliDog
Monthly SiliDog Basic SiliDog
ArctiCat Basic ArctiCat

Original clause: where ProductName__c LIKE '%SiliDog%'

Updated clause: where ProductCategory__c = 'SiliDog'

Avoid using NULL in your filter

The following filter is NOT recommended.

SELECT rpct.includedunits
FROM Rateplanchargetier rpct
WHERE rpct.discountamount IS NULL
AND rpct.discountpercentage IS NULL

Avoid using functions on the left-hand side of the comparison operator

Recommended Not recommended
SELECT a.accountnumber
FROM Rateplancharge rpc
...
WHERE rpc.effectivestartdate > DATE('2019-01-01')
SELECT a.accountnumber
FROM Rateplancharge rpc
...
WHERE DATE_DIFF('MONTH', rpc.effectivestartdate, DATE('2019-01-01')) < 0

Filter on fields with subsets of unique values

The following table lists the commonly used large tables and the fields with subsets of unique values. Zuora recommends that you use these fields to optimize your filter.

Zuora business object Fields with subsets of unique values

RatePlanCharge

  • createdDate, updatedDate, effectiveStartDate: used to filter by date range

  • billingDay: used for unique days (for example, not the 1st, 15th, or 30/31st)

RatePlanChargeTier

  • createdDate, updatedDate: used to filter by date range

  • startingUnit, endingUnit: used for batch jobs

InvoiceItem

  • chargeDate, createdDate, revRecStartDate, serviceStartDate, serviceEndDate, updatedDate: used to filter by date range

TaxationItem

  • createdDate, taxDate, updatedDate: used to filter by date range

  • locationCode, jurisdiction: used to filter location-specific data for objects like Tax

JournalEntry

  • accountPeriodId: used to group sets

  • createdDate, journalEntryDate, transferDate, updatedDate: used to filter by date range

  • notes: used as the unique information to filter by

  • status: used to filter journal entries

Usage

  • createdDate, startDateTime, endDateTime, submissionDateTime, updatedDate: used to filter by date range

Avoid using LIKE filters

LIKE filters will not be pushed down for optimization and they usually decrease the query performance. The following example is NOT recommended.

SELECT SourceType FROM usage WHERE Id LIKE '8a28b7f57cc19b8d017cc352976b7%'