Insights Batch API
The Batch API enables scripted batch upload of CSV and JSON files to provision Insights with current data on a recurring basis.
Insights users will be provided with Batch API request parameter values that can be used to periodically invoke the Batch API. The response is a secured upload URL. A script written to handle the response can then upload a file to add account, user, or event records to the Insights Master Data Layer.
Prerequisites
Before the Batch API may be used, Zuora must first prepare Insights to receive your batch data into the Master Data Layer. Your data will be secured in the Insights Master Data Layer prior to application of transforms and loading.
Provide Zuora Insights with sample files that match the structure of the batch files you will upload. Specific Batch File Constraints apply. Zuora uses these files to create a Data Source API Name identifying your batch. For each sample file submitted, you will also receive one Record Type API Name that corresponds to that file.
Zuora will provide you with values for the following request parameters:
- API Token
- Data Source API Names
- Record Type API Names
With these request parameters, your customized script can securely authenticate to Zuora and obtain a signed URL for direct upload to the Insights Master Data Layer. Zuora provides new Insights Batch API users with a customized sample script based on the example shell script below, or you can use a modified version of the example Python script, also shown below. Programmatic upload of qualified data extract files can automatically update Insights according to your requirements. Your Insights representative has specific information about implementation requirements.
Scheduled Batch File Uploads
If you plan to pass daily primary metrics into Insights, Zuora will configure a Batch API data source to process one batch upload at a set time each day.
If the data source does not receive the expected batch files within 2 hours of the set time, your Insights stream pauses until the expected batch files are received. While your Insights stream is paused, any other data you pass into Insights will not be processed.
If you plan to pass data that does not need to be updated daily, such as attributes that change infrequently, Zuora will configure a Batch API data source to process batch uploads whenever the data source receives the expected batch files.
Request
POST https://nw1.files.insights.zuora.com/api/files/upload
Basic authentication is required. Pass the API Token as the username, with an empty password, in the request header.
Request Parameters
Parameter | Required | Type | Description |
---|---|---|---|
dataSource | required | string | Data Source API Name, delivered to the customer. |
recordType | required | string | Record Type API Name, delivered to the customer. Each JSON or CSV file has a distinct Record Type API Name, and the Batch API is called once for each record type so that the proper file can be uploaded. |
batchDate | required | datetime | Format: YYYY-MM-DDTHH:mm:ssZ (UTC). All record types in a data source must be uploaded with the exact same batchDate to complete the batch and begin load. Dataflow transforms and loading to Insights begin only when all record types in a data source are uploaded with identical batchDate values. |
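Because every record type in the batch must carry an identical batchDate, the simplest approach is to compute the timestamp once and reuse it for every request. A minimal Python sketch (the data source and record type names are placeholders, not real API names):

```python
from datetime import datetime, timezone

# Compute the batch date once, in the documented UTC format
# YYYY-MM-DDTHH:mm:ssZ, and reuse it for every record type.
batch_date = datetime.now(timezone.utc).strftime('%Y-%m-%dT%H:%M:%SZ')

# Hypothetical record type names; use the Record Type API Names
# Zuora delivered for your data source.
record_types = ['account', 'accountsubscriptions']

# Every upload request in this batch carries the same batchDate,
# so the dataflow can complete the batch and begin loading.
upload_params = [
    {'dataSource': 'DataSource_APIName',
     'recordType': rt,
     'batchDate': batch_date}
    for rt in record_types
]
```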
Response
Upon successful invocation, the Batch API returns a JSON string with the following fields:

Field | Type | Description |
---|---|---|
signedUrl | string | A signed S3 upload URL. Example: https://subscriber_insights-secured-fileupload.s3-us-west-2.amazonaws.com/3/31/29/2015/12/15/111200.000/42.0.txt.gz?AWSAccessKeyId=Some_Key_123&Expires=1450393826&Signature=SomeUniqueString3w%3D |
dataSource | string | The Data Source API Name from the request. |
recordType | string | The Record Type API Name from the request. |
batchDate | string | The batch date from the request, in YYYY-MM-DDTHH:mm:ssZ format. |

The signedUrl is specific to each Record Type invocation, and the combination of the AWSAccessKeyId, Expires, and Signature parameters ensures that only the recipient of the API token may load a specific record type file to the Master Data Layer. Data integrity is preserved and enforced at the Master Data Layer. Refer to the example shell script and the example Python script to see how the signedUrl may be used to upload a file to the Master Data Layer.
Examples
cURL Request
curl -u [API_Token]: -X POST -d 'dataSource=[DataSource_APIName]&recordType=[RecordType_APIName]&batchDate=2015-12-15T11:12:00Z' "https://nw1.files.insights.zuora.com/api/files/upload"
JSON Response
{ "signedUrl": "https:\/\/zinsights-zin-prod-fileupload.s3-us-west-2.amazonaws.com\/3\/31\/29\/2015\/12\/15\/111200.000\/42.0.txt.gz?AWSAccessKeyId=Some_Key_123&Expires=1450393826&Signature=SomeUniqueString3w%3D", "dataSource": "DataSource_APIName", "recordType": "RecordType_APIName", "batchDate": "2015-12-15T11:12:00Z" }
Sample Shell Script
Shell Script Batch API Invocation. This script can be customized with your request parameters to push your files from a specified directory to the signedUrl created for the record types in your Batch API data source.
#!/bin/bash

# This script depends on the command-line utilities 'curl' and 'jq'
#
# To upload a batch:
#
#   1) Create a new directory for the batch,
#      for example: /data/batches/2015-12-01
#   2) Create a file for each record type in the data source,
#      for example: /data/batches/2015-12-01/accounts.csv
#   3) Call the script with the path to the batch directory,
#      for example: /data/scripts/upload-file.sh /data/batches/2015-12-01

# Use the current time as the 'batch date'
BATCH_DATE=$(date +'%Y-%m-%dT%H:%M:%S%z')

# Insert your API token
TOKEN=[YOUR_TOKEN]

# Insert your data source API name
DATA_SOURCE=[YOUR_DATA_SOURCE]

# Insert the API names of the record types in this data source
RECORD_TYPES="RECORD_TYPE1 RECORD_TYPE2 RECORD_TYPEX"

# The directory of the batch to upload
if [ -z "$1" ]; then
  echo "Usage: $0 batch_directory"
  exit 1
fi
BATCH_DIRECTORY=$1

if [ ! -d "$BATCH_DIRECTORY" ]; then
  echo "Failed to upload batch: directory $BATCH_DIRECTORY not found."
  exit 1
fi

# Verify that every record type file exists before uploading anything
for record_type in $RECORD_TYPES; do
  file="$BATCH_DIRECTORY/${record_type}.csv"
  if [ ! -f "$file" ]; then
    echo "Failed to upload batch: file $file not found."
    exit 1
  fi
done

for record_type in $RECORD_TYPES; do
  file="$BATCH_DIRECTORY/${record_type}.csv"
  echo "Uploading $file..."
  response=$(curl -s -u ${TOKEN}: -d "dataSource=${DATA_SOURCE}&recordType=${record_type}&batchDate=${BATCH_DATE}" https://nw1.files.insights.zuora.com/api/files/upload)
  if [ $? -ne 0 ]; then
    echo "Failed to retrieve file upload URL: $response"
    exit 1
  fi
  signed_url=$(echo "$response" | jq -r .signedUrl)
  if [ $? -ne 0 ]; then
    echo "Failed to parse API response: $response"
    exit 1
  fi
  curl --upload-file "$file" "$signed_url"
done
Example Python Script
Python Script Batch API Invocation. This Python script can be customized with your request parameters to push your files from a specified directory to the signedUrl created for the record types in your Batch API data source.
#!/usr/local/bin/python3

# This script depends on the python packages 'requests' and 'pytz'
#
# To upload a batch:
#
#   1) Create a new directory for the batch,
#      for example: /data/batches/2015-12-01
#   2) Create a file for each record type in the data source,
#      for example: /data/batches/2015-12-01/accounts.csv
#   3) Call the script with the path to the batch directory,
#      for example: /data/scripts/upload-file.py /data/batches/2015-12-01

from datetime import datetime
import os
import sys

import pytz
import requests

# Use the current time as the 'batch date'
batch_date = datetime.now(pytz.utc).strftime('%Y-%m-%dT%H:%M:%S%z')

# Insert your API token
token = "[YOUR_TOKEN]"

# Insert your data source API name
data_source = "[YOUR_DATA_SOURCE]"

# The API names of the record types in this data source
record_types = ["account", "accountsubscriptions"]

# The directory of the batch to upload
if len(sys.argv) != 2:
    print("Usage: %s batch_directory" % sys.argv[0])
    sys.exit(1)
batch_directory = sys.argv[1]

if not os.path.isdir(batch_directory):
    print("Failed to upload batch: directory %s not found or is not a directory" % batch_directory)
    sys.exit(1)

# Verify that every record type file exists before uploading anything
for record_type in record_types:
    file = os.path.join(batch_directory, "%s.csv" % record_type)
    if not os.path.exists(file):
        print("Failed to upload batch: file %s not found." % file)
        sys.exit(1)

for record_type in record_types:
    file = os.path.join(batch_directory, "%s.csv" % record_type)
    print("Uploading %s..." % file)
    params = {
        'dataSource': data_source,
        'recordType': record_type,
        'batchDate': batch_date
    }
    response = requests.post('https://nw1.files.insights.zuora.com/api/files/upload',
                             auth=(token, ''), data=params)
    if response.status_code != requests.codes.ok:
        print("Failed to retrieve file upload URL for dataSource=%s recordType=%s batchDate=%s: %s"
              % (data_source, record_type, batch_date, response.text))
        sys.exit(1)
    signed_url = response.json()['signedUrl']
    with open(file, 'rb') as f:
        response = requests.put(signed_url, data=f)
    if response.status_code != requests.codes.ok:
        print("Failed to upload file for dataSource=%s recordType=%s batchDate=%s: %s"
              % (data_source, record_type, batch_date, response.text))
        sys.exit(1)
Batch File Constraints
- File Size Limit: 100MB when uploaded via Batch API
- CSV File Format Requirements
- Comma separator: ,
- Headers: yes; the first row contains the column names
- Column Names: valid API names
- 30 characters or less
- Starts with a letter
- Lower case
- No spaces
- Encoding: UTF-8
- Quoting: double quotes ("...") are necessary when strings contain commas or special characters
- Nulls: expressed as, and interpreted as, an empty string: ,,
- Date Format: ISO-8601 "standard" (e.g. YYYY-MM-DD)
- Timestamp Format: ISO-8601 "standard" (e.g. YYYY-MM-DD HH:mm:SS Z)
- Boolean format: true or false only
- Number format (no thousand separator, period as decimal separator, no currency symbol)
- JSON File Format Requirements
- All of the above apply except separators, headers, quoting, and nulls
- Follows JSON standard
- Consistent records separated by new lines (\n), e.g.
{ } \n
{ } \n
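The constraints above can be satisfied with the Python standard library alone. A sketch that writes one compliant CSV file (UTF-8, header row of lower-case column names, double-quote quoting for values containing commas, nulls as empty strings, ISO-8601 dates) and one newline-delimited JSON file; the column names and values here are illustrative, not real record type fields:

```python
import csv
import json

# Illustrative rows; the second row's empty created_date shows a null
# expressed as an empty string.
rows = [
    {'account_id': 'a-001', 'account_name': 'Acme, Inc.', 'created_date': '2015-12-15'},
    {'account_id': 'a-002', 'account_name': 'Globex', 'created_date': ''},
]

# CSV: comma separator, header row, minimal double-quote quoting.
with open('account.csv', 'w', encoding='utf-8', newline='') as f:
    writer = csv.DictWriter(
        f,
        fieldnames=['account_id', 'account_name', 'created_date'],
        quoting=csv.QUOTE_MINIMAL)
    writer.writeheader()
    writer.writerows(rows)

# Newline-delimited JSON: one complete object per line.
with open('account.json', 'w', encoding='utf-8') as f:
    for row in rows:
        f.write(json.dumps(row) + '\n')
```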
Changing the file data structure after the Master Data Layer has been prepared using sample CSV or JSON files may lead to inconsistent data. Adding new columns of data at runtime is generally handled well by the Master Data Layer, but changing or omitting primary key columns may break the schema that was created from your sample files at implementation.
Exceptions
Exception | Condition |
---|---|
header: HTTP/1.1 401 Unauthorized | Basic authentication failed. Check API token value. |
invalid token: Please authenticate to access this resource. | Basic authentication failed. Check API token value. The signed S3 URL is valid for only one hour after it is generated. Perform the batch upload within one hour of generating the signed S3 URL with the first API call. |
Failed to retrieve file upload URL for dataSource=[someName] recordType=[someRecord] batchDate=YYYY-MM-DDT12:34:56+0000: Please authenticate to access this resource. | Basic authentication in the request header may not be set with the proper API Token value. |
ArgumentNullException | Message is null. |
Failed to retrieve file upload URL for dataSource=<datasourcename> recordType=<recordName> batchDate=<batch_date>: Invalid query parameter dataSource | Please check the name of the data source used in the query. Contact Zuora Global Services if the issue persists. |
Failed to retrieve file upload URL for dataSource=<datasourcename> recordType=<recordName> batchDate=<batch_date>: Invalid query parameter recordType | The record type is invalid (misspelled or does not exist). Check with Zuora Global Services to get the list of records in the data source. If you would like to send files corresponding to additional record types, contact Zuora Global Services with the data source name and record information. |
Failed to retrieve file upload URL for dataSource=<datasourcename> recordType=<recordName> batchDate=: Invalid ISO timestamp (valid format is yyyy-MM-ddTHH:mm:ssTZ, where TZ is Z or +HH:mm or -HH:mm) | The batchDate has an invalid format. In this example message, the batchDate value is empty. |
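Because a signedUrl is valid for only one hour after it is generated, a script can inspect the Expires query parameter before attempting the upload and request a fresh URL when the old one has lapsed. A hedged sketch; the helper function name is ours, not part of the Batch API:

```python
import time
from urllib.parse import urlparse, parse_qs

def signed_url_seconds_remaining(signed_url, now=None):
    """Return how many seconds the signed S3 URL is still valid for.

    The Expires query parameter is a Unix epoch timestamp; a negative
    result means the URL has already expired and a new one must be
    requested with another Batch API call.
    """
    if now is None:
        now = time.time()
    query = parse_qs(urlparse(signed_url).query)
    expires = int(query['Expires'][0])
    return expires - now

# Example URL shaped like the one in the JSON Response section,
# expiring at epoch 1450393826.
url = ('https://example-fileupload.s3-us-west-2.amazonaws.com/42.0.txt.gz'
       '?AWSAccessKeyId=Some_Key_123&Expires=1450393826&Signature=abc')
```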