Salesforce Read Connector

After you have set up an Ascend Read Connection, the next step is to set up the Read Connector in your dataflow.

Creating a New Salesforce Connector

The initial configuration of your Salesforce Connector can load all historical data or only data dating back to a specified date. If the initial load is large, you can load the data by month to avoid timeouts in Salesforce. Once the initial data has been loaded into Ascend, the Salesforce Connector ingests every Salesforce record that has changed since the last refresh.

Figure 1

  • NAME: Read connector name to identify the specific data being collected.
  • DESCRIPTION: Additional information about the connector.

Figure 2

  • BROWSE CONNECTION: Click this button to explore Salesforce objects to ingest. Select the object you want to ingest and press CONFIRM.
  • SALESFORCE OBJECT: The name of the object such as Account or Activity History.
  • USE SALESFORCE BULK API (COMPOUND FIELD WILL BE EXCLUDED): Ascend supports both the REST API and the Bulk API. Check this box to use the Bulk API. Because Salesforce does not support compound fields in the Bulk API, those fields are excluded. For additional information, see Salesforce's Bulk API documentation.
  • RECORDS AS JSON STRING: If selected, the data in the Generate Schema section will be returned as a JSON string of the full record.
  • QUERYALL (INCLUDE RECORDS THAT ARE DELETED): If unchecked, Ascend will not load records that are soft deleted. Note that you will not be able to recreate the current state of the table if this box is unchecked.

Figure 4

If the RECORDS AS JSON STRING box is not selected, the data will be returned as multiple columns and rows.

Figure 5

  • LOAD BY MONTH (FOR LARGE TABLE TO AVOID TIMEOUT): If selected, Ascend will ingest the data in month increments to help avoid a timeout with the Salesforce API. Each increment will be saved before loading the next increment, with the data being stored in a Partition, the details of which can be viewed in the Partitions tab.
  • LOAD START DATE: For the initial data load, specify the date you would like to start pulling from in the format YYYY-MM-DD.
  • DATE FIELD OVERRIDE: Identifies which field on the table to use for CDC (see the CDC section below for details).
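To make the LOAD BY MONTH and LOAD START DATE options concrete, the sketch below splits an initial load into one window per calendar month. It is only an illustration: the function name `month_increments` is hypothetical, and the assumption that increments follow calendar-month boundaries is ours, not a documented detail of how Ascend draws them.

```python
from datetime import date


def month_increments(load_start: date, today: date) -> list[tuple[date, date]]:
    """Split the span from load_start to today into half-open (start, end)
    windows, one per calendar month. Hypothetical helper for illustration."""
    windows = []
    start = load_start
    while start <= today:
        # First day of the month following `start`.
        if start.month == 12:
            next_month = date(start.year + 1, 1, 1)
        else:
            next_month = date(start.year, start.month + 1, 1)
        windows.append((start, next_month))
        start = next_month
    return windows


# Example: a LOAD START DATE of 2023-11-15, evaluated on 2024-01-10.
windows = month_increments(date(2023, 11, 15), date(2024, 1, 10))
for start, end in windows:
    print(f"load records from {start} up to (not including) {end}")
```

Each window would correspond to one increment that is saved before the next one is loaded.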

Generate Schema

Once you click on the GENERATE SCHEMA button, the parser will create a schema and a data preview will be populated.

  • Add schema column: Add a custom column to the generated schema.

The generated schema depends on whether the RECORDS AS JSON STRING box is checked.

Refresh Schedule

The refresh schedule specifies how often Ascend checks the table to see if there's new data since the last refresh (see CDC section below). If new data is found, Ascend will automatically start the ingestion process for the new data.

Figure 6

CDC

Ascend has implemented a CDC (change data capture) approach for Salesforce tables that saves you from doing a full load every time you want the current snapshot. Instead, Ascend pulls only the records that have been updated since the last load. On the initial load, Ascend will pull in all data in the table, or a subset based upon the "Load Start Date". How Ascend determines what data to pull on subsequent loads depends on your "Load By Month" selection.

Load By Month = Yes
In this case Ascend groups data into monthly batches (or partitions) that you can inspect in the Partitions tab. If you set the Refresh Schedule to run more frequently than monthly, each refresh will pull all data for the current month. For example, if your Refresh Schedule is set to daily at 1:00 AM, the refresh on June 15 at 1:00 AM will pull data from June 1 to June 15. The refresh on June 16 will pull data from June 1 to June 16, which replaces the June 1 to June 15 data.
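The replacement behavior in the June example can be sketched as follows. This is a simplified model, not Ascend's implementation: the `refresh` function, the `partitions` dictionary, the "YYYY-MM" partition key, and the year 2024 are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical in-memory stand-in for the Partitions tab:
# one entry per month, keyed by "YYYY-MM".
partitions: dict[str, tuple[datetime, datetime]] = {}


def refresh(partitions: dict, refresh_time: datetime) -> str:
    """Re-pull the whole current month and overwrite that month's partition."""
    month_start = refresh_time.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    key = month_start.strftime("%Y-%m")
    # Assigning to an existing key replaces the earlier pull for the same month.
    partitions[key] = (month_start, refresh_time)
    return key


refresh(partitions, datetime(2024, 6, 15, 1, 0))  # pulls June 1 to June 15
refresh(partitions, datetime(2024, 6, 16, 1, 0))  # pulls June 1 to June 16, replacing the above
print(partitions)
```

After both refreshes there is still a single June partition, now covering June 1 through the June 16 refresh.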

Load By Month = No
In this case Ascend will use a date field to determine which records to load. By default, Ascend checks for a date field to use for CDC in the following order: SystemModStamp, CreatedDate, LastUpdatedDate, LoginDate. If you select a Date Field Override, Ascend will use that field for CDC instead. For the initial load, Ascend will bundle all of the data in the table into a single Partition, which can be inspected in the Partitions tab. Subsequent loads will use the CDC field to bring in only the changed records. Ascend highly recommends that you not use Hard Deletes, as those records cannot be detected by CDC.
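The field-selection order described above can be sketched as a simple lookup. The function name `pick_cdc_field` is hypothetical, and matching field names case-insensitively is an assumption; the text above only specifies the order of the default fields and that an override takes precedence.

```python
from typing import Iterable, Optional

# Default order as listed in this section.
DEFAULT_CDC_FIELDS = ["SystemModStamp", "CreatedDate", "LastUpdatedDate", "LoginDate"]


def pick_cdc_field(object_fields: Iterable[str],
                   date_field_override: Optional[str] = None) -> Optional[str]:
    """Return the field to use for CDC, or None if no candidate exists."""
    # A Date Field Override always wins over the default order.
    if date_field_override:
        return date_field_override
    # Assumption: compare names case-insensitively, returning the object's own spelling.
    by_lower = {name.lower(): name for name in object_fields}
    for candidate in DEFAULT_CDC_FIELDS:
        match = by_lower.get(candidate.lower())
        if match is not None:
            return match
    return None


print(pick_cdc_field(["Id", "Name", "CreatedDate"]))
print(pick_cdc_field(["Id", "Name"], date_field_override="Custom_Date__c"))
```

Here `Custom_Date__c` is a made-up custom field name used only to show the override path.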

For example, if your refresh schedule is set to hourly and your first data load is at 2:00 PM on June 15, Ascend will bring in the entire table in a single partition. The next load at 3:00 PM will use your CDC date field to load only the records that changed between 2:00 PM and 3:00 PM. These will be put into a second partition that you can inspect in the Partitions tab. This process will continue hourly.

As is typical with this CDC approach, your destination table will contain one record for each refresh period in which a given record changed. Given this, you will likely want your first transform to select the latest record for each key based on your CDC date, so that the result reflects the current state of the table.
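A minimal sketch of such a "latest record" step is shown below in plain Python. In a dataflow this would normally be written as a transform; the function name `latest_records`, the `Id` key, the `SystemModStamp` CDC field, and the sample rows are all assumptions for illustration.

```python
def latest_records(rows, key="Id", cdc_field="SystemModStamp"):
    """Keep only the most recent version of each record, judged by the CDC field."""
    latest = {}
    for row in rows:
        current = latest.get(row[key])
        # ISO-8601 timestamps in a uniform format compare correctly as strings.
        if current is None or row[cdc_field] >= current[cdc_field]:
            latest[row[key]] = row
    return list(latest.values())


# Two hourly partitions: record 001 changed between the first and second load.
rows = [
    {"Id": "001", "Name": "Acme", "SystemModStamp": "2024-06-15T13:40:00Z"},
    {"Id": "002", "Name": "Globex", "SystemModStamp": "2024-06-15T13:55:00Z"},
    {"Id": "001", "Name": "Acme Corp", "SystemModStamp": "2024-06-15T14:20:00Z"},
]
current_state = latest_records(rows)
print(current_state)
```

Note that this only reconstructs the current state for records that still appear in the loaded data; hard-deleted records are not detected, as noted above.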

Advanced Settings

Figure 7

  • Assigned Priority: Determines which components to schedule first. Components with higher priority numbers are scheduled before those with lower ones. Increasing the priority on a component also raises the priority of all its upstream components. Negative priorities can be used to postpone work until excess capacity becomes available.