Once you have an Ascend Read Connection set up, you can set up the Read Connector in your dataflow. At the top of the form is a highlighted box showing the DATA LAKE connection, with a USE button for selecting that connection.
- Name (required): The name to identify this connector with.
- Description (optional): Description of what data this connector will read.
- Table Path (required): The path to the Delta table from inside the bucket configured in the connection.
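The three fields above can be summarized as a small data structure. This is only an illustrative sketch: `DeltaReadConnectorSettings` and its `validate` method are hypothetical names, not part of any Ascend SDK, and the check that the table path carries no URI scheme is an assumption drawn from the description of the path as being relative to the bucket.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeltaReadConnectorSettings:
    """Hypothetical container for the form fields; not an Ascend SDK class."""

    name: str                          # Name (required)
    table_path: str                    # Table Path (required), relative to the bucket
    description: Optional[str] = None  # Description (optional)

    def validate(self) -> None:
        # Required fields must be non-empty.
        if not self.name.strip():
            raise ValueError("Name is required")
        if not self.table_path.strip():
            raise ValueError("Table Path is required")
        # Assumption: the path is relative to the bucket configured in the
        # connection, so it should not carry its own scheme such as s3://.
        if "://" in self.table_path:
            raise ValueError("Table Path should be relative to the bucket")


# Example values are made up for illustration.
settings = DeltaReadConnectorSettings(name="orders_delta", table_path="warehouse/orders")
settings.validate()
```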
Click the GENERATE SCHEMA button, which is a required step, to populate the data preview with the schema, as shown in the image above.
The refresh schedule specifies how often Ascend checks the data location to see if there's new data. Ascend will automatically kick off the corresponding big data jobs once new or updated data is discovered.
Set the status of the read connector to Running to make it active, or to Paused to stop it from running.
When resources are constrained, Processing Priority will be used to determine which components to schedule first. Higher priority numbers are scheduled before lower ones. Increasing the priority on a component also causes all its upstream components to be prioritized higher. Negative priorities can be used to postpone work until excess capacity becomes available.
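The priority rules above can be illustrated with a toy model. This is a sketch of the described behavior, not Ascend's actual scheduler: the function names, the default priority of 0, and the rule that a component inherits the highest priority found anywhere downstream of it are all assumptions made for illustration.

```python
from typing import Dict, List


def effective_priorities(
    own: Dict[str, int], downstream: Dict[str, List[str]]
) -> Dict[str, int]:
    """Toy model: each component takes the highest priority among itself
    and everything downstream of it (assumes an acyclic dataflow)."""
    cache: Dict[str, int] = {}

    def resolve(node: str) -> int:
        if node not in cache:
            inherited = [resolve(child) for child in downstream.get(node, [])]
            # Assumption: components without an explicit priority default to 0.
            cache[node] = max([own.get(node, 0)] + inherited)
        return cache[node]

    return {node: resolve(node) for node in own}


def schedule_order(own: Dict[str, int], downstream: Dict[str, List[str]]) -> List[str]:
    eff = effective_priorities(own, downstream)
    # Higher effective priority first; ties broken by name for determinism.
    return sorted(eff, key=lambda n: (-eff[n], n))


# Hypothetical dataflow: two read connectors, each feeding one transform.
own = {"read_a": 0, "transform_a": 5, "read_b": 0, "transform_b": -1}
downstream = {"read_a": ["transform_a"], "read_b": ["transform_b"]}
print(schedule_order(own, downstream))
# prints ['read_a', 'transform_a', 'read_b', 'transform_b']
```

In this model, raising `transform_a` to 5 also lifts `read_a`, while the negative priority on `transform_b` places it last, so it only runs once nothing else is waiting.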
How Ascend partitions are created from Delta Lake partitions depends on your choice of ingestion strategy:
- Full Resync: Delta Lake’s original partitioning is not maintained; the data collapses into a single partition on each refresh.
- Incremental from Delta Manifest: Ascend maintains the original Delta Lake partitioning, using the manifest file to determine which partitions have changed and need to be updated.
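The difference between the two ingestion strategies can be sketched with plain dictionaries. This is a conceptual model only, not how Ascend implements ingestion: the function names are hypothetical, partitions are represented as lists of rows keyed by partition name, and the `changed` argument stands in for whatever the manifest file reveals about modified partitions.

```python
from typing import Dict, List

Rows = List[dict]
Partitions = Dict[str, Rows]


def full_resync(delta_partitions: Partitions) -> Partitions:
    """Collapse every Delta Lake partition into one partition on each refresh."""
    merged: Rows = []
    for key in sorted(delta_partitions):
        merged.extend(delta_partitions[key])
    return {"all": merged}


def incremental_refresh(
    current: Partitions, delta_partitions: Partitions, changed: List[str]
) -> Partitions:
    """Keep the original partitioning and reload only the changed partitions."""
    updated = dict(current)
    for key in changed:
        if key in delta_partitions:
            updated[key] = list(delta_partitions[key])
        else:
            # The partition no longer exists in the table, so drop it.
            updated.pop(key, None)
    return updated


# Made-up table with two date partitions.
delta = {
    "date=2024-01-01": [{"id": 1}],
    "date=2024-01-02": [{"id": 2}, {"id": 3}],
}

print(len(full_resync(delta)))  # prints 1

current = {"date=2024-01-01": [{"id": 1}]}
refreshed = incremental_refresh(current, delta, changed=["date=2024-01-02"])
print(sorted(refreshed))  # prints ['date=2024-01-01', 'date=2024-01-02']
```

The trade-off shown here is that a full resync reprocesses everything into one partition on every refresh, whereas the incremental path touches only the partitions flagged as changed.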