Azure Blob Write Connector (Legacy)

Creating an Azure Blob Write Connector

Prerequisites:

  • Access credentials
  • Data location on Azure Blob
  • Partition column from upstream, if applicable

Connection Settings

Connection Settings specify the details required to connect to a particular Azure Blob location.

Here's an example:


Figure 1

Storage Account Name

Enter the Storage Account Name for the subscription.

Location

Azure Blob Write Connectors have location settings comprised of:

  • Azure Blob Container: The container name, e.g. ascend-io-dev-container. Do not include any folder information in this field.
  • Object Prefix: The hierarchical folder prefix for the Azure Blob objects, e.g. analytical_results/my_awesome_table. Do not include any leading forward slashes.
  • Partition Folder Pattern: The folder pattern generated from the values of the upstream partition column, e.g. {{at_hour_ts(yyyy-MM-dd/HH)}}, where at_hour_ts is a partition column from the upstream transform.
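As an illustration of how such a pattern expands (a hypothetical sketch, not Ascend's actual implementation; render_partition_folder is an invented helper):

```python
from datetime import datetime

def render_partition_folder(ts: datetime) -> str:
    """Hypothetical expansion of the pattern {{at_hour_ts(yyyy-MM-dd/HH)}}.

    The yyyy-MM-dd/HH tokens map to a date folder plus an hour folder,
    derived from the upstream partition column value (here, a datetime).
    """
    return ts.strftime("%Y-%m-%d/%H")

# A row partitioned on 2023-04-01 09:30 lands under this folder:
print(render_partition_folder(datetime(2023, 4, 1, 9, 30)))  # 2023-04-01/09
```

Files for that partition would then be written under analytical_results/my_awesome_table/2023-04-01/09/.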

❗️

Warning

Any data under the object prefix that is not propagated by the upstream transform will be deleted automatically.

For example, if the write connector produces three files A, B, and C under the object prefix, and a pre-existing file called output.txt sits at the same location, Ascend will delete output.txt because Ascend did not generate it.
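The cleanup rule amounts to a set difference, as in this simplified sketch (an illustration only, not Ascend's internal code):

```python
def files_to_delete(existing: set[str], generated: set[str]) -> set[str]:
    """Anything under the object prefix that Ascend did not generate is removed."""
    return existing - generated

existing = {"A", "B", "C", "output.txt"}
generated = {"A", "B", "C"}
print(files_to_delete(existing, generated))  # {'output.txt'}
```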

Manifest file (optional)

If selected, a manifest file is generated or updated every time the Write Connector becomes Up-To-Date. The manifest contains the file names of all data files that are ready to be consumed by downstream applications. To create a manifest file, specify the full path, including the file name, where the manifest should be created, and whether it should be a CSV or a JSON file.
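A downstream consumer might expect a manifest shaped like the following. This is a hypothetical sketch of the two formats; the exact schema Ascend writes may differ:

```python
import csv
import io
import json

def write_manifest(file_names: list[str], fmt: str = "json") -> str:
    """Render a manifest listing the data files ready for consumption.

    fmt is "json" (a single object with a file list) or "csv" (one
    file name per row under a header).
    """
    if fmt == "json":
        return json.dumps({"files": file_names})
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["file_name"])
    for name in file_names:
        writer.writerow([name])
    return buf.getvalue()

print(write_manifest(["part-0000.csv", "part-0001.csv"]))
```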

Azure Shared Key

This is the Access Key for the Storage Account. It can be found on the Storage Account page under Settings -> Access Keys.

Testing Connection

Use Test Connection to check whether all Azure Blob permissions are correctly configured.

Formatter Settings

The formatter settings specify the format in which the output files will be generated.

Here's an example.


Figure 2

XSV Formatter

Ascend supports 3 different delimiters and 9 different line terminators and allows specifying whether a Header Row should be included. The XSV generated is RFC4180 compliant.

🚧

Important

The XSV Formatter will NOT replace newline characters within values. Replace newline characters in the upstream transform if you require XSV files to contain only single-line records.
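To illustrate the concern with a generic Python sketch (not Ascend code): an RFC 4180 writer quotes values containing delimiters or newlines, so embedded newlines survive into the file as multi-line records unless you replace them upstream, e.g. with a helper like this:

```python
import csv
import io

def to_single_line_row(row: list[str]) -> list[str]:
    """Replace embedded newlines so each record occupies one physical line."""
    return [value.replace("\r\n", " ").replace("\n", " ") for value in row]

buf = io.StringIO()
writer = csv.writer(buf)  # RFC 4180-style quoting by default
writer.writerow(to_single_line_row(["id-1", "line one\nline two"]))
print(buf.getvalue())  # id-1,line one line two
```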

JSON formatter

Generates a file in which each line is a valid JSON object representing one data row from the upstream.

🚧

Important

The JSON Formatter automatically replaces newline characters in column values with \n to guarantee that the JSON file contains single-line records.
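Standard JSON escaping already behaves this way, as this generic sketch shows: json.dumps renders an embedded newline as the two-character escape \n, keeping each record on one physical line.

```python
import json

row = {"id": 1, "note": "line one\nline two"}
record = json.dumps(row)

print(record)          # the newline appears as the escape sequence \n
print("\n" in record)  # False: the record occupies a single physical line
```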

Parquet formatter

Automatically applies snappy compression to the output files.

Processing Priority (optional)

When resources are constrained, Processing Priority will be used to determine which components to schedule first.

Higher priority numbers are scheduled before lower ones. Increasing the priority on a component also causes all its upstream components to be prioritized higher. Negative priorities can be used to postpone work until excess capacity becomes available.
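As a generic illustration of this scheduling rule (not Ascend's scheduler), components can be drained in priority order by negating the priority in a min-heap:

```python
import heapq

def schedule_order(components: dict[str, int]) -> list[str]:
    """Return component names in scheduling order: higher priority first.

    Negative priorities sort last, so that work is postponed until
    excess capacity becomes available.
    """
    heap = [(-priority, name) for name, priority in components.items()]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]

print(schedule_order({"reports": 5, "backfill": -1, "ingest": 10}))
# ['ingest', 'reports', 'backfill']
```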

Here's an example:


Figure 3