ADLS Gen2
Learn the required and optional properties for creating an ADLS Gen2 Connection, Credential, Read Connector, and Write Connector.
Prerequisites
- ADLS account
Connection Properties
The following table describes the fields available when creating a new ADLS Gen 2 Connection. Create a connection using the information below and these step-by-step instructions.
Field | Required | Description |
---|---|---|
Access Type | Required | This connection type is Read-Only, Write-Only, or Read-Write. |
Connection Name | Required | Input your desired name. |
Storage Account Name | Required | |
Container Name | Required | |
Requires Credentials | Optional | |
Description | Optional | Add a description of this Connection. |
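
The Storage Account Name and Container Name fields together identify a location in ADLS Gen2. As a minimal sketch, the snippet below shows how those two values combine into a standard `abfss://` URI on the public Azure cloud. The function name `adls_gen2_uri` and the example account and container names are hypothetical, and whether Ascend builds this URI internally is an assumption.

```python
def adls_gen2_uri(storage_account: str, container: str, path: str = "") -> str:
    """Build an ABFS (secure) URI from the connection fields.

    Assumes the public Azure cloud endpoint suffix (dfs.core.windows.net).
    """
    # Strip a leading slash so the path joins cleanly after the host part.
    return f"abfss://{container}@{storage_account}.dfs.core.windows.net/{path.lstrip('/')}"


# Hypothetical values, for illustration only.
print(adls_gen2_uri("mystorageacct", "landing", "raw/events/"))
```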
Credential Properties
The following table describes the fields available when creating a new ADLS Gen 2 credential.
Field | Required | Description |
---|---|---|
Credential Name | Required | The name to identify this credential with. This credential will be available as a selection for future use. |
Credential Type | Required | This field will automatically populate with ADLS Gen 2. |
Credential Type | Required | Azure Shared Key, Azure AD Service Principal (requires Tenant ID, Client ID, and Client Secret), or Azure AD Service Principal JSON |
Read Connector Properties
The following table describes the fields available when creating a new ADLS Gen 2 Read Connector. Create a new Read Connector using the information below and these step-by-step instructions.
Field Name | Required | Description |
---|---|---|
Name | Required | Provide a name for your connector. We recommend using lowercase with underscores in place of spaces. |
Description | Optional | Describes the connector. We recommend providing a description if you are ingesting information from the same source multiple times for different reasons. |
Container Name | Required | Name of the container. |
Object Pattern Matching | Required | The pattern strategy used to identify eligible files: - Glob: Applies glob-style wildcard pattern matching. - Match: Matches the pattern precisely, character for character. - Prefix: Matches objects whose names begin with the specified prefix. - Regex: Applies regular-expression pattern matching. |
Parser | Required | We support several data formats. See Blob Storage Read Connector Parsers for more information about CSV, Excel, JSON, and Python parsers: - Avro - CSV - Excel - JSON - ORC - Parquet - Python - Text |
Path Delimiter | Optional | Example: a newline (\n) delimited file |
Object Aggregation Strategy | Required | Currently available strategies are: - Adaptive - Leaf Directory - Prefix Regex Match - Reshape on Metadata |
Regex to Match Files in Zip File | Optional | Only file names matching the pattern will be extracted from the ZIP file. Defaults to no filtering. |
Data Replication Strategy | Optional | Defines the data replication strategy. The default, with nothing selected, replicates all source changes. See Replication Strategies for Blob Store. |
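
To illustrate how the four Object Pattern Matching strategies differ, here is a minimal sketch using only the Python standard library. It is not Ascend's implementation: the function names `matches` and `select`, the lowercase strategy keys, and the example object names are hypothetical, and details such as whether a glob `*` crosses `/` or whether a regex must match the full name are assumptions.

```python
import fnmatch
import re


def matches(strategy: str, pattern: str, object_name: str) -> bool:
    """Return True if object_name is eligible under the given strategy."""
    if strategy == "glob":
        # fnmatch wildcards: '*' matches any run of characters, including '/'.
        return fnmatch.fnmatchcase(object_name, pattern)
    if strategy == "match":
        # Exact, character-for-character comparison.
        return object_name == pattern
    if strategy == "prefix":
        return object_name.startswith(pattern)
    if strategy == "regex":
        # Assumption: the regex must match the whole object name.
        return re.fullmatch(pattern, object_name) is not None
    raise ValueError(f"unknown strategy: {strategy}")


def select(strategy: str, pattern: str, object_names: list[str]) -> list[str]:
    """Filter a listing down to the objects that match."""
    return [name for name in object_names if matches(strategy, pattern, name)]


# Hypothetical container listing.
names = ["raw/2024/a.csv", "raw/2024/b.json", "raw/readme.txt"]
print(select("glob", "raw/*.csv", names))
print(select("match", "raw/readme.txt", names))
print(select("prefix", "raw/2024/", names))
print(select("regex", r"raw/\d{4}/.*\.json", names))
```

Prefix and Match are the cheapest to reason about. Glob and Regex are more flexible but make it easier to select more files than intended, so it can help to test a pattern against a sample listing first.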
Write Connector Properties
The following table describes the fields available when creating a new ADLS Gen 2 Write Connector. Create a new Write Connector using the information below and these step-by-step instructions.
Field Name | Required | Description |
---|---|---|
Name | Required | Provide a name for your connector. We recommend using lowercase with underscores in place of spaces. |
Description | Optional | Describes the connector. We recommend providing a description if you are ingesting information from the same source multiple times for different reasons. |
Upstream | Required | The name of the previous connector the Write Connector will pull data from. |
Container Name | Required | Container that the output data will be written to. |
Output Directory | Required | Directory within the container to write the data to. If the directory does not exist, it will be created. |
Partition Interpolation Template | Optional | Include a value from the partition profile as part of the output directory naming. For example, to create Hive-style partitioning on a dataset partitioned daily on the timestamp event_ts, specify the pattern as dt={{event_ts(yyyy-MM-dd)}}/. |
Output File Syntax | Optional | A suffix to attach to each file name. By default, Ascend will include the extension of the file format, but you may optionally choose a different suffix. |
Format | Required | We support several data formats. See Amazon S3 Write Connector Connectors for more information about CSV, Excel, JSON, and Python parsers: - Avro - CSV - JSON - ORC - Parquet - Text |
Path Delimiter | Optional | Example: a newline (\n) delimited file |
Manifest File | Optional | Specify a manifest file which will be updated with the list of files every time they are updated. |
Write Strategy | Optional | Pick the strategy for writing files to storage: - Default (Mirror to Blob Store): Keeps the storage aligned with Ascend by inserting, updating, and deleting partitions on the blob store. - Ascend Upsert Partitions: Appends new partitions from Ascend and updates existing partitions, without deleting partitions from the blob store that are no longer in Ascend. - Custom Function: Lets you implement the write logic that Ascend will execute. |
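
As an illustration of the Partition Interpolation Template, the sketch below expands a pattern such as dt={{event_ts(yyyy-MM-dd)}}/ into a Hive-style directory name. It is only an approximation for reasoning about the output layout: the function `render_partition_path` is hypothetical, only the date tokens yyyy, MM, dd, and HH are handled, and Ascend's actual template engine may behave differently.

```python
import re
from datetime import datetime

# Small subset of Java-style date tokens mapped to strftime directives (assumption).
_JAVA_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d", "HH": "%H"}

# Matches placeholders of the form {{column(format)}}.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\((.*?)\)\s*\}\}")


def _to_strftime(java_pattern: str) -> str:
    """Translate the supported date tokens; other characters pass through."""
    return re.sub(r"yyyy|MM|dd|HH", lambda m: _JAVA_TO_STRFTIME[m.group(0)], java_pattern)


def render_partition_path(template: str, partition: dict[str, datetime]) -> str:
    """Replace each {{column(format)}} with the formatted partition value."""
    def replace(match: re.Match) -> str:
        column, fmt = match.group(1), match.group(2)
        return partition[column].strftime(_to_strftime(fmt))

    return _PLACEHOLDER.sub(replace, template)


# Hypothetical partition value for a dataset partitioned daily on event_ts.
path = render_partition_path(
    "dt={{event_ts(yyyy-MM-dd)}}/",
    {"event_ts": datetime(2024, 3, 5, 14, 30)},
)
print(path)  # dt=2024-03-05/
```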