Azure Data Lake

Prerequisites

  • Microsoft Account
  • A Storage Account to use with Azure Data Lake Storage Gen2
  • A container inside of your Azure Storage Account

Connection Properties

The following table describes the fields available when creating a new Azure Data Lake Connection. Create a connection using the information below and these step-by-step instructions.

Field Name

Required

Description

Access Type

Required

Choose whether this connection is Read-Only, Write-Only, or Read-Write.

Connection Name

Required

Input your desired name.

Storage Account Name

Required

The name of your ADLS storage account resource in Azure.

Container Name

Required

The name of your container in Azure.

Requires Credentials

Optional

Check this box to create a new credential or select an existing credential.

Credential Properties

The following table describes the fields available when creating a new Azure Data Lake credential.

Field Name

Required

Description

Credential Name

Required

The name to identify this credential with. This credential will be available as a selection for future use.

Credential Type

Required

This field will automatically populate with Azure Data Lake.

Credential Type

Required

The type of credential you want to use.

  • Azure Shared Key: Provide your Azure shared key in the field that appears.
  • Azure AD Service Principal
  • Azure AD Service JSON: Provide credentials in JSON format.
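
For the JSON option, the credential typically carries the service principal's identifiers and secret. The snippet below is only an illustration of that shape; the field names (tenant_id, client_id, client_secret) are assumptions, not Ascend's documented schema.

```python
import json

# Hypothetical shape of an Azure AD service principal credential in
# JSON form. The field names (tenant_id, client_id, client_secret)
# are illustrative assumptions, not Ascend's documented schema.
credential_json = """
{
    "tenant_id": "00000000-0000-0000-0000-000000000000",
    "client_id": "11111111-1111-1111-1111-111111111111",
    "client_secret": "<your-client-secret>"
}
"""

credential = json.loads(credential_json)
print(sorted(credential))  # ['client_id', 'client_secret', 'tenant_id']
```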

Read Connector Properties

The following table describes the fields available when creating a new Azure Data Lake Read Connector. Create a new Read Connector using the information below and these step-by-step instructions.

Field Name

Required

Description

Name

Required

Provide a name for your connector. We recommend using lowercase with underscores in place of spaces.

Description

Optional

Describes the connector. We recommend providing a description if you are ingesting information from the same source multiple times for different reasons.

Object Pattern Matching

Required

The strategy used to match objects in your container against the Object Pattern below.

Container Name

Required

The name of your container in Azure. For example, ascend-data.

Object Pattern

Required

The name of the data file in your container.

Example: AirPassengers.csv
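
An exact file name like the example above matches a single object. If your environment also accepts glob-style patterns (an assumption here, not stated above), matching behaves like standard shell globbing, which Python's fnmatch can illustrate:

```python
from fnmatch import fnmatch

# Hypothetical object names; glob support is an assumption for
# illustration -- an exact name such as AirPassengers.csv always works.
objects = ["AirPassengers.csv", "AirPassengers.json", "2023/AirPassengers.csv"]

# An exact pattern matches a single object.
assert fnmatch("AirPassengers.csv", "AirPassengers.csv")

# A glob such as *.csv matches every object ending in .csv
# (fnmatch gives "/" no special treatment).
csv_objects = [name for name in objects if fnmatch(name, "*.csv")]
print(csv_objects)  # ['AirPassengers.csv', '2023/AirPassengers.csv']
```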

Parser

Required

We support several data formats. See Blob Storage Read Connector Parsers for more information about CSV, Excel, JSON, and Python parsers:

  • Avro
  • CSV
  • Excel
  • JSON
  • ORC
  • Parquet
  • Python
  • Text

Path Delimiter

Optional

Example: a newline (\n) delimited file.
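
A newline-delimited object splits into one record per line. The sketch below shows that behavior; the sample rows are made up for illustration.

```python
# A newline (\n) delimited object splits into one record per line.
# The sample rows are made up for illustration.
raw = "2023-01-01,112\n2023-02-01,118\n2023-03-01,132\n"
records = [line for line in raw.split("\n") if line]
print(records)  # ['2023-01-01,112', '2023-02-01,118', '2023-03-01,132']
```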

Write Connector Properties

The following table describes the fields available when creating a new Azure Data Lake Write Connector. Create a new Write Connector using the information below and these step-by-step instructions.

📘

Container Browse and Select

Browse and Select Data lets you select a location within an Azure container to write data. This will automatically fill in the Container and Output Directory fields.

Field Name

Required

Description

Name

Required

Provide a name for your connector. We recommend using lowercase with underscores in place of spaces.

Description

Optional

Describes the connector. We recommend providing a description if you are ingesting information from the same source multiple times for different reasons.

Upstream

Required

The name of the previous connector the Write Connector will pull data from.

Container Name

Optional

The container Ascend will write data to.

Output Directory

Required

The output prefix to write files to. Use the format data/gold/aggregated_cab_rides.

  • **Note**: This prefix must be unique across all Write Connectors, or Ascend will delete any existing objects in the directory.

Partition Interpolation Template

Optional

Include a value from the partition profile as part of the output directory naming. For example, to create Hive style partitioning on dataset daily partitioned on timestamp event_ts, specify the pattern as dt={{event_ts(yyyy-MM-dd)}}/.
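
As a sketch of how such a template resolves (this is an illustration, not Ascend's actual template engine), the yyyy-MM-dd date pattern corresponds to %Y-%m-%d in Python:

```python
from datetime import datetime

def render_partition(event_ts: datetime) -> str:
    # Sketch of resolving dt={{event_ts(yyyy-MM-dd)}}/ for one partition:
    # the yyyy-MM-dd date pattern corresponds to %Y-%m-%d in Python.
    return f"dt={event_ts.strftime('%Y-%m-%d')}/"

# Combined with the Output Directory, one partition lands under a
# Hive-style path such as:
path = "data/gold/aggregated_cab_rides/" + render_partition(datetime(2023, 5, 1))
print(path)  # data/gold/aggregated_cab_rides/dt=2023-05-01/
```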

Output File Syntax

Optional

A suffix to attach to each file name. By default, Ascend will include the extension of the file format, but you may optionally choose a different suffix.

Format

Required

We support several data formats. See Amazon S3 Write Connector Parsers for more information about these formats:

  • Avro
  • CSV
  • JSON
  • ORC
  • Parquet
  • Text

Path Delimiter

Optional

Example: a newline (\n) delimited file.

Manifest File

Optional

Specify a manifest file that will be updated with the list of output files whenever they change.

Write Strategy

Optional

Pick the strategy for writing files to storage:

  • Default (Mirror to Blob Store): This strategy keeps the blob store aligned with Ascend, inserting, updating, and deleting partitions on the blob store.
  • Ascend Upsert Partitions: This strategy appends new partitions and updates existing partitions, without deleting partitions from the blob store that are no longer in Ascend.
  • Custom Function: This strategy allows you to implement the write logic that Ascend will execute.
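
The difference between the first two strategies can be sketched as operations on partition-name sets (an illustrative model only, not Ascend's implementation):

```python
# Illustrative model of the first two strategies as operations on
# partition-name sets; this is not Ascend's implementation.
in_ascend = {"dt=2023-05-01", "dt=2023-05-02"}
in_blob_store = {"dt=2023-04-30", "dt=2023-05-01"}

# Mirror to Blob Store: the store ends up identical to Ascend,
# so dt=2023-04-30 is deleted from the store.
mirrored = set(in_ascend)

# Upsert Partitions: new and updated partitions are written, but
# dt=2023-04-30, no longer in Ascend, is left in place.
upserted = in_blob_store | in_ascend

print(sorted(mirrored))   # ['dt=2023-05-01', 'dt=2023-05-02']
print(sorted(upserted))   # ['dt=2023-04-30', 'dt=2023-05-01', 'dt=2023-05-02']
```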

© Ascension Labs Inc. | All Rights Reserved