Apache Iceberg

Learn the required and optional properties for creating an Apache Iceberg Connection, Credential, Read Connector, and Write Connector.

Prerequisites

  • Access to a storage system (Amazon S3, Azure Data Lake Storage, or Google Cloud Storage)
  • Credentials for that storage system

Connection Properties

The following table describes the fields available when creating a new Iceberg Connection. Create a connection using the information below and these step-by-step instructions.

| Field | Required | Description |
| --- | --- | --- |
| Access Type | Required | Whether this connection is Read-Only, Write-Only, or Read-Write. |
| Connection Name | Required | Enter a name for the connection. |
| Storage System | Required | Amazon S3, Azure Data Lake Storage, or Google Cloud Storage. Your selection determines which additional fields are required. See the Storage System tables below for each storage system type. |
| Path to Root Directory of Catalog | Required | Where to store the Iceberg data. Include a path to a folder for Ascend to access, for example ascend-iceberg-data-folder. By default, data is stored at the root directory. |
| Requires Credentials | Required | Creating a new Iceberg connection requires a new or existing credential. |
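
As a rough illustration of the Connection Properties table above, the sketch below checks a set of connection settings for completeness before you enter them. The dictionary keys (`access_type`, `connection_name`, `storage_system`, `catalog_root_path`, `credential`) and the `validate_connection` helper are hypothetical names chosen for this example; they are not part of any Ascend API.

```python
# Hypothetical keys mirroring the Connection Properties table;
# these are illustrative names, not an Ascend API.
REQUIRED_CONNECTION_FIELDS = (
    "access_type",
    "connection_name",
    "storage_system",
    "catalog_root_path",
    "credential",
)

ACCESS_TYPES = {"Read-Only", "Write-Only", "Read-Write"}
STORAGE_SYSTEMS = {"Amazon S3", "Azure Data Lake Storage", "Google Cloud Storage"}


def validate_connection(settings: dict) -> list[str]:
    """Return a list of problems; an empty list means the settings look complete."""
    # Every field marked Required in the table must be present and non-empty.
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_CONNECTION_FIELDS
        if not settings.get(name)
    ]
    # Fields with a fixed set of choices must use one of the documented values.
    access_type = settings.get("access_type")
    if access_type and access_type not in ACCESS_TYPES:
        problems.append(f"unknown access type: {access_type}")
    storage_system = settings.get("storage_system")
    if storage_system and storage_system not in STORAGE_SYSTEMS:
        problems.append(f"unknown storage system: {storage_system}")
    return problems


example = {
    "access_type": "Read-Write",
    "connection_name": "iceberg_connection",
    "storage_system": "Amazon S3",
    "catalog_root_path": "ascend-iceberg-data-folder",
    "credential": "my_iceberg_credential",
}
print(validate_connection(example))
```

A checklist like this can be useful when several people prepare connection details ahead of time, since it catches an unsupported Access Type or a missing credential before anyone opens the connection form.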

Storage System

Apache Iceberg requires an underlying storage system. The tables below detail the parameters required for Amazon S3, Azure Data Lake Storage, and Google Cloud Storage.

📘

Spark with Iceberg Data Plane Configuration

If you are configuring an Iceberg Connection for Spark with Iceberg Data Plane, your storage system will contain the metadata used within Ascend. Your storage system must exist before you configure the connection. Specific storage system parameters depend on your business needs; Ascend does not require a specific size for your storage bucket, project, or service.

Amazon S3 Bucket

| Field | Required | Description |
| --- | --- | --- |
| Amazon S3 Bucket | Required | Enter the bucket where your data will be stored. |
| Amazon S3 Connection Type | Optional | Your options are Standard, With Region, AWS PrivateLink, or Custom Endpoint. |
| S3 Region | Optional | Provide the S3 regional endpoint you want to use. |
| AWS PrivateLink | Optional | Provide a custom endpoint to AWS PrivateLink. |
| Custom Endpoint | Optional | Provide the custom endpoint. Select Disable Certificate Verification if needed. |
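
The relationship between the Amazon S3 Connection Type and the other optional fields can be sketched as a simple lookup. The pairing below (With Region uses S3 Region, AWS PrivateLink uses the AWS PrivateLink endpoint, Custom Endpoint uses Custom Endpoint) is an assumption inferred from the field names in the table, and treating an omitted connection type as Standard is also an assumption. The `missing_s3_fields` helper is hypothetical.

```python
# Assumed pairing of each Amazon S3 Connection Type with the optional field
# it relies on; inferred from the field names, not confirmed by Ascend.
FIELD_FOR_CONNECTION_TYPE = {
    "Standard": None,
    "With Region": "S3 Region",
    "AWS PrivateLink": "AWS PrivateLink",
    "Custom Endpoint": "Custom Endpoint",
}


def missing_s3_fields(settings: dict) -> list[str]:
    """List the S3 fields that still need a value for the chosen connection type."""
    missing = []
    # The bucket is the only field marked Required in the table.
    if not settings.get("Amazon S3 Bucket"):
        missing.append("Amazon S3 Bucket")
    # Assumption: an omitted connection type behaves like "Standard".
    connection_type = settings.get("Amazon S3 Connection Type", "Standard")
    if connection_type not in FIELD_FOR_CONNECTION_TYPE:
        raise ValueError(f"unknown connection type: {connection_type}")
    needed = FIELD_FOR_CONNECTION_TYPE[connection_type]
    if needed and not settings.get(needed):
        missing.append(needed)
    return missing


print(missing_s3_fields({
    "Amazon S3 Bucket": "my-bucket",
    "Amazon S3 Connection Type": "With Region",
}))
```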

Azure Data Lake Storage

| Field | Required | Description |
| --- | --- | --- |
| Storage Account Name | Required | The name of the storage account. This can be found in Azure Resource Manager. |
| Container Name | Required | The name of the Azure storage container. |

🚧

Disable Soft Deletes

Ascend requires soft deletes to be disabled on the Azure storage account you connect to.

Google Cloud Storage

| Field | Required | Description |
| --- | --- | --- |
| Project | Required | Indicate an existing project. |
| Bucket | Optional | Indicate an existing bucket or define a new bucket name. |

Credential Properties

The following table describes the fields available when creating a new Apache Iceberg credential.

| Field | Required | Description |
| --- | --- | --- |
| Credential Name | Required | The name to identify this credential with. This credential will be available as a selection for future use. |
| Credential Type | Required | This field will automatically populate with {connection type name}. |
| Iceberg Credential | Required | S3 Credential: Include an AWS Access Key ID and AWS Secret Access Key. Azure Credential: Select which of the three Azure credential options you want to use and provide the necessary information. Google Credential: Provide Google Cloud credentials in JSON format. |
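
Because the Google credential must be supplied as JSON, it can help to confirm that the text parses before pasting it into the credential form. The following is a minimal sketch using Python's standard `json` module. The `parse_google_credential` helper is a hypothetical name, and the check only confirms that the text is a JSON object; it does not verify which keys Ascend expects.

```python
import json


def parse_google_credential(raw: str) -> dict:
    """Parse credential text and confirm it is a JSON object."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"credential is not valid JSON: {exc}") from exc
    # A credential file is expected to be a JSON object, not a list or a string.
    if not isinstance(parsed, dict):
        raise ValueError("credential JSON must be an object")
    return parsed


# Placeholder content for illustration only; a real credential has more fields.
sample = '{"type": "service_account"}'
print(sorted(parse_google_credential(sample)))
```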

Read Connector Properties

The following table describes the fields available when creating a new Apache Iceberg Read Connector. Create a new Read Connector using the information below and these step-by-step instructions.

| Field | Required | Description |
| --- | --- | --- |
| Name | Required | Provide a name for your connector. We recommend using lowercase with underscores in place of spaces. |
| Description | Optional | Describes the connector. We recommend providing a description if you are ingesting information from the same source multiple times for different reasons. |
| Table | Required | The table that you are reading from. For example, blue, green, or red. |
| Database Namespace | Optional | How tables are categorized together. For example, the above tables would have a namespace of color. |
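
To make the Table and Database Namespace example concrete, the sketch below builds dotted identifiers of the form namespace.table, which is the common convention for naming Iceberg tables. The `table_identifier` helper is hypothetical, and how Ascend combines the two fields internally is an assumption.

```python
from typing import Optional


def table_identifier(table: str, namespace: Optional[str] = None) -> str:
    """Join an optional namespace and a table name with a dot."""
    # The namespace is optional, so fall back to the bare table name.
    return f"{namespace}.{table}" if namespace else table


# The tables and namespace from the example in the table above.
identifiers = [table_identifier(name, "color") for name in ("blue", "green", "red")]
print(identifiers)  # ['color.blue', 'color.green', 'color.red']
```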

Write Connector Properties

The following table describes the fields available when creating a new Apache Iceberg Write Connector. Create a new Write Connector using the information below and these step-by-step instructions.

| Field | Required | Description |
| --- | --- | --- |
| Name | Required | Provide a name for your connector. We recommend using lowercase with underscores in place of spaces. |
| Description | Optional | Describes the connector. We recommend providing a description if you are ingesting information from the same source multiple times for different reasons. |
| Upstream | Required | The upstream component containing the data to write. |
| Table | Required | The table that you are writing to. For example, blue, green, or red. |
| Database Namespace | Optional | How tables are categorized together. For example, the above tables would have a namespace of color. |
| Partition Clause (partitioned by...) | Optional | Table column reference for dynamic overwrite. Dynamic overwrite mode is recommended when writing to Iceberg tables. |
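
The effect of a partition column under dynamic overwrite can be illustrated in plain Python. In dynamic overwrite mode, as generally implemented by Spark and Iceberg, only the partitions that appear in the incoming data are replaced, and all other partitions are left untouched. The sketch below simulates that behavior on lists of dictionaries. It is a conceptual model, not Ascend's implementation, and the `day` column and row values are made up for the example.

```python
def dynamic_overwrite(existing: list, incoming: list, partition_column: str) -> list:
    """Replace only the partitions present in `incoming`; keep all others."""
    # Partitions touched by this write are fully replaced.
    touched = {row[partition_column] for row in incoming}
    # Rows in untouched partitions survive the write unchanged.
    kept = [row for row in existing if row[partition_column] not in touched]
    return kept + list(incoming)


existing = [
    {"day": "2024-01-01", "value": 1},
    {"day": "2024-01-02", "value": 2},
]
incoming = [
    {"day": "2024-01-02", "value": 20},
    {"day": "2024-01-03", "value": 30},
]

result = dynamic_overwrite(existing, incoming, "day")
for row in result:
    print(row)
```

Here the 2024-01-01 partition is preserved, the 2024-01-02 partition is replaced by the new row, and 2024-01-03 is added. A full (static) overwrite would instead discard the 2024-01-01 row as well.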