Using a Custom Image

Prerequisites:

  • To use your own Docker image in Ascend, first work through the previous article, "preparing a custom image", so that the image is properly set up.

Run Transforms with custom images

First, find the Runtime Settings option in a Transform:

  • In PySpark and SQL Transforms, it is nested under "Advanced Settings".
  • In Scala Transforms, it is marked as a required field in the creation flow.

In this dropdown, you'll find the following four options:

| Option name | Explanation |
| --- | --- |
| Use Data Service Image (default) | This is the default option for Transforms if nothing is specified. See "Choose a default image for a Data Service" below for details. |
| Native Ascend Spark | Once selected, you may choose one of the standard Ascend Spark runtimes. |
| Image label | If you have registered custom labels at the site level, they can be selected here. See "Register images at the site level" below for details. |
| Container image URL | Enter an image URL (for example, quay.io/my_company/spark_image:prod) and its required Spark runtime, and use it just for the current Transform. |
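To make the structure of a container image URL concrete, here is a minimal Python sketch that splits a URL such as the example above into its registry, repository, and tag. The helper name `parse_image_url` and the `ImageRef` type are hypothetical and are not part of Ascend; digest references (name@sha256:...) are deliberately left out.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: Optional[str]  # e.g. "quay.io"; None when no registry host is given
    repository: str          # e.g. "my_company/spark_image"
    tag: Optional[str]       # e.g. "prod"; None when the URL carries no tag


def parse_image_url(url: str) -> ImageRef:
    """Split an image URL into registry, repository, and tag (hypothetical helper)."""
    if "@" in url:
        raise ValueError("digest references are not handled in this sketch")
    # A colon only separates the tag if it comes after the last slash;
    # an earlier colon is a registry port, as in localhost:5000/img.
    last_slash = url.rfind("/")
    colon = url.rfind(":")
    if colon > last_slash:
        name, tag = url[:colon], url[colon + 1:]
    else:
        name, tag = url, None
    # Treat the first path component as a registry host only if it looks like one.
    first, sep, rest = name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return ImageRef(first, rest, tag)
    return ImageRef(None, name, tag)


ref = parse_image_url("quay.io/my_company/spark_image:prod")
print(ref.registry, ref.repository, ref.tag)
```

A URL without an explicit tag parses with `tag=None`, which is a useful signal to add one before entering the URL in a Transform.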

🚧

Custom images for your transform will be cached on the underlying infrastructure as they're pulled and used.

Because of this caching, an updated image that reuses an existing label may not be pulled again. To avoid running an old image, change the label every time the image is updated.

For Native Ascend Spark, note that Spark versions lower than 3.1.2 have certain limitations, such as the inability to apply ephemeral volumes to Spark executors.
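Both cautions above lend themselves to small pre-flight checks. The following Python sketch is an illustration only: the helper names `retag` and `below_spark_3_1_2` are hypothetical, and it assumes that changing the tag portion of the image URL is what forces a fresh pull (the text above only says to change the label).

```python
from typing import Tuple


def retag(image_url: str, new_tag: str) -> str:
    """Return image_url with its tag replaced (or added) by new_tag."""
    # Only a colon after the last slash is a tag separator; an earlier one is a port.
    last_slash = image_url.rfind("/")
    colon = image_url.rfind(":")
    name = image_url[:colon] if colon > last_slash else image_url
    return f"{name}:{new_tag}"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '3.1.2' into (3, 1, 2) so versions compare numerically, not as strings."""
    return tuple(int(part) for part in version.split("."))


def below_spark_3_1_2(version: str) -> bool:
    """True if the given Spark version is lower than 3.1.2."""
    return parse_version(version) < (3, 1, 2)


# Assumed workflow: derive a new tag from a build identifier on every rebuild.
print(retag("quay.io/my_company/spark_image:prod", "prod-build42"))
print(below_spark_3_1_2("3.0.1"))
```

Comparing integer tuples rather than raw strings matters here because a string comparison would rank "3.10.0" below "3.2.0".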

Register images at the site level

To make managing custom images easier in Ascend, you can declare all needed images in the Site Admin settings and reference them later when building Dataflows.

📘

Restricted access

Site Admin permissions are required. Reach out to your site admins to help you register needed images.

  1. Go to the Dashboard by clicking the Ascend icon in the top left.
  2. Click on "Admin," then "Cluster Management."
  3. In the "Custom Docker Images" section, click "Add a new image."
  4. Give the image an easily referenceable name as a label, for example, "custom-image-with-Arrow-library."
  5. Enter the image URL and its required Spark runtime.
  6. Hit "Create" to finish image registration.

Once registration is complete, you can select the registered image in both Data Service settings and Transforms.

Choose a default image for a Data Service

By default, all Transforms in a Data Service run on the same initial Ascend Spark image. To default to your custom Docker image instead:

  1. Click the small "Gear" icon in the upper section of the left panel, then click "Data Service Settings".
  2. Navigate to "Container Images".
  3. Choose a default image to be used for this Data Service. See the table above for an explanation of each option.
  4. Hit "Update", and all Transforms that use the Data Service default will run on the new image.

📘

Register an image from a private container/image repository

Your container images may be hosted in a private container/image repository that requires credentials to pull from. In this case, please reach out to Ascend support (via Intercom or Slack). We will help you set up the configuration so those images can be pulled.

NOTE: Ascend.io currently supports only private container/image repositories that use standard docker login style authentication to obtain a long-lived authentication token. Ascend.io does not currently support container/image repositories such as Amazon ECR, which use a non-standard authentication method.
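Before contacting support, it can help to check whether an image URL points at a private Amazon ECR registry, since those are called out above as unsupported. The Python sketch below is a heuristic based on the usual ECR hostname shape (a 12-digit account ID followed by .dkr.ecr.<region>.amazonaws.com); the function names are hypothetical and the check is not an Ascend feature.

```python
import re

# Usual private ECR hostname shape: <account-id>.dkr.ecr.<region>.amazonaws.com
# (with an optional .cn suffix for the China regions). Heuristic only.
_ECR_HOST = re.compile(r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com(\.cn)?$")


def registry_host(image_url: str) -> str:
    """Return the part of the URL before the first slash, or '' if there is none."""
    first, sep, _ = image_url.partition("/")
    return first if sep else ""


def looks_like_private_ecr(image_url: str) -> bool:
    """True if the registry host matches the usual private ECR hostname pattern."""
    return bool(_ECR_HOST.match(registry_host(image_url)))


for url in (
    "123456789012.dkr.ecr.us-east-1.amazonaws.com/team/spark_image:prod",
    "quay.io/my_company/spark_image:prod",
):
    print(url, looks_like_private_ecr(url))
```

A match suggests the image would need to be mirrored to a registry that supports docker login style authentication; a non-match does not guarantee the registry is supported.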