Custom Docker Images

Introduction

By default, all PySpark transforms in Ascend run on our native containers. The container image builds on the latest stable versions of Spark and Scala and also bundles a few useful packages:

  • The Ascend SDK
  • Ascend's connection definitions
  • Ascend's utility functions that make development easier (such as logging)
  • Commonly used Python packages (see the "Pre-installed Python Packages" section below)

There are two cases where you might want to change the default image:

  1. Use a different version of Spark and/or Scala.
  2. Use your own image with additional Python packages or reusable functions.

For case #1, continue to "Using a Custom Image". For case #2, start with "Preparing a Custom Image".