Some Python libraries require JAR files to be added to the Spark environment before they can be used.
Because the files must already be present when the Spark Session is instantiated, and because the Spark Session can't be reloaded in Ascend, the desired files must be added in a Custom Docker Image.
If you are not familiar with creating a Custom Docker Image for Ascend, please follow the related documentation.
The best way to add JARs to the Spark Session is to copy them into the correct directory using Dockerfile commands.
While configuring the image through the Dockerfile, follow these steps:
- create a "jars" directory at the same level as the Dockerfile
- insert the desired JARs into the "jars" directory
- add the following command to the Dockerfile:
COPY --chown=ascend:ascend /jars/*.jar /app/spark/jars/
/app/spark is the default SPARK_HOME. If you changed it for any reason, keep in mind that the JAR files must be placed under $SPARK_HOME/jars.
This way, all the desired JARs will be loaded into the Spark Session.
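To confirm that the COPY step put the files where Spark expects them, a small check can be run inside the built image. The following is a minimal sketch, not part of Ascend itself: the function name find_missing_jars and the file name my-connector.jar are hypothetical, and the script only verifies that the files exist under $SPARK_HOME/jars (falling back to /app/spark), not that Spark has actually placed them on its classpath.

```python
import os
from pathlib import Path


def find_missing_jars(expected, spark_home=None):
    """Return the names in `expected` that are not present in $SPARK_HOME/jars."""
    # Fall back to /app/spark, the default SPARK_HOME mentioned above.
    home = Path(spark_home or os.environ.get("SPARK_HOME", "/app/spark"))
    jars_dir = home / "jars"
    # An absent directory is treated as containing no JARs.
    present = {p.name for p in jars_dir.glob("*.jar")} if jars_dir.is_dir() else set()
    return sorted(name for name in expected if name not in present)


if __name__ == "__main__":
    # Hypothetical file name; replace it with the JARs copied by your Dockerfile.
    missing = find_missing_jars(["my-connector.jar"])
    print("Missing JARs:", missing or "none")
```

If the script reports missing files, check that the "jars" directory sits next to the Dockerfile and that the destination in the COPY command matches your SPARK_HOME.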