12.07.2022 Release Notes

These are the release notes for December 7, 2022.

:rocket: FEATURES :rocket:

  • User-defined cluster pool configuration (Data Plane only)
    • A Spark cluster pool is an elastic pool of Spark clusters. Each pool dynamically scales executors up and down depending on load. Users control the scaling parameters of a single cluster and the cluster pool.
    • All Spark-related functionality for Data Plane environments now runs on consolidated Spark clusters.
    • Users can configure one or more Spark cluster pools on the site admin page.
    • Users can select the default site-level Spark cluster pool or a user-configured cluster pool to execute Spark tasks in a Data Service or Dataflow.
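
The scaling behavior described above can be pictured with a minimal sketch. It assumes a pool sizes itself from the number of pending tasks and clamps the result to configured bounds. The names `PoolConfig`, `min_executors`, `max_executors`, and `tasks_per_executor` are hypothetical and are not the actual settings shown on the site admin page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolConfig:
    """Hypothetical scaling parameters (illustrative names, not product settings)."""
    min_executors: int
    max_executors: int
    tasks_per_executor: int


def target_executors(pending_tasks: int, config: PoolConfig) -> int:
    """Return the executor count for the current load, clamped to the pool bounds."""
    # Ceiling division: enough executors to cover every pending task.
    needed = -(-pending_tasks // config.tasks_per_executor)
    # Never drop below the minimum or grow beyond the maximum.
    return max(config.min_executors, min(config.max_executors, needed))
```

With `PoolConfig(min_executors=1, max_executors=10, tasks_per_executor=4)`, an idle pool stays at the minimum and a heavy backlog is capped at the maximum.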

:sparkles: ENHANCEMENTS :sparkles:

  • Add incremental read support for the following Read Connectors for all Data Planes:
    • Marketo
    • Shopify
    • Google Analytics 4
  • Add incremental and snapshot replication strategies for Snowflake Read Connector
  • Support date type for Salesforce Read Connector
  • Support direct copy for Databricks Data Plane
  • Avoid OOM error for Databricks Read Connector by streaming data
  • Improve performance by tuning page and partition size for Databricks Read Connector
  • Update Databricks connector version
  • Multi-table Postgres Read Connector performance improvements
  • Increase reliability of Spark logs download (now a tar archive instead of a zip archive)
  • Update Spark 3.2.0 build to 3.2.2
  • Support new API token for HubSpot Read Connector
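
For readers unfamiliar with the two replication strategies named above, here is a rough sketch of the difference. It assumes each source row carries an `updated_at` value that can serve as a cursor. The function names and the row shape are hypothetical and do not reflect the connectors' actual implementation.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple

Row = Dict[str, Any]


def snapshot_read(rows: Iterable[Row]) -> List[Row]:
    """Snapshot strategy: re-read every row on every run."""
    return list(rows)


def incremental_read(rows: Iterable[Row], cursor: Optional[int]) -> Tuple[List[Row], Optional[int]]:
    """Incremental strategy: read only rows newer than the stored cursor.

    Returns the new rows and the advanced cursor to store for the next run.
    """
    new_rows = [r for r in rows if cursor is None or r["updated_at"] > cursor]
    # Keep the old cursor when nothing new arrived.
    new_cursor = max((r["updated_at"] for r in new_rows), default=cursor)
    return new_rows, new_cursor
```

The first incremental run (cursor `None`) behaves like a snapshot. Later runs return only rows changed since the stored cursor.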

:wrench: BUGFIXES :wrench:

All Data Planes

  • Fix MS SQL incremental read for datetime column
  • Fix a bug where the Snowflake Connection would cancel a query job after 5 minutes without a connection
  • Fix credentials for PySpark Transforms in Data Plane environments

Spark Data Plane

  • Fix UI issue preventing file upload for legacy Read Connectors
  • Fix updates for components with legacy credential format

BigQuery Data Plane

  • Fix BigQuery migration for Data Service/Dataflow name change
  • Drop Dataset name restrictions
  • Fix issue causing incremental read connectors to miss partitions in BigQuery Data Plane

Databricks Data Plane

  • Drop database name restrictions

Snowflake Data Plane

  • Fix failures in Snowflake Data Plane when reserved words are used as column names by always uppercasing and quoting references
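
The Snowflake fix above can be illustrated with a small sketch. Snowflake resolves unquoted identifiers to uppercase and treats double-quoted identifiers as case-sensitive, so uppercasing a name before quoting it keeps it matching the unquoted form while making reserved words safe to use. The helper name `quote_identifier` is hypothetical and this is not the product's actual code.

```python
def quote_identifier(name: str) -> str:
    """Uppercase a column name and wrap it in double quotes for use in Snowflake SQL."""
    upper = name.upper()
    # An embedded double quote is escaped by doubling it.
    return '"' + upper.replace('"', '""') + '"'


def select_columns(table: str, columns: list) -> str:
    """Build a SELECT statement with every reference uppercased and quoted."""
    cols = ", ".join(quote_identifier(c) for c in columns)
    return f"SELECT {cols} FROM {quote_identifier(table)}"
```

For example, a column named `order` becomes `"ORDER"`, which parses as an identifier and not as the reserved word.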