11.28.2023 Release Notes

These are the release notes for November 28, 2023.

:newspaper: NEWS :newspaper:

  • Coming Soon: Pre-defined Ascend Cluster Pool sizes!
    • You told us that predefined "T-Shirt 👕"-based sizes for Ascend Clusters would greatly simplify cluster setup and optimization for workloads. We've listened to your feedback!
    • Customers will be able to choose from sizes ranging from 3XSmall to 4XLarge.
    • Documentation and guidance for use will be provided when the feature is released, so stay tuned!
  • Release Cadence Changes:
    • You may have noticed that our release cadence has changed: we now release about every 3 weeks, except around holidays or company events.
    • This change was made directly in response to customer requests and is intentional as we adjust our process.
    • This change increases the size of our releases, so you'll notice more changes in each one.
    • Your feedback on the new release cadence is welcome. More to come on this topic too, so stay tuned!

:sparkles: ENHANCEMENTS :sparkles:

  • All environments (Gen1/Gen2)
    • Add the ability to see the "Created By" and "Last Edited By" users in the Component UI (finally!).
    • We've added two new security roles that can be applied at the Data Service and Dataflow level:
      • Read Only (Data Restricted): a more restricted role than a read-only user, unable to view records, partition information, Data Quality, or the Debug panel.
      • Operator: similar to the Read Only role, with the added ability to refresh components, reset errors, and pause/resume components.
    • Add an option in Azure Event Hubs connection config to skip past data loss errors upon encountering them.
    • Add a Data Version field in Kafka and Azure Event Hubs connectors to support triggering full reprocessing.
      • Change the Data Version field to any value that differs from the previous one, and the component will be completely re-initialized (existing data purged and the current cursor reset).
    • Add a Read Connector-level AccountId override field for the Facebook Ads connection (currently in Private Preview).
    • Include command outputs and errors for Python pip installation failures in Custom Python Read Connectors.
  • Gen2 environments
    • Upgrade spark-bigquery-connector from 0.32.2 to 0.33.0.
    • Upgrade the default Spark version for Ascend Clusters to 3.4.0 for all new/empty Gen2 environments.
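
The Data Version behavior for the Kafka and Azure Event Hubs connectors described above can be pictured with a small sketch. This is only an illustration of the stated semantics (a changed value purges data and resets the cursor); `ConnectorState` and `apply_data_version` are hypothetical names, not part of the Ascend product or SDK.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ConnectorState:
    """Hypothetical stand-in for what a streaming read connector has stored."""
    data_version: str
    cursor: Optional[int] = None               # last consumed position, if any
    records: List[str] = field(default_factory=list)


def apply_data_version(state: ConnectorState, configured_version: str) -> bool:
    """Re-initialize the state when the configured Data Version has changed.

    Returns True if a full reset happened, False if nothing changed.
    """
    if configured_version == state.data_version:
        return False                            # same value: keep data and cursor
    state.records.clear()                       # purge existing data
    state.cursor = None                         # reset the current cursor
    state.data_version = configured_version
    return True


state = ConnectorState(data_version="v1", cursor=42, records=["a", "b"])
unchanged = apply_data_version(state, "v1")     # same value, no reset
reset = apply_data_version(state, "v2")         # new value, full reset
```

In this sketch any value works as long as it is not equal to the stored one, which mirrors the note that the field only needs to differ from its previous value.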

:wrench: BUGFIXES :wrench:

  • All environments (Gen1/Gen2)
    • Fix BigQuery Read Connector schema generation error: "Length of object does not match with length of fields".
    • Resolve "400 Incompatible table partitioning specification. Expects partitioning specification interval(type:day,field:date) clustering(field), but input partitioning specification is interval(type:day,field:date)" error in BigQuery Write Connector.
    • Fix the "Write strategy cannot be changed" error when attempting to update the Write Strategy in a Write Connector that is not a BigQuery Write Connector.
    • Add ability to enable case-sensitivity when using an incremental column in Snowflake Read Connectors.
    • Resolve "Invalid Identifier ASCEND__PARTITION_ID" error in sequential Merge Transforms, as well as when using a Data Share before (or after) a Merge Transform.
    • Prevent users from updating the name of a Data Service to be the same as another Data Service that already exists.
    • Fix a bug where the Read Connector Schema Mismatch Check invokes the default Ascend cluster instead of the Ascend Cluster configured for ingest.
    • Fix an issue where non-Site Admin users could not see graphs in the Observe feature.
    • Fix a bug in the UI that causes very long table names to render incorrectly on Component UI views.
    • Fix instances of "Stream removed" error that occur when multiple Read Connectors refresh at the same time.
    • Resolve "AttributeError: module 'numpy' has no attribute 'bool'" error when using Pandas.
    • Resolve "No such file or directory: 'pycache'" errors in Custom Python Read Connectors by synchronizing parallel pip installation commands.
    • AWS customers only
      • Fix a regression by reinstating an accidentally removed expiration policy for the tmp Ascend S3 bucket in Ascend environments running on AWS.
        • We also changed where temporary objects are written to take advantage of this bucket with the expiration policy applied.
        • This will help customers on AWS avoid unexpected storage costs over time.
      • Enforce the use of Instance Metadata Service Version 2 (IMDSv2) in the Amazon Machine Image (AMI).
  • Gen2 environments
    • All Data Planes
      • Remove the timeout during task processing to prevent task failures like "timeout on waiting for Spark cluster" caused by a slow-starting Ascend Cluster.
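
The pip fix listed above mentions synchronizing parallel installation commands. One common way to do that within a single process is to guard the install step with a lock, sketched below. This is an assumption about the general technique, not a description of how Ascend implemented the fix; `synchronized_install`, `run_installs`, and `pip_installer` are hypothetical names, and installs running in separate processes would need a file-based lock instead.

```python
import subprocess
import sys
import threading
from typing import Callable, List

# One process-wide lock so only a single install runs at a time (assumed approach).
_install_lock = threading.Lock()


def pip_installer(package: str) -> None:
    """Install a package with the current interpreter's pip."""
    subprocess.run([sys.executable, "-m", "pip", "install", package], check=True)


def synchronized_install(package: str, installer: Callable[[str], None]) -> None:
    """Run the installer while holding the lock, so installs never overlap."""
    with _install_lock:
        installer(package)


def run_installs(packages: List[str], installer: Callable[[str], None]) -> None:
    """Start one thread per package; the lock serializes the actual installs."""
    threads = [
        threading.Thread(target=synchronized_install, args=(pkg, installer))
        for pkg in packages
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Calling `run_installs(["requests", "pyyaml"], pip_installer)` would still request both installs from separate threads, but they would execute one after the other, so concurrent writes to shared cache directories are avoided.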

:warning: DEPRECATED :warning:

  • Spark 3.0.0 support in Ascend Docker Container images is deprecated and will be removed.
    • No impact to existing customers is expected.