11.28.2023 Release Notes

These are the release notes for November 28, 2023.

📰 NEWS 📰

Coming Soon: Pre-defined Ascend Cluster Pool sizes!
- You told us that predefined "T-Shirt 👕"-based sizes for Ascend Clusters would greatly simplify cluster setup and optimization for workloads. We've listened to your feedback!
- Customers will be able to choose from sizes ranging from 3XSmall to 4XLarge.
- Documentation and guidance for use will be provided when the feature is released, so stay tuned!
Release Cadence Changes:
- You may have noticed a change to our release cadence: releases now ship about every 3 weeks, except around holidays or company events.
- This change was made directly in response to customer requests and is intentional as we adjust our process.
- This change increases the size of our releases, so you'll notice more changes in each one.
- Your feedback on the new release cadence is welcome. More to come on this topic too, so stay tuned!

✨ ENHANCEMENTS ✨

All environments (Gen1/Gen2)
- Add the ability to see the "Created By" and "Last Edited By" user in the Component UI (finally!).
- We've added two new security roles that can be applied at the Data Service and Dataflow level:
  - Read Only (Data Restricted): more restricted than the Read Only role; users with this role cannot view records, partition information, Data Quality, or the Debug panel.
  - Operator: similar to the Read Only role, with the added ability to refresh components, reset errors, and pause/resume components.
- Add an option in Azure Event Hubs connection config to skip past data loss errors upon encountering them.
- Add a Data Version field in Kafka and Azure Event Hubs connectors to support triggering full reprocessing.
  - Change the Data Version field to any value that differs from the previous one, and the component will be completely re-initialized (existing data purged and current cursor reset).
- Add a Read Connector-level AccountId override field for the Facebook Ads connection (currently in Private Preview).
- Include command outputs and errors for Python PIP installation failures in Custom Python Read Connectors.
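
The Data Version behavior described above only requires a value that differs from the previous one. As a minimal sketch, assuming the field accepts an arbitrary string, a timestamp plus a short random suffix is one way to produce such a value. The helper name `next_data_version` and the `v-...` format are hypothetical and not part of the Ascend product:

```python
import uuid
from datetime import datetime, timezone


def next_data_version(previous: str) -> str:
    """Return a Data Version string that differs from `previous`.

    Hypothetical helper: the "v-<timestamp>-<suffix>" format is an
    assumption for illustration, not a format the connector requires.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    candidate = f"v-{stamp}-{uuid.uuid4().hex[:8]}"
    # A collision is very unlikely, but retry until the value really differs.
    while candidate == previous:
        candidate = f"v-{stamp}-{uuid.uuid4().hex[:8]}"
    return candidate


new_version = next_data_version("v1")
```

Keeping the timestamp in the value also leaves a readable record of when the full reprocessing was triggered.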
Gen2 environments
- Upgrade spark-bigquery-connector from 0.32.2 to 0.33.0.
- Upgrade the default Spark version for Ascend Clusters to 3.4.0 for all new/empty Gen2 environments.

🔧 BUGFIXES 🔧

All environments (Gen1/Gen2)
- Fix BigQuery Read Connector schema generation error: "Length of object does not match with length of fields".
- Resolve "400 Incompatible table partitioning specification. Expects partitioning specification interval(type:day,field:date) clustering(field), but input partitioning specification is interval(type:day,field:date)" error in BigQuery Write Connector.
- Fix the "Write strategy cannot be changed" error when attempting to update the Write Strategy in a Write Connector that is not a BigQuery Write Connector.
- Add the ability to enable case-sensitivity when using an incremental column in Snowflake Read Connectors.
- Resolve "Invalid Identifier ASCEND__PARTITION_ID" error in sequential Merge Transforms, as well as when using a Data Share before (or after) a Merge Transform.
- Prevent users from updating the name of a Data Service to be the same as another Data Service that already exists.
- Fix a bug where the Read Connector Schema Mismatch Check invokes the default Ascend cluster instead of the Ascend Cluster configured for ingest.
- Fix an issue where non-Site Admin users cannot see graphs in the Observe feature.
- Fix a bug in the UI that causes very long table names to render incorrectly on Component UI views.
- Fix instances of "Stream removed" error that occur when multiple Read Connectors refresh at the same time.
- Resolve "AttributeError: module 'numpy' has no attribute 'bool'" error when using Pandas.
- Resolve "No such file or directory: 'pycache'" errors in Custom Python Read Connectors by synchronizing parallel pip installation commands.
- AWS customers only
  - Fix a regression by reinstating an accidentally removed expiration policy for the tmp Ascend S3 bucket in Ascend environments running on AWS.
    - We also changed where temporary objects are written to take advantage of this bucket with the expiration policy applied.
    - This will help customers on AWS save on unexpected storage costs over time.
  - Enforce the use of Instance Metadata Service Version 2 (IMDSv2) in the Amazon Machine Image (AMI).
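
If the NumPy `AttributeError` listed above also shows up in your own transform code, a common cause is the `np.bool` alias, which NumPy deprecated in 1.20 and removed in 1.24. The sketch below shows the portable spellings. It illustrates the general pattern for user code and is not a description of the change Ascend shipped:

```python
import numpy as np

# Portable spellings of the boolean dtype: the builtin `bool` or `np.bool_`.
# Avoid the `np.bool` alias in code that must run across NumPy versions.
mask = np.array([1, 0, 1], dtype=np.bool_)
flags = np.asarray([0, 2, 0]).astype(bool)
```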
Gen2 environments
- All Data Planes
  - Remove the timeout during task processing to prevent task failures like "timeout on waiting for Spark cluster" caused by a slow-starting Ascend Cluster.
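
For AWS customers, the IMDSv2 enforcement noted above matters mainly for custom code that queries instance metadata directly. AWS SDKs normally handle this for you. As a rough sketch, assuming such code exists in your environment, IMDSv2 requires a session token obtained with a `PUT` request before any metadata read. The function names here are hypothetical, and `fetch_metadata` only works when run on an EC2 instance:

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Build the PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: str) -> urllib.request.Request:
    """Build a metadata GET request that carries the session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/{path}",
        headers={"X-aws-ec2-metadata-token": token},
    )


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Fetch one metadata value; requires network access to the IMDS endpoint."""
    with urllib.request.urlopen(build_token_request(), timeout=timeout) as resp:
        token = resp.read().decode()
    request = build_metadata_request(path, token)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode()
```

Requests sent without the token header, as in IMDSv1, are rejected once IMDSv2 is enforced.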

⚠️ DEPRECATED ⚠️

Spark 3.0.0 support in Ascend Docker Container images is deprecated and will be removed.
- No impact to existing customers is expected.