Ascend Developer Hub

Data Access Methods

External access to data in Ascend supports a number of use cases, and there are two primary ways to set it up:

  • External data access via the Web API
  • External data access via the Structured Data Lake (SDL) API

Python SDK and CLI


The Ascend Python SDK is easy to install and gives you controlled access for streaming data from any component in Ascend to your own machines.

Common Use Cases

A common use case for the Web API is a Python notebook, such as Jupyter or Zeppelin, used to navigate dataflows within Ascend and to stream records from any component in a dataflow into the notebook for additional processing. Over time, additional capabilities such as a Queryable Dataflows SDK and a CLI (via the Python SDK) will build on the Web API.
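Once records are streaming into a notebook, it is often convenient to process them in small batches rather than one at a time. The sketch below shows one way to do that with the standard library only. It does not use the Ascend SDK: `fake_stream` is a hypothetical stand-in for whatever record iterator the SDK returns, and `batched_records` is an illustrative helper name, not part of any Ascend API.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, List

Record = Dict[str, Any]


def batched_records(stream: Iterable[Record], batch_size: int) -> Iterator[List[Record]]:
    """Yield lists of up to `batch_size` records from any iterable of records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(stream)
    while True:
        # islice pulls at most batch_size items without reading the whole stream.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def fake_stream(count: int) -> Iterator[Record]:
    """Hypothetical stand-in for records streamed from a dataflow component."""
    for i in range(count):
        yield {"id": i, "value": i * i}


batches = list(batched_records(fake_stream(5), batch_size=2))
for batch in batches:
    print(len(batch), batch)
```

Because the helper accepts any iterable, the same loop should work unchanged if `fake_stream` is swapped for a real record stream.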

More details on configuration and usage can be found in the Jupyter and Zeppelin sections of the documentation.

Structured Data Lake

The Ascend Structured Data Lake (SDL) provides direct access to data in Ascend via a high-speed, byte-level, S3-compatible API. The data is formatted as Parquet, which makes SDL most useful for environments that have their own Spark infrastructure and for applications that can natively ingest Parquet.

Common Use Cases

A common use case for SDL is a Spark notebook, such as Databricks, Jupyter, or Zeppelin, that reads Parquet data directly from Ascend into native Spark dataframes. This gives users an interface more flexible than SQL and a secure, high-bandwidth path for processing fragment data with custom code. For example, a data scientist can read data from Ascend into a notebook and perform further analysis via a dataframe in Spark or even pandas.
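As a rough illustration of the Spark path described above, the sketch below assembles the Hadoop S3A settings that Spark typically needs to talk to an S3-compatible endpoint. The `fs.s3a.*` keys are standard Hadoop S3A options; everything else is an assumption: the endpoint host, bucket, and path segments are placeholders, path-style addressing may or may not be required by SDL, and the helper names are hypothetical. Consult the Structured Data Lake section for the actual values.

```python
from typing import Dict


def s3a_spark_conf(endpoint: str, access_key: str, secret_key: str) -> Dict[str, str]:
    """Build Spark config entries for an S3-compatible endpoint via Hadoop S3A."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Assumption: many S3-compatible services expect path-style addressing.
        "spark.hadoop.fs.s3a.path.style.access": "true",
    }


def parquet_uri(bucket: str, *parts: str) -> str:
    """Join a bucket and path segments into an s3a:// URI."""
    path = "/".join(p.strip("/") for p in parts if p.strip("/"))
    return f"s3a://{bucket}/{path}"


# Placeholder values only; none of these come from the Ascend documentation.
conf = s3a_spark_conf("https://sdl.example.com", "EXAMPLE_KEY", "EXAMPLE_SECRET")
uri = parquet_uri("example-bucket", "my_dataflow/", "my_component")
print(uri)

# With PySpark installed, the settings could be applied roughly like this:
#   builder = SparkSession.builder
#   for key, value in conf.items():
#       builder = builder.config(key, value)
#   df = builder.getOrCreate().read.parquet(uri)
```

Keeping the configuration in a plain dictionary makes it easy to reuse across Databricks, Jupyter, or Zeppelin, where the session is often created in different ways.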

Other use cases include:

  • Any external Spark infrastructure, such as Azure Databricks, Amazon EMR, or Google Dataproc
  • Any application that can natively work with Parquet-formatted data, such as Presto
  • Most S3-compatible clients, such as Amazon's aws s3 CLI
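For the S3-compatible client case, the AWS CLI can be pointed at a non-AWS endpoint with its `--endpoint-url` option. The sketch below only builds and prints such a command, so it runs without the CLI installed. The endpoint host, bucket, and prefix are placeholders, and `aws_s3_ls_command` is a hypothetical helper name; the real endpoint and credentials come from your Ascend environment.

```python
import shlex
from typing import List


def aws_s3_ls_command(endpoint_url: str, bucket: str, prefix: str = "") -> List[str]:
    """Build an `aws s3 ls` argument list aimed at an S3-compatible endpoint."""
    target = f"s3://{bucket}/{prefix.lstrip('/')}"
    return ["aws", "s3", "ls", target, "--endpoint-url", endpoint_url]


# Placeholder values only; substitute the endpoint and bucket for your environment.
cmd = aws_s3_ls_command("https://sdl.example.com", "example-bucket", "my_dataflow/")
command_line = shlex.join(cmd)
print(command_line)

# To actually run it (requires the AWS CLI and valid credentials):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if a prefix ever contains spaces.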

More details on configuration and usage can be found in the Databricks, Jupyter, Zeppelin, and Structured Data Lake sections of the documentation.

Updated 7 days ago
