Dataflows are continuously running data pipelines. You design Dataflows with a combination of declarative configurations, SQL, and Python. As you design a Dataflow, you create a blueprint that is sent to the Ascend Dataflow Control Plane to be automated.
Dataflows have multiple child objects:
- Read Connectors: Read Connectors are how Ascend connects to and synchronizes data coming into a Dataflow. Ascend supports many out-of-the-box connectors, as well as a Custom Read Connectors framework for writing your own.
- Transforms: Transforms are how data is modified within Ascend. Transforms can be written in either SQL or PySpark.
- Write Connectors: Write Connectors are how Ascend connects to and synchronizes data coming out of a Dataflow.
- Data Feeds: Data Feeds are how you connect multiple Dataflows together, as well as expose "data as an API" to external systems such as Jupyter, Tableau, and Zeppelin.
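To make the Transform concept concrete, here is a minimal sketch of what a SQL Transform might look like. The input name `orders` and its columns are illustrative assumptions, not part of Ascend's API: in practice, a Transform's inputs are the upstream Read Connectors or Transforms in your Dataflow.

```sql
-- Hypothetical SQL Transform: aggregate completed orders per customer.
-- "orders" stands in for an upstream component in your Dataflow;
-- the column names are assumed for illustration.
SELECT
  customer_id,
  COUNT(*)    AS order_count,
  SUM(amount) AS total_spent
FROM orders
WHERE status = 'complete'
GROUP BY customer_id
```

The same logic could equally be expressed as a PySpark Transform; SQL is shown here because it is the more compact of the two options.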
- Select the Data Service dropdown in the top-left corner of the browser window, and select the Data Service in which you would like to create your new Dataflow.
- Select the Dataflow dropdown next to the Data Service dropdown, and then click + Create a new Dataflow.
This brings up a panel where you give the new Dataflow a name (required). You can also enter a description of your Dataflow (optional).
Click the Create button, and a new Dataflow is generated in Ascend!
Once you've set up your initial Dataflow(s), we suggest coming back here to explore some advanced Dataflow topics, such as:
- Importing & Exporting: Making the most of Ascend's declarative model for rapid branching & merging of Dataflow definitions.
- Workspace: Using the Dataflow Workspace to navigate & explore other components while editing.
- Performance Tuning: Taking advantage of Ascend's Profiling & Statistics, which analyze every piece of data, and visualizing performance in the Dataflow Timeline.