Google BigQuery

Learn the required and optional properties of creating a Google BigQuery Connection, Credential, Read Connector, and Write Connector.

google bigquery logo

Prerequisites

  • Access credentials
  • Google Cloud project name
  • BigQuery dataset and table names
  • Data Schema (column names and column type)

Connection Properties

The following table describes the fields available when creating a new BigQuery Connection. Create a connection using the information below and these step-by-step instructions.

FieldRequiredDescription
Access TypeRequiredThis connection type is Read-Only, Write-Only, or Read-Write.
Connection NameRequiredInput your desired name.
ProjectRequiredProject name in Google BigQuery where the desired data is located.
DatasetOptionalDataset within the Google project. Use the name as it appears below the project in Google BigQuery. For ex., use rain_data, not united-states-weather:rain_data.
LocationOptionalSelect the location to restrict the execution of the data to either US or EU. Execution is not restricted by default.
Requires CredentialsRequiredThis box is automatically selected.

Credential Properties

The following table describes the fields available when creating a new BigQuery credential.

FieldRequiredDescription
Credential NameRequiredThe name to identify this credential with. This credential will be available as a selection for future use.
Credential TypeRequiredThis field will automatically populate with Google BigQuery.
Google Cloud CredentialsRequiredPrivate key used to identify the Google account. When a Service Account is created, a private key is produced to provide authentication between Google Cloud and third party platforms.

The private key will be a block of JSON, provide the entire key which includes the type, project_id, private_key_id, private_key, client_email, client_id, auth_uri, token_uri, auth_provider_x509_cert_url and client_x509_cert_url.

Keep in mind that Google Service Accounts are bound to a project. Each Google project will require a new credential object.

Read Connector Properties

The following table describes the fields available when creating a new BigQuery Read Connector. Create a new Read Connector using the information below and these step-by-step instructions.

FieldRequiredDescription
NameRequiredProvide a name for your connector. We recommend using lowercase with underscores in place of spaces.
DescriptionOptionalDescribes the connector. We recommend providing a description if you are ingesting information from the same source multiple times for different reasons.
DatasetRequiredDataset within a Google BigQuery project.
PartitioningRequiredHow you would like Ascend to partition the data. See Partitioning Strategies table below.
Table NameRequiredName of the table to be ingested.

Partitioning Strategies

The following table contains the configurations that are partition specific. Additional information on Google BigQuery Partitioning.

Partition StyleDescriptionRequired Fields
NoneNo partition is used and the data resides in one large table.TABLE NAME: Name used to identify the BigQuery table.
Partitioned TableA table that is separated into sections such as Ingestion time or Date. Google identifies partitioning as a best practice over sharding.TABLE NAME: Name used to identify the BigQuery table.

PARTITION PATTERN MATCHING: Select either Regex, Glob, or Match.

PARTITION PATTERN: Pattern used to identify the partition.
Sharded TableAnother way to sub section data. A subset of data can be pulled from multiple database tables based on a parameter such as Date.TABLE NAME PATTERN MATCHING: Select either Regex, Glob, or Match.

TABLE NAME PATTERN: Pattern used to identify the table.

Write Connector Properties

The following table describes the fields available when creating a new Google BigQuery Write Connector. Create a new Write Connector using the information below and these step-by-step instructions.

Field NameRequiredDescription
NameRequiredProvide a name for your connector. We recommend using lowercase with underscores in place of spaces.
DescriptionOptionalDescribes the connector. We recommend providing a description if you are ingesting information from the same source multiple times for different reasons.
UpstreamRequiredThe name of the previous connector the Write Connector will pull data from.
Output TableRequiredDirectory within dataset to write the data. If the table does not exist, it will be created.
Partition Field NameOptionalField name with which to divide the data into partitions.
Partition TypeOptionalSupported types are: HOUR, DAY, MONTH, YEAR. Defaults to DAY if Partition Field is specified. Not supported by the DIRECT write method.
Clustering FieldsOptionalComma-separated list of clustering columns for creating clustered table.
On Schema MismatchOptionalOptions are as follows:
- Skip schema check
- Stop and display error
- Recreate table
- Alter table
A SQL Statement for Ascend to Execute Before WritingOptionalHere, the user has the option to execute a pre-processing script before writing to final table.
A SQL Statement for Ascend to Execute After WritingOptionalHere, the user has the option to execute a pre-processing script after writing to final table.
Write StrategyOptionalFull table write: Default write strategy

Partition Write Using Ascend Partition Id: Use Ascend partitions when writing out the target BigQuery table. New partitions are inserted into the target table, and partitions that no longer appear in Ascend are deleted from the target table. This also handles changed partitions, because changed partitions have a new ID, so the old version of a changed partition is deleted and a new version is added automatically (similar to a "partition swap").

Partition Append: requires both partition field and partition type to be set. Ascend creates a target BigQuery table with partitions. After finding the largest value in the partition field for data already written, Ascend filters new partitions so that they only contain rows with partition field values greater than what is already present in the target table. Ascend only appends new partitions that contain rows which have a partition field value greater than what is already present in the target table.

Merge By Partition: requires both partition field and partition type to be set. Ascend creates a target BigQuery table with partitions, Ascend compares all partition IDs that exist in the target table to the ones in Ascend. New partitions from Ascend are inserted. If an existing partition in the target matches a partition in Ascend, Ascend compares the record count of the partition in target table vs. Ascend. If the record count differs, the partition in the target table is overwritten.

🚧

Changing Write Strategy

After you set a Write Strategy for a BigQuery Write Connector, it cannot be changed after creating the component. If you want to change the Write Strategy, you'll need to create a new Write Connector.


© Ascension Labs Inc. | All Rights Reserved