Google BigQuery

In this doc, we will cover how to set up a Connection to Google BigQuery.

Prerequisites

  • Access credentials
  • Google Cloud project name
  • BigQuery dataset and table names
  • Data schema (column names and column types)

Create a New Google BigQuery Connection

Figure 1

Figure 2

Fields denoted with a red dot are required.

  • ACCESS TYPE: Select whether this connection can be used for Read Connectors, Write Connectors, or both.
  • CONNECTION NAME: The name to identify the connection.
  • PROJECT: Project name in Google BigQuery where the desired data is located.
  • DATASET: Dataset within the Google project. Use the name as it appears below the project in Google BigQuery. For example, use rain_data, not united-states-weather:rain_data.
  • TEMPORARY GCS BUCKET FOR WRITE: The GCS bucket where temporary Parquet files are written before the BigQuery load; the connection's credentials require GCS write permission on this bucket. This field is necessary only for the Write-only and Read-Write Access types.
  • REQUIRES CREDENTIALS: If de-selected, only the publicly accessible datasets can be accessed.
  • CHOOSE CREDENTIALS: If the REQUIRES CREDENTIALS checkbox is selected, use the drop-down to choose an existing credential or create a new credential for connecting to Google BigQuery. The new credential will require BigQuery Data Viewer level access or higher to the Google project. Although higher permissions will still result in a successful connection, keep best practices in mind and adhere to the principle of least privilege. A sketch of how these connection fields map to a BigQuery client follows this list.
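
The connection fields above correspond to the same information a BigQuery client needs. The following is a minimal sketch (not the platform's internal implementation) of how the PROJECT, DATASET, and credential values could be verified with the google-cloud-bigquery Python client; the project name, dataset name, and key file path are placeholders.

```python
from google.cloud import bigquery
from google.oauth2 import service_account

# Placeholder values standing in for the connection fields above.
PROJECT = "united-states-weather"        # PROJECT field
DATASET = "rain_data"                    # DATASET field (name only, no project prefix)
KEY_PATH = "service-account-key.json"    # the JSON key entered as the credential

# Build a client the same way a Read connection would: project + service account key.
credentials = service_account.Credentials.from_service_account_file(KEY_PATH)
client = bigquery.Client(project=PROJECT, credentials=credentials)

# Confirm the dataset is reachable with these settings.
dataset = client.get_dataset(f"{PROJECT}.{DATASET}")
print(f"Found dataset {dataset.full_dataset_id}")

# List a few tables to confirm BigQuery Data Viewer (or higher) access.
for table in client.list_tables(dataset, max_results=5):
    print(table.table_id)
```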

Create New Credentials

  • CREDENTIAL NAME: The name to identify the credential.
  • CREDENTIAL TYPE: Google BigQuery.
  • GOOGLE CLOUD CREDENTIALS: The private key used to identify the Google account. When a Service Account is created, a private key is produced to provide authentication between Google Cloud and third-party platforms. In the GOOGLE CLOUD CREDENTIALS field, enter the private key for the selected service account. The private key is a block of JSON; provide the entire key, which includes the type, project_id, private_key_id, private_key, client_email, client_id, auth_uri, token_uri, auth_provider_x509_cert_url, and client_x509_cert_url fields (see the sketch after this list). Keep in mind that Google Service Accounts are bound to a project, so each Google project will require a new credential object.
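
As a rough illustration of what the credential contains, the sketch below (an assumption about local usage, not the platform's code) loads a downloaded service-account key, checks that the JSON fields listed above are present, and turns the key into Google credentials with the google-auth library. The file path is a placeholder.

```python
import json

from google.oauth2 import service_account

# Fields expected in the service-account key, as listed above.
EXPECTED_FIELDS = {
    "type", "project_id", "private_key_id", "private_key", "client_email",
    "client_id", "auth_uri", "token_uri",
    "auth_provider_x509_cert_url", "client_x509_cert_url",
}

# Path to the key file downloaded from the Google Cloud console (placeholder).
with open("service-account-key.json") as f:
    key_info = json.load(f)

missing = EXPECTED_FIELDS - key_info.keys()
if missing:
    raise ValueError(f"Service account key is missing fields: {sorted(missing)}")

# The whole JSON block is what goes into the GOOGLE CLOUD CREDENTIALS field;
# locally it can be turned into credentials like this.
credentials = service_account.Credentials.from_service_account_info(key_info)
print(f"Loaded credentials for {credentials.service_account_email}")
```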

Google BigQuery service account permissions

Service Account

  • IAM Role - BigQuery Job User

Read Only Connection

  • On DataSet → BigQuery Data Viewer

Read/Write Connection

  • On DataSet → BigQuery Data Editor or BigQuery Data Owner
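
For reference, dataset-level access like the above can also be granted programmatically. The sketch below is a hedged example using the google-cloud-bigquery client, with a placeholder dataset ID and service account email; it grants the dataset's legacy READER role, which corresponds to BigQuery Data Viewer (use WRITER for a Read/Write connection, corresponding to BigQuery Data Editor).

```python
from google.cloud import bigquery

# Placeholders: adjust to your project, dataset, and service account.
DATASET_ID = "united-states-weather.rain_data"
SERVICE_ACCOUNT = "connector-sa@united-states-weather.iam.gserviceaccount.com"

client = bigquery.Client()
dataset = client.get_dataset(DATASET_ID)

# Append a dataset-level access entry for the service account.
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="userByEmail",
        entity_id=SERVICE_ACCOUNT,
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
print(f"Granted READER on {DATASET_ID} to {SERVICE_ACCOUNT}")
```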

Google Cloud Storage Credentials

Enter the JSON key for the service account that has the Storage Admin and Storage Object Admin roles for the GCS path. Refer to the Google documentation for more details on GCS authentication.
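
The sketch below is an assumption about what the temporary bucket is used for, not the platform's internal code: it authenticates with a service-account JSON key and writes a Parquet file into the configured GCS bucket, which is the kind of operation those Storage roles must allow. The bucket name, key path, and file names are placeholders.

```python
from google.cloud import storage

# Placeholders for the temporary bucket and the key with Storage Admin /
# Storage Object Admin on it.
BUCKET_NAME = "my-temporary-write-bucket"
KEY_PATH = "gcs-service-account-key.json"

client = storage.Client.from_service_account_json(KEY_PATH)
bucket = client.bucket(BUCKET_NAME)

# Upload a local Parquet file to a temporary location before the BigQuery load.
blob = bucket.blob("tmp/bq_load/part-000.parquet")
blob.upload_from_filename("part-000.parquet")
print(f"Uploaded gs://{BUCKET_NAME}/{blob.name}")
```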

🚧

Storage API

We rely on the Google Storage API to write into GCS locations, so the Storage API must be enabled. It should be enabled by default in GCP, but if it is not, you can enable it by going to https://console.developers.google.com/apis/api/storage-api.googleapis.com/overview?project={your gcp id}.
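
If you want to check this from code rather than the console, one hedged approach (with placeholder project and bucket names) is simply to attempt a Storage call and surface the enable URL when it is rejected:

```python
from google.api_core import exceptions
from google.cloud import storage

PROJECT_ID = "your-gcp-project-id"          # placeholder
BUCKET_NAME = "my-temporary-write-bucket"   # placeholder

client = storage.Client(project=PROJECT_ID)
try:
    # Any simple Storage call will fail if the Storage API is disabled.
    bucket = client.lookup_bucket(BUCKET_NAME)
    print("Storage API reachable; bucket exists:", bucket is not None)
except exceptions.Forbidden as err:
    print("Storage API call was rejected; if the API is disabled, enable it at:")
    print(
        "https://console.developers.google.com/apis/api/"
        f"storage-api.googleapis.com/overview?project={PROJECT_ID}"
    )
    print(err)
```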

Testing Connection

Use the TEST CONNECTION button to check whether the credentials are correctly configured. If the test returns SUCCESS, click CREATE AND USE CONNECTION.
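
TEST CONNECTION is essentially a credentials-and-permissions check. A minimal sketch of an equivalent check outside the platform (the project, dataset, and key names are placeholders, and this is not the button's actual implementation) is a dry-run query against the configured dataset:

```python
from google.cloud import bigquery
from google.oauth2 import service_account

PROJECT = "united-states-weather"     # placeholder
DATASET = "rain_data"                 # placeholder
KEY_PATH = "service-account-key.json" # placeholder

credentials = service_account.Credentials.from_service_account_file(KEY_PATH)
client = bigquery.Client(project=PROJECT, credentials=credentials)

# A dry-run query validates credentials and permissions without scanning data.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(
    f"SELECT table_name FROM `{PROJECT}.{DATASET}`.INFORMATION_SCHEMA.TABLES",
    job_config=job_config,
)
print("Connection test succeeded; bytes that would be processed:",
      job.total_bytes_processed)
```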

