The aim of integrating Google Sheets with AWS Redshift Database is to enable people without access to the database to manage and maintain the content of a dictionary table used for processes carried out on the database.

Integration schema

Set up technical google account

In order to prevent Talend/Redshift users from making any changes to a synchronized Google Sheet, a technical Google account with read-only access to the synced sheet has to be set up. For this account, a Service Account Email generated Key File (p12) has to be set up and saved in a safe place for Talend OS users.

Prepare Redshift target table

A table with a schema referring to column configuration of data taken from the Google Sheet needs to be created in the Redshift database.

Talend job parameterization

Talend components tGoogleDrive and tGoogleSheetInput can be found on the TalendExchange page.

A Talend job contains the following steps:

TOS job steps

Establish a Client connection to GoogleDrive:

Set connection
Here you have to enter the Service Account Email address and location of Key File (p12).

List all files on GoogleDrive accessible from the service account:

list files

Limit access to the GoogleSheet that we are interested in:

apply filter
In the Value field you must enter the name of the GoogleSheet you want to sync.

Enter the File ID of the synced GoogleSheet on GlobalMap:

ID on GlobalMap

GoogleSheetInput parameterization

Using the credentials from point 1 and the File ID given to GlobalMap, enter the parameters referring to the synced GoogleSheet, such as the sheet name, which row to start reading data from, the letter of the column at which you want to stop reading. Define the column configuration: edit the schema and change values in the bottom configuration table:
GS parameters

Output the data fetched from the GoogleSheet to the Redshift table.