How to integrate Google Sheets with Redshift using Talend Open Studio
The aim of integrating Google Sheets with AWS Redshift Database is to enable people without access to the database to manage and maintain the content of a dictionary table used for processes carried out on the database.
Set up technical google account
In order to prevent Talend/Redshift users from making any changes to a synchronized Google Sheet, a technical Google account with read-only access to the synced sheet has to be set up. For this account, a Service Account Email generated Key File (p12) has to be set up and saved in a safe place for Talend OS users.
Prepare Redshift target table
A table with a schema referring to column configuration of data taken from the Google Sheet needs to be created in the Redshift database.
Talend job parameterization
Talend components tGoogleDrive and tGoogleSheetInput can be found on the TalendExchange page.
A Talend job contains the following steps:
Establish a Client connection to GoogleDrive:
Here you have to enter the Service Account Email address and location of Key File (p12).
List all files on GoogleDrive accessible from the service account:
Limit access to the GoogleSheet that we are interested in:
In the Value field you must enter the name of the GoogleSheet you want to sync.
Enter the File ID of the synced GoogleSheet on GlobalMap:
GoogleSheetInput parameterization
Using the credentials from point 1 and the File ID given to GlobalMap, enter the parameters referring to the synced GoogleSheet, such as the sheet name, which row to start reading data from, the letter of the column at which you want to stop reading. Define the column configuration: edit the schema and change values in the bottom configuration table: