Running python notebooks in Colab can significantly ease your daily tasks. Here is how to do it for one of those tasks in 6 simple steps.
Google provides you with some free CPU and even a small scale GPU that’s perfect for running simple tasks through its hosted Jupyter notebook app called Google Colab.
Formerly known as Google Colaboratory, now Google Colab requires no setup, and if you have a Google account, accessing it is as easy as visiting its website.
Once you’re in, all you need to click is the “New Notebook” button. Then you’ll have your own hosted notebook, ready to be used as below:
Your first notebook will create a location in your Google Drive automatically a folder called “Colab Notebooks” to store all your work automatically.
After a brief instruction of the tool, let’s dive into our code to append your sheets that are located in your Drive into a new Google Sheets. The whole process consists of 6 simple steps.
Step 1 — Authenticate Google Account to be able to make changes in your Google Sheets
In order to read and make changes in your Google Sheets documents, you first need to authenticate Google Colab to access your Google Drive documents. Even though both apps belong to Google, access between them requires exclusive authentication from your side.
If you run the code below in the first cell of the Notebook, you’ll be good to make any changes in your Google Drive documents.
See the full code in my Github page
Running the cell will generate a link and ask for the auth code.
Click the link to complete the sign-in process.
As stated in the image on the left, this authentication will allow you to see, edit, create, and edit all files in your Google Drive. So make sure to double-check any code you run from your notebook to avoid irreversible mistakes.
Allow the changes to be made in your documents.
Once allowed, the system will generate a code. Copy the code and paste it into Colab to complete the authentication process.
Step 2 — Read the files from the Google Sheet URLs and take the data into a pandas dataframe
The second step is to read the files in your Google Drive from the Colab notebook.
This step consists of opening the Google Sheets to read the data from, select the worksheets the data is located in, get the data, and finally place the data into a pandas dataframe.
One note here is that the link doesn’t have to be a publicly available one since all the permissions are granted, so you may just copy the link of your Google Sheets document from your browser.
Step 3 — Set the columns and specify the data range in the dataframe
After taking the data into a dataframe, setting the columns and specifying the data range is the following step.
If your columns are part of the data and are not positioned at the column part where currently there is “0” and “1” is located on the image here, then setting the columns and specifying the data range with the following code is needed. If you already have your column names in place, you may skip this step.
The updated df1 dataframe can be seen here when the execution is completed.
Step 4 — Choose the same column names from the different dataframes to append later
One key point in the append function is that the number of columns in the selected worksheet should match. Otherwise, the append command would generate an error in the next step. Therefore, you need to choose the columns with the same name in advance, to append the right values to each other.
In the example below, I’ve selected my “Sales” and “Brand” columns that exist in both sheets. You may rename them or add as much as columns you want, as long as all can be found in both files.
Be careful with the spaces before or after the column name.
#Step 5 — Append the dataframes
The data sets are finally ready to be appended. Therefore, we just need to run our single line code to complete the append function.
In this example, I’ve focused on append but based on your need you may also use “merge”, “join” or “concatenate” functions based on your need. You may check the pandas documentation to discover these other functions in detail.
It’s important to check if your append is successfully completed by running “df” see the latest version of your dataframe before writing your new data to your Google Sheets.
Step 6 — Open the Google Sheets file you want to write your new data in and then insert the data with the set function
Last but not the least, you need to open a new Google Sheet document and choose its relevant worksheet to write your new dataframe into it with the “set_with_dataframe” function.
Congrats! You have now appended your Google Sheets without even downloading them to your computer. This doesn’t mean that you cannot do the same thing locally on your computer but since that’s a more common practice, I haven’t covered it in this piece.
Also, the aim of this piece was not only to show you append sheets but also make you aware that you have a free processing power in the cloud, called Google Colab and show you how to interact with your files in your Google Drive to be able to use it for many other use cases.
Till next time!
This article is originally published on Medium.