Really, really basic Jupyter Notebook connected to a Google Sheet and scheduled on NBFire

Really, really basic Jupyter Notebook connected to a Google Sheet and scheduled on NBFire

Recently, I have written a few articles about how easy it is to get Google Sheets to work with Jupyter Notebooks and scheduled on NBFire. This is a powerful combination that enables you to build complex, automated workflows that save you time and effort. A couple of your comments have suggested that if you are coming to this as a Google Sheet user with not much Python experience you would prefer an even more straight forward example, with even less Python. So, you asked for it, here is just about the simplest Jupyter Notebook that I could think of that works with a Google Sheet and NBFire

The workflow is

  1. Generate a random number and a date in a Jupyter Notebook.
  2. Schedule the notebook to run every night, so that it…..
  3. …. adds the data to a Google Sheet every day.

What could be simpler? But, imagine if instead of generating a random number we pulled financial data from various sources or picked up data from remote devices. You would see the results in your spreadsheet when you start work the next day. This basic workflow easily lends itself to building complex financial workflows.

I have assumed you have Jupyter running in your local machine. If not, you can find instructions on how to install it here.

So, let’s crack on. First, let’s set ourselves up with NBFire. Go to NBFire and signup for a FREE account. Then, go to the application.

Next set ourselves up on Google Cloud

Create a project

Create a project in the Google Cloud Console by clicking on create project in the console.

Enable Google Drive API and Google Sheets API

Enable the Google Drive API and Google Sheets API. In the Google Cloud Console menu, select APIs & Services and then Enabled APIs & Services.

Click on “ENABLE APIS and SERVICES” and then search for “Google Drive API” and then “Google Sheets API”. In the search results, click on the related icons as shown below. Then click on ENABLE API in the page that opens up.

Create credentials

Now that we have the Google Drive API enabled, we need to create credentials. At the top right corner of the page that opened up after enabling the API, you will see the CREATE CREDENTIALS icon and then select service account credentials.

Give it a name and click CREATE AND CONTINUE. Then when the grant pages open up, just hit CONTINUE and DONE. Click on the link in the email part.

It will take you to the service account details page. Copy the email address shown here and save it locally, we will be using it later on in this tutorial.

We also need to generate a key. Go back to the service accounts page. Click on the three dots under the actions.

Click on MANAGE KEYS and then ADD a KEY. It will ask you to choose the format, select JSON and hit CREATE. The json file will automatically be downloaded. Again, save this key somewhere locally where you can find it. We will be using it further down in the tutorial.

Create our Google Sheet.

Create a new Google Sheet and name it Simple Jupyter Sheet. Add two columns, one called Date and the other called Value.

Share our Google Sheet with our service account

Now click on “Share” on the top right hand side and enter the Google service account email address you saved earlier. Finally copy your unique ID from the URL. From the image above you can see for my Google Sheet it begins with “1zn5…” and ends with “…sPc8”. Save this ID, we are going to use it in our Jupyter Notebook.

Create our Jupyter Notebook

Open Jupyter on your local notebook and create a new Notebook. Click on FILE>SAVE AS and save the notebook as “Simple Google Sheet”.

Click on the first cell add the following code to the first cell to make sure all the Python packages we need to run the notebook are installed.

!pip install datetime
!pip install requests
!pip install gspread
!pip install json
!pip install random        

The first cell should look like this.

Now, click on the ‘+’ icon to create a new cell below. In the next cell, let’s import those packages.

from datetime import datetime
import random
import requests
import json
import gspread        

Now, click on the ‘+’ icon to create a new cell below.

In this cell we are going to bring in the Google Service account credentials we need to access the Google Sheet. We don’t want to add them to the Notebook itself because this is secret information that we don’t want to share with anybody else. So, instead we going to add them as a parameter on NBFire.

To do this we need to add a ‘parameter’s tag to this cell. Click into the new cell and then choose VIEW>CELL TOOLBAR>TAGS and in the new box add a tag called ‘parameters’ then ENTER. Your cell should now look like this.

In this cell add the following.

# To make it easier to update and share this notebook we will be entering sensitive keys are runtime using nbfire.  T
# This cell is a placeholder for these values.  This is why the cell has a 'parameters' tag added.  For more
#information please read https://papermill.readthedocs.io/en/latest/usage-parameterize.html
credentials = ""        

Your notebook should now cell look like this.

Now, click on the ‘+’ icon to create a new cell below and enter the following code. Here we are just making sure we handle the credentials correctly, if they are entered as a parameter they will be a string, but if you want to paste them into the above cell, remove the “” and they will be processed as a dict. Basically, don’t worry about this cell, or, if you are worried, post me a comment below.

# We need to handle both Google API credentials added as a string entered as a parameter on NBFire and as a dict
# entered directly in the cell above
if isinstance(credentials, dict):
    sa = gspread.service_account_from_dict(credentials)
elif isinstance(credentials, str):
    creds = json.loads(credentials)
    sa = gspread.service_account_from_dict(creds)        

Now, click on the ‘+’ icon to create a new cell below and enter the following code. You need to replace the ‘enter the id from the Google Sheet URL here’ with the URL you saved earlier.

#We open the Google Sheet via the ID in the URL and load in the worksheet we want to work with
sheet = sa.open_by_key('enter the ID from the Google Sheet URL here')
work_sheet = sheet.worksheet("Sheet1")        

Now, click on the ‘+’ icon to create a new cell below and enter the following code. This is where we actually do the work. Everything before this point is just set up and configuration.

In this cell we

  1. Generate a random number between 1 and 99999.
  2. Create today’s date.
  3. Create today as a user friendly version of today’s date.
  4. Create a new entry to be added to our Google Sheet, consisting of today’s date in the friendly format and the random number.
  5. We print our new entry so we can see what is being added in the output notebook.

#Now we simply generate our random number and today's date.  We create a new_entry object to be added to the
#worksheet.  We then append the new_entry to the worksheet as a new row
random_number = random.randint(1,99999)
now = datetime.now()
today = now.strftime('%d-%m-%Y')
new_entry = [today, random_number]
print(new_entry)        

Now, click on the ‘+’ icon to create a new cell below and enter the following code. Here we are simply adding our new entry to our worksheet

#Now we can add the "new entry data to our Google Sheet"
work_sheet.append_row(new_entry)        

Great, that is our simple Jupyter Notebook complete, save it FILE>DOWNLOAD AS and choose NOTEBOOK.

Alternatively you can download the complete notebook here

Upload our Jupyter Notebook on to NBFire

Go back to NBFIRE and go to the application. Upload your notebook here.

Schedule our Notebook

You can use the NBFire scheduler to schedule when this notebook runs. Here I have chosen it to run every hour.

Add our Google Account Credentials as the credentials parameter we added earlier.

Now you can click on FIRE SCHEDULE and your notebook will run every ho

r.

There you are. This is the simplest scheduled Notebook that I can think of that works with Google Sheets. For more advanced version check out this article. Or go here.

Wow, your integration of Jupyter Notebook with Google Sheets using NBFire shows incredible precision! Love how you're leveraging these tools. Diving deeper into Python libraries could elevate your projects even more. Have you considered exploring Pandas for data manipulation? What sector are you aiming to work in after your internship?

回复

要查看或添加评论,请登录

Philip Clarke的更多文章

社区洞察

其他会员也浏览了