BigQuery Chapter 2A: Interacting with the Platform

BigQuery Chapter 2A: Interacting with the Platform

Hello All, in this edition of Learning BQ, we will focus on Interacting with BigQuery console, and creating the building blocs within BQ. You can say that this is where the practical exposure would start, and in the next article, it will gain momentum. In this article we are going to talk about:

  1. Costing of BQ
  2. Building Blocs of BQ:
  3. Querying within BQ

As always, before you proceed any further, remember that this is 5th article of the series, and it would really make a lot of sense if you follow the series from article 1, rather than from anywhere in between.

  1. Power of Statistics
  2. Introduction to BQ Series
  3. BQ Chapter 1A
  4. BQ Chapter 1B

Now that you are all caught up, let’s understand how do we interact with the BQ platform. As we discussed earlier, there are 3 basic ways of accessing BigQuery, which if you remember is nothing but a way to interact with Dremel:

  • Web UI
  • Command Line tool
  • APIs

For our learning we will only focus on Web UI. To start accessing the BQ UI, first we need to create a project in GCP, Google Cloud Platform, by clicking here. Once you have created the account, we need to enable billing here, and before we do that, let me tell you about the billing details of BQ:

1)     As and when you sign up with your account, Google very generously adds $300 in your account for trial purposes. But you will still have to enter your credit card details to avail the offer

2)     Now let’s understand the pricing structure, since you are putting your card details:

a.      $0.2 for every 1 GB of data, where first 10 GB is free (Active Storage)

b.      $0.1 for every 1 GB of data, where first 10 GB is free (Long term storage)

c.      $0.1 for every 200 MB of data, in case of streaming inserts, where you will be charged only for successful insertion of data

d.      $5 for every 1 TB of data for querying, where first 1 TB is free.

3)     For more details please go through this link

Considering you have just created a project, you will have to enable billing for your project:

1)     Click on top left corner menu button to open a navigation panel and then click on billing-

2)     Create a new billing account, enter your details, including the card details and finally click on Submit and Enable billing for your GCP project.

Since BQ is already enabled on the new accounts, so you wouldn’t have to worry about enabling the BQ services.

Now that you are ready to jump into the console, again click on top left corner menu button, scroll down on the left navigation panel and click on BigQuery

And welcome to your BQ interface

Now there’s a probability that you might land in a different UI, the classic BQ UI, worry not, just click on the new UI button on top right corner and you will be there in the same console as above:

Building Blocs:

1)     Dataset: These are the top Level containers used to organize and control access to your data. Tables and Views are created within datasets for us to play around with the data. You are never charged to create or update the datasets.

2)     Tables: BQ contains all records organized in rows. Each record is composed of columns, also known as fields. Every table is defined by schema, column names, data types and other information. As a best practice specify the schema of the table when it is created.

3)     Views: These are virtual tables defined by SQL queries. It contains data only from tables and fields as specified in the query.

Now let’s learn about how to use them:

Datasets: Whenever you create a dataset, you need to specify it’s location. After you create the dataset, you can’t change the location, so please be very sure of the location. If you are querying an external dataset, like Google Cloud Storage, make sure that your location of both, GCS and BQ are the same.

Click on Create Dataset as in the picture above, put up the dataset ID, location, and expiration time. For learning purpose you can keep it as default, 60 days after table creation. Once all the values are set, click on the Create Dataset button as below:

Tables: Create Table within a Dataset by first selecting your dataset in the left panel, and then clicking on Create Table as shown below:

Now create a Table as you want, it could be an Empty Table, Uploaded from your local system, from Google Cloud Storage, or your Google Drive. Remember to specify the Table Schema by adding fields, their Data Types before you click on Create Table button as shown below:

Views: Creating and using views is extremely simple, just run the query that you might want to, for example, as below, you can write a simple query:

SELECT name, station_id as id, location FROM `bigquery-public-data.austin_bikeshare.bikeshare_stations` LIMIT 1000

Here what you have to notice is that ‘bigquery-public-data’ is the project ID, ‘austin_bikeshare’ is the dataset, and ‘bikeshare_stations’ is the Table name from where we are querying.

Once you have your result, just click on saving the view as shown above. This will open a window for you to save the view in the desired location.

Querying within BQ:

Now this is the easy part, all you have to do is, know a bit of SQL, and then fly away. For example I have written a small query to join two tables as below:

Table A: ID_Table

Table B: Title

All you have to do is, upload the tables as CSV as described in the above sections, and then write a query like below:

SELECT a.id, a.first_name, a.last_name, b.title FROM `sampleproject-220310.Checkagain.id_table` a LEFT JOIN `sampleproject-220310.Checkagain.title` b on a.id = b.id

To get a LEFT OUTER Join output as:

For those who have never worked on SQL, I did create a document long time back for some friends who had no idea about SQL but wanted to learn basics. I am putting the link here, and I truly do believe that it will help you scale up.

Click here to view the SQL Doc

This is where we will take a break for this week, and next week we will write some complex queries within BQ, and will try to integrate the same with Google Analytics, SFDC and maybe your CRM data files, so that we can create a complete customer profile within the BQ platform itself for further analysis.

Hope you are liking the series, in case you want me to expand on some aspects, please do feel free to let me know through comments.

要查看或添加评论,请登录

Kunal Mehta的更多文章

  • Key Terms for Gen AI

    Key Terms for Gen AI

    Hello Everyone, as we are exploring the field of Gen AI, we have already gone through 2 ‘chapters’ so to say. First was…

    2 条评论
  • Introduction to LangChain

    Introduction to LangChain

    LangChain is a framework in place to develop LLM applications. The best part about this framework is that it supports…

    11 条评论
  • Making sense of the biggest lockdown of the world with numbers

    Making sense of the biggest lockdown of the world with numbers

    The purpose of the article is 3 folds, 1) To understand the COVID-19 impact in India, in terms of number of cases and…

    61 条评论
  • Marketing Strategies for Hotel Industry: Leveraging GCP to boost your strategies

    Marketing Strategies for Hotel Industry: Leveraging GCP to boost your strategies

    I have always been asked whether Google Cloud Platform’s auto ML options, primarily BigQuery ML, can come close to the…

    6 条评论
  • Understanding Polynomial Linear Regression

    Understanding Polynomial Linear Regression

    Whenever we talk about regression, more often than not people assume simple linear regression, an equation which seems…

  • Adobe Analytics comes to Google Cloud Platform

    Adobe Analytics comes to Google Cloud Platform

    Two of the biggest webanalytics solution providers have been slugging it out for far too long it seems, and now…

    13 条评论
  • BigQuery: Chapter 1B - Intro Continued

    BigQuery: Chapter 1B - Intro Continued

    This time we are going to talk about the real reasons why we, as marketers, need to use data analytics, that too with…

    3 条评论
  • BigQuery: Chapter 1A - An Introduction

    BigQuery: Chapter 1A - An Introduction

    Hello Everyone, so, let’s start discussion on BigQuery today, the platform, where it’s coming from, it’s architecture…

    4 条评论
  • Google BigQuery: Unlocking the power of Google Cloud Platform

    Google BigQuery: Unlocking the power of Google Cloud Platform

    More than an year ago, I started writing how marketers can utilize the power of Statistics in their workstream, and…

    4 条评论
  • Why Time Travel would be difficult?

    Why Time Travel would be difficult?

    I know that all of us have asked this question, not to anyone else, but to ourselves..

    2 条评论

社区洞察

其他会员也浏览了