登录查看更多内容

SQL Tips and Tricks

Guilherme Lisb?a

Senior Data Engineer | Google Cloud Certified | Python | GCP | Bigquery | Dataflow

发布日期: 2023年3月28日

+ 关注

Here we'll propose a advantage use cases to improve your query performance and make your query less complex.

Common Table Expressions ( CTEs )

CTEs allow us to break down complex queries into smaller ones, we can reuse those query in multiples times within a single query.

We can stop use concatenate queries instead CTEs

SELECT *

FROM

(

 SELECT *

 FROM users

 WHERE sex = 'F'

) men

INNER JOIN

(

 SELECT *

 FROM films

 WHERE total>100

) catalog

ON men.id = catalog.user_id*

We can replace the query above to

WITH user AS 
 SELECT *
 FROM users
 WHERE sex = 'M'
),
catalog AS (
 SELECT *
 FROM films
 WHERE total>100
)
SELECT *
FROM user
INNER JOIN catalog ON user.id = catalog.user_id

Window Functions ( WF )

Window functions are a useful technique that allows for executing analytical functions and aggregations on specific data windows. With window functions, we can calculate metrics such as moving averages or growth rates in a time series. Additionally, window functions also allow for specifying the sorting order of data in a query and defining the window based on values in other columns. This makes it easier to analyze data and gain valuable insights.

WF allows you to make calculations across rows without needing self-joins, that is a good alternative to avoid self-join and multiples querys.

SELECT customer_id, amount,
 SUM(amount) 
  OVER (
        PARTITION BY customer_id ORDER BY films_date
  ) AS amount_collected
 FROM films

In the above example we some all amount e put the result in a column in that query we don't need to make a group.

Pivot Tables

That approach let us to turno rows into columns, it's help us in some situations

Imagine you have that table:

领英推荐

How to Find the Minimum Value in SQL (with Examples)

StrataScratch 1 个月前

LeetCode Curated SQL Solutions and Discussion - Week 2

Ankur Ranjan 3 年前

Understanding Data Types in C++: A Beginner's…

Reza Shahin, Ph.D. 3 个月前

+--------+----------+-------
| region |   year   | sales |
+--------+----------+-------+
| North  | 2010     | 100   |
| North  | 2011     | 200   |
| South  | 2010     | 150   |
| South  | 2011     | 75    |
| East   | 2010     | 50    |
+--------+----------+-------++

And we can make a analises to know how the sales how in each year, we can pivot that table into

+--------+----------+----------
| region | 2010     | 2011     |
+--------+----------+----------+
| East   | 50       | NULL     |
| North  | 100      | 200      |
| South  | 150      | 75       |
+--------+----------+----------++

A example of that query could be

WITH pivot_data AS 
 SELECT region, year, sales
 FROM sales_data
)
SELECT *
FROM pivot_data
PIVOT (SUM(sales) FOR year IN ([2010], [2011])) AS pivot_table(

Wild Table

Wild Table is a powerful technique that allows for querying multiple tables without the need to join or merge them. With this technique, we can access multiple tables without creating a temporary table to store the results. Instead, Wild Table uses wildcard syntax, which allows BigQuery to filter tables based on name patterns. For example, if we have several sales tables with names like "sales_2018", "sales_2019", "sales_2020", we can query all of these tables at once with a single query by using Wild Table and specifying the pattern "sales_*".

Federated Tables

Federated tables are an advanced SQL technique that allows for querying data from external sources such as Google Cloud Storage or Google Drive without the need to import the data into BigQuery. With federated tables, we can access data from external sources and incorporate them into our SQL queries. This can be useful for accessing historical data or data that is not stored in BigQuery. By using federated tables, we can save time and resources on data transfer and import.

Materialized Views

Materialized views are an advanced SQL technique that allows for precomputing and storing the results of a query. With materialized views, we can define a query that generates a result set, which can then be stored as a table. This can be useful for optimizing query performance by avoiding the need to repeatedly execute a complex query.

Unnest + Generate_array:

Unnest and Generate_array are advanced SQL techniques that allow for generating arrays and unnesting them into individual rows. With these techniques, we can transform data in a way that makes it easier to analyze and gain valuable insights. This can be useful for working with complex data structures such as arrays or JSON objects.

Those are some Technics that will help you to improve your queries.

要查看或添加评论，请登录

Guilherme Lisb?a的更多文章

Dataform

2023年9月29日

Dataform

Dataform é uma ferramenta de gerenciamento e transforma??o de dados que permite aos analistas construir e manter…
How to Use Advanced Techniques in BigQuery to Optimize Your Queries

2023年4月4日

How to Use Advanced Techniques in BigQuery to Optimize Your Queries

BigQuery is a data analytics tool from Google that allows users to perform complex queries on large datasets with ease.…
ETL x ELT

2023年2月15日

ETL x ELT

When we think in how to bring data from data sources and databases into a lakehouse we can use ETL or ELT. Extract :…
RDD vs DataFrame vs DataSet

2022年4月6日

RDD vs DataFrame vs DataSet

Talking about Spark there are some things we need to know. The first is about RDD, DataFrame, and DataSet.
Ducks Pattern

2021年6月20日

Ducks Pattern

The Ducks Pattern is a way to organize the Redux structure and the Redux is a library of state control based on Flux…
Root import

2021年1月26日

Root import

The root import is a tool that helps us to easily import a module from anywhere that is. React Js First of all, you…
Por que usar Typescript ?

2020年12月10日

Por que usar Typescript ?

Essa é a pergunta que eu sempre escuto sempre que estou sugerindo que seja realizado usando Typescript. Por que usar…
How to use Redux with AsyncStorage

2020年12月4日

How to use Redux with AsyncStorage

First of all, may you’re wondering why should you or anyone use the Redux with AsyncStorage, because is easier to use…

See all articles

SQL Tips and Tricks

Guilherme Lisb?a

Senior Data Engineer | Google Cloud Certified | Python | GCP | Bigquery | Dataflow

Common Table Expressions ( CTEs )

Window Functions ( WF )

Pivot Tables

领英推荐

Wild Table

Federated Tables

Materialized Views

Unnest + Generate_array:

Guilherme Lisb?a的更多文章

社区洞察

其他会员也浏览了

Understanding Window Functions in SQL

Clustered vs Non-Clustered Indexes

Data Analytics: The Art of Writing Efficient SQL Queries

VIEW vs TABLE in SQL | Belayet Hossain

The Union Join (SQL)

A Guide to COALESCE() and ISNULL() Functions with Hands-On Examples

Comparing SQL Subqueries and Window Functions: Differences, Use Cases, and Performance Considerations

SQL JOINS (Inner, Left, Right and Full Joins)

Null-Duplicate=Joins

Subqueries and joining tables - your indispensable SQL tools

Common Table Expressions ( CTEs )

Window Functions ( WF )

Pivot Tables

领英推荐

Wild Table

Federated Tables

Materialized Views

Unnest + Generate_array:

Guilherme Lisb?a的更多文章

Dataform

How to Use Advanced Techniques in BigQuery to Optimize Your Queries

ETL x ELT

RDD vs DataFrame vs DataSet

Ducks Pattern

Root import

Por que usar Typescript ?

How to use Redux with AsyncStorage

社区洞察

其他会员也浏览了

Understanding Window Functions in SQL

Clustered vs Non-Clustered Indexes

Data Analytics: The Art of Writing Efficient SQL Queries

VIEW vs TABLE in SQL | Belayet Hossain

The Union Join (SQL)

A Guide to COALESCE() and ISNULL() Functions with Hands-On Examples

Comparing SQL Subqueries and Window Functions: Differences, Use Cases, and Performance Considerations

SQL JOINS (Inner, Left, Right and Full Joins)

Null-Duplicate=Joins

Subqueries and joining tables - your indispensable SQL tools