登录查看更多内容

SQL Queries Quickstart: Part I

Jose Canciani

Principal Engineer at Avature | @josecanciani at ??

发布日期: 2024年4月25日

Welcome to my SQL Queries Quickstart! Designed for complete newbies or developers with some SQL knowledge, this article is your go-to for mastering the essentials of query building.

From Relational Databases to more complex concepts like JOINs and Performance Tuning, this guide will take you on the basics for you to be prolific in the art of writing queries.

Disclaimer: This is a Quickstart guide and by no means a comprehensive course. It includes the concepts that, based on my experience, I believe are essential to know before embarking on your query builder journey. After all, this is just an introductory guide, we can't cover everything! Nevertheless, whenever necessary, I'll be including useful links to resources on each specific topic.

Index

Initially I thought of creating an all-in-one guide, but the Article was just too long. So I divided the guide in four parts.

Part 1 (this article): Databases, Relational model, Tables and Columns, Data Types, Constraints, One table Query.
Part 2: Foreign Keys, Multi table Queries, Join types, Distinct.
Part 3: Subqueries, Unions, Grouping, Temporary Tables, Views.
Part 4: Performance optimizations (my favorite!)

Since this is the first part, you can find the rest in the replies of this article.

Now, let's get started!

Our playground

Throughout this article, we'll be illustrating SQL concepts and techniques using a sample application scenario. Our chosen example revolves around a user tracking system for educational courses. Let's call it Educa.

Our code style will include:

UPPERCASE for any SQL builtin keyword (like SELECT or FROM).
camelCase for identifiers (like table and column names).
Tables will be plural. At college I was told to use singular names, but I never liked it. I use aliases for table names when writing queries anyway, so I haven't found any useful reason to keep names in singular. For me, it makes more sense to use a plural (this is the users table).

Always remember that any code style and naming decision does not really matter. But it is important for any project to remain consistent with the rules you choose, to avoid confusion.

Along the way I will be adding some DB Fiddle links so you can review and play with the queries I'm showing.

Now, a -very- little bit of history.

About databases, SQL and the Relational Model

Most applications will need to store certain information and persist it over time. For example in Educa we want to store student information, track what courses they are on, store their evaluation results, and so on.

The first databases were implemented in files, for example using Comma Separated Values (CSVs). But as applications got more complex, databases evolved to become an application on its own.

The Relational Model was born on 1969 (what a year!) when Edgar F. Codd presented it to the world. It's purpose is to provide a declarative method for specifying data and queries on it. A year later SQL (Structured Query Language) was born at IBM, based on Codd's paper, as a way to standardize access to these type of databases.

We won't cover the theory of the Relational Model in this article, but we will explain some of it's concepts. If you are interested, here are some links:

An Oracle piece describing and showing benefits of a relational database
The Theory of Relational Databases, a 1983 paper in the rare case you need some maths to back this up.
The original Edgar F. Codd paper.

Tables

Relational databases store information in tables. Tables are a collection of rows (sometimes called records), each usually identified by a primary identifier or key.

Each row is composed by columns, which store the row properties.

Educa will need a list of all the courses available for the students. Let's check the courses table definition and run a basic query to get them all:

At the top we can see a DDL (Data Definition Language), a CREATE TABLE statement. These statement types are used to create and modify the structure of the database (not the data). We won't focus on DDLs in this article. But it is important to know some constraints and indexes columns may have, as they directly affect our queries. We'll review them in the next sections.

Now turn your focus to the second statement: we are executing our first query! For now, just look at the resulting table. You can see the rows (identified by an id) and the columns that represent the data stored for each one.

Each column has a datatype, and this depends on the database engine you are using. Our examples will use a MySQL-compatible database: that's why we have VARCHARs for strings or TINYINTs for booleans.

???? Play in Fiddle

Tip: we mentioned DDLs, the language to modify our database structures. Another important part of SQL are the DMLs (Data Modification Language): these statements are used to modify the data inside the tables. The most common ones are INSERT, UPDATE, DELETE and -less used- MERGE. They are outside the scope of this article, since we are focusing on SELECT queries only. I may write on them at some other point tough.

Now back to the definition.

Primary Keys

A Primary Key (aka PK) is a column that uniquely identifies a row inside the table. In our example, it is the id column .

Most PKs are usually defined as an integer field. In MySQL they are auto generated. Other DB engines require you to create a sequence and a trigger first (this is out of the scope in this article though, but you can check this Oracle-based example).

All our PKs will be auto generated. By default they are incremented by 1 each time a new row is inserted to the table.

Tip: If you are still tempted on using using user data as you PKs or maybe UUIDs, here's a good article on why you probably shouldn't.

Unique Keys

We also have a Unique Key (aka UK). Any UK could also work as a primary key, but as mentioned before, it's generally not recommended to use application data as primary ids.

For example, in the users table, we don't want two rows (two students) to have the same user name.

Tip: In most database engines, defining a PK or UK will create an implicit index. This is because the database needs the to be fast when checking the column values, for example during an insert operation.

NULLABLE option

The NOT NULL constraint is a very handy one that will require all rows to have a value in the column.

For example, in our courses table, we want all courses to have a name.

Having a required value also helps the database optimizer when finding the execution plan for our queries. We will see this in the Performance part, for now just try to avoid nullable columns whenever you can.

Indexes

Indexes are structures (usually a B-Tree) that order rows based on one or more column values. This order allows for faster retrieval of rows when the query is filtering or ordering by those values.

领英推荐

Comprehensive Guide To SQL (Structured Query Language)

Ijaz Khan 6 个月前

A not-so-good idea: Pipe Syntax In SQL

Franck Pachot 6 个月前

SQL Simplified: Key Concepts Every Beginner Must Know

Walter Shields 1 个月前

In our users table (apart from the PK and UK implicit indexes) we have two more indexes that will be used by our application when filtering rows:

by userName, for example when the user logs in
by name, when ordering results.

Tip: the fullName index is a composite index: it indexes by two or more columns. More on this on the Performance part.

Indexes and performance is a very long topic, and will be cover a bit more in part IV, but if you want to learn more, here are some useful references:

A short guide on selecting a good Index strategy by Oracle.
A list of different index types you can find in a relational database.

Our first Query

Ok, we have learn what a table is, and went through their most important properties. Now let's put this to the test by creating our first query.

Educa will have students. Let's review the users table and create a query to list them. Our first objective is bringing the full names for all non-disabled users, order by their full name. Here's our code:

???? Play in Fiddle

A SELECT query consists in at least 1 column to retrieve, and a table to read from. So both SELECT and FROM are required.

Then, there are other parts to filter (WHERE and HAVING), sort (ORDER BY), group (GROUP BY) and limit the rows.

Tip: the limit clause is not consistent with all database vendor. This is probably due to pagination being such a hard problem to get it right, performance wise. We will see recommendations in the Performance part. Here's a good article with the different options, we will be using MySQL's LIMIT clause when needed.

Let's dissect this query.

Functions

To get a student full name, we are using the CONCAT function. This function joins the last and first name, separating them with a comma.

Functions tend to differ depending on the database engine you use. The good news is that all engines have more or less the same functionality, so it's not hard to jump from one to another.

For example, here you can find information for Mysql functions. Manuals tend to categorize functions depending on the type of column they are working with: number, string, dates, etc.

Tip: if you intend to allow your code to work with multiple databases engines, then functions, along with the LIMIT clause as explained before, are probably the most important pieces to abstract out.

Filters

To filter rows from the result set, we use the WHERE clause. In the example we are requiring all rows to have the enabled column with a true value.

Filters are defined as a set of boolean conditions, so you can add as many as you want. You can use AND and OR keywords and make them as complex as you want. Use parenthesis to group conditions if your logic gets too complicated.

In this snippet we included some known builtin users, even if they are not allowed to login. Go to the Fiddle and test it out yourself!

Tip: Notice that MySQL doesn't have native boolean support, so we use an integer instead.

Apart from the usual operators like =, !=, >, <, >=, <=, SQL has some others for specific datatypes.

For example, to filter a text column, you can use a the LIKE operator along with special characters % (zero or more characters) and _ (one character). For example, to view all users whose last name starts with a C, you'll do this:

All users whose starting letter in their last name is 'C'.

Tip: Remember to check out the table and/or column collation and character settings for character-based columns. For example, if you intent to store user-generated texts, I recommend a case insensitive Unicode character set like utf8mb4 in MySQL.

NULL comparison will be discussed bellow.

IN/NOT IN allows to filter a list of values. Together with EXISTS, they allow to use subqueries for filtering. We'll see them in Part III.

Sorting

When you need to retrieve more than one row, you will usually want to sort them by some criteria. We use the ORDER BY clause for this purpose.

In our example, we are sorting by lastName. But since we may have a lot of repeated last names, we also added a secondary sorting criteria using the firstName column.

Tip: always add the primary key at the end of your sorting, when your sorting criteria is not unique. This will help you disambiguate the order of the rows, and will avoid sporadic test failures. Why? A database engine does not guarantee any order without a unique criteria, so rows could move between different query execution.

About NULL

NULL represents a missing or unknown value in a column. It's not the same as zero or an empty string; rather, it indicates the absence of any data in a particular field.

If you want to match NULL values, you can use the special IS [NOT] NULL operator.

Tip: most functions operating over a NULL value, will always return NULL. Exceptions are functions that expect NULLs, like IFNULL(). Errors such as division by zero might return NULL on certain database engines, while others might generate an error. To understand the behavior in these edge cases, check your database settings and properly document what your app needs.

We will see examples of NULL filters in the LEFT OUTER JOIN section in part II.

Before continuing

In Part II, we'll delve into multi-table queries, and they will become a bit more intricate. Therefore, I highly recommend experimenting in the Fiddle playgrounds with what we've learned thus far beforehand.

Get accustomed to querying the tables, adding orders, and using functions to transform the results. Here are some ideas to get started:

Incorporate the username into the CONCAT function, enclosed in parentheses.
Format the course date to resemble "February 1st, 2024". Then, explore how to localize it into another language.
Sort courses by start date in Descending order (hint: you'll need the DESC keyword after the column name).

Once you feel comfortable writing queries, I'll see you in Part II. Keep an eye out for it in the replies, or stay tuned to my Articles section on ??.

#sql #select #queries #database #dba #developer #DataAnalysis #tutorial #DatabaseDevelopment #SQLServer #MySQL #PostgreSQL #Oracle #SQLite

Jose Canciani

Principal Engineer at Avature | @josecanciani at ??

10 个月

Part III is online!! Check my recent post to suggest topics on the last part (Performance!). In the mean time, here's the last one. https://www.dhirubhai.net/pulse/sql-queries-quickstart-part-iii-jose-canciani-bmhhe/

Jose Canciani

Principal Engineer at Avature | @josecanciani at ??

10 个月

Part II is live! Check it out here at LinkedIn https://www.dhirubhai.net/pulse/sql-queries-quickstart-part-ii-jose-canciani-lonif/ or on ?? where I'm posting first and more often: https://x.com/josecanciani/status/1785332082622734619

Rodrigo Martin

Principal Software Engineer (Test Architect)

11 个月

Betina Shim

1 次回应

Marcelo Vasquez

CEO en Blimop | Executive MBA en IAE Business School | Mentor Minka Instituto Inclusivo de Negocios | Profesor en carrera Lic. Negocios Digitales - Univ. Austral

11 个月

congratulation José !! nice post..

1 次回应

查看更多评论

要查看或添加评论，请登录

Jose Canciani的更多文章

Can You Fool a Self-Driving Car? About Rober's video and reactions

2025年3月16日

Can You Fool a Self-Driving Car? About Rober's video and reactions

This article was inspired by @MarkRober's ecent video, "Can You Fool a Self-Driving Car?", posted on March 15, 2025…
Asus ProArt 5K Display First Impressions!

2025年3月11日

Asus ProArt 5K Display First Impressions!

OK, I’ve had this monitor for two days. It’s the PA27JCV model, and I think I can start giving my first impressions.
Privacy Unlocked: Apple vs UK Government

2025年2月26日

Privacy Unlocked: Apple vs UK Government

Apple recently dropped Advanced Data Protection (ADP) in the UK, and it’s a move that hits hard for user privacy. This…
Beyond Mocking: Towards Robust and Realistic Testing

2024年11月14日

Beyond Mocking: Towards Robust and Realistic Testing

Mocking, a popular technique in unit testing, often fails to achieve its intended goals. Instead of helping to provide…
Is Simplicity making a comeback? Pieter Levels on Lex Fridman’s Podcast

2024年8月27日

Is Simplicity making a comeback? Pieter Levels on Lex Fridman’s Podcast

In a recent conversation that has caught the attention of the tech community, Pieter Levels, known on ?? as @levelsio…
SQL Queries Quickstart: Part III

2024年5月8日

SQL Queries Quickstart: Part III

Welcome back, again! This is part III of my SQL Queries Quickstart Guide. If this is your first time here, please…
SQL Queries Quickstart: Part II

2024年4月30日

SQL Queries Quickstart: Part II

Welcome back. This is part II of my SQL Queries Quickstart Guide.
Remote Coding with VSCode & Tunnels

2024年4月16日

Remote Coding with VSCode & Tunnels

I code in #VSCode in a M1 Mac. Unfortunately the main app I develop does not run on ARM.
Programming with Early Returns

2024年4月10日

Programming with Early Returns

Writing code can be messy. But there are things you can do to make it more enjoyable.
Avature en ECI 2016: Reports reports reports

2016年7月7日

Avature en ECI 2016: Reports reports reports

Volvemos a estar presentes en ECI este a?o! Estaré hablando de como evolucionó nuestro sistema de reportes, insertados…

See all articles

SQL Queries Quickstart: Part I

Jose Canciani

Principal Engineer at Avature | @josecanciani at ??

Index

Our playground

About databases, SQL and the Relational Model

Tables

Primary Keys

Unique Keys

NULLABLE option

Indexes

领英推荐

Our first Query

Functions

Filters

Sorting

About NULL

Before continuing

Jose Canciani的更多文章

社区洞察

其他会员也浏览了

Intro to SQL

Optimize Like a Pro: SQL Query Techniques for Faster Results

SQL Server Interview Questions

SQL Essentials: Breaking Down Key Concepts for Beginners

Basic SQL Statements and Their Syntax

Mastering SQL Fundamentals (Detailed)

7 Best SQL Assignment Help Websites

Getting Started with SQL: Setting Up Your Development Environment

Exploring SQL: Definition, Benefits, and Practical Applications

SQL SELECT ... FROM ...: A Beginner's Guide

Index

Our playground

About databases, SQL and the Relational Model

Tables

Primary Keys

Unique Keys

NULLABLE option

Indexes

领英推荐

Our first Query

Functions

Filters

Sorting

About NULL

Before continuing

Jose Canciani的更多文章

Can You Fool a Self-Driving Car? About Rober's video and reactions

Asus ProArt 5K Display First Impressions!

Privacy Unlocked: Apple vs UK Government

Beyond Mocking: Towards Robust and Realistic Testing

Is Simplicity making a comeback? Pieter Levels on Lex Fridman’s Podcast