Basic SQL For Data Analysis
Hello everybody. Hope you’ve been well. Today we’ll be looking at SQL, a language that’s very important to learn when you’re working with databases. As usual, I want to give you an easy to understand 101 of both SQL and databases. With a tiny bit of effort using SQL can be pretty fun and simple. Let’s get started.
Databases: What are they?
Before we start pulling all the fancy data with a new toy, let’s understand what a database is. A database is an organized collection of data. This can be anything from financial data, students in a university, players in an online video game etc. Databases are useful for pretty much every kind of company and institution and come in many varieties.
You may have seen things like Oracle, MySQL, PostgreSQL, etc when looking up databases or a tech blog. These are called relational database management systems. In short, software that allows us to create and manage databases. In this article I’ll be using the RDMS SQLite . It is pretty easy to set up.
SQL is the language that talks to relational databases in order to store, update, change, and delete data. As to show you how it works, I’ll be creating a database in SQLite with data I created related the video game.
Creating a Database and Importing File
The first thing we’ll do is create a database. We’ll just call it testDB.
And done. The first line tells SQLite I want to make a permanent database in a specific location called testDB. The second line is just to make sure that the file has been created. Simple non? But alas our database is empty. Let’s fix that now.
Like I mentioned earlier, I created a CSV file of data related to characters in SF5. Like with R, I can upload this file into a database in order to start working with it.
In the create table statement I created a table. A table is pretty much the same thing as a data frame in R. It holds a specific set of data for us to view manipulate etc. The lines following are the names of the columns, the datatype the column will be accepting, and Not Null meaning it will not accept blank fields. Each column name is separated by a comma and the statement ends with a semicolon.
The next two lines are to let SQLite know that a CSV file will be coming and the actually importing of the file. Finally the next 2 lines are just so the output of our SQL statements will be clean.
Working With Data: SQL Statements & Queries
We can finally start playing SQL now! Hooray! Most SQL queries look something like this:
All queries end in semicolons. Queries are made up of statements (the upper case words) that structure how you want the data returned. The four most common statements are:
Also for the those not familiar with SF5 here’s what some of the columns mean.
领英推荐
Now that those definitions are out of the way let’s start by figuring how many rows are in this table:
Our first query. Looking good except one problem. There’s 26 characters in this data set. Why do we have 27 rows? Let’s investigate.
Ah I see the problem. The column name “Name” was accidentally added in as a part of the data set. Let’s put delete to work.
There we go. Now let’s answer some basic questions.
Which characters are completely new to the Street Fighter Series?
SQL statements can give you all the data you need at once if you use the * after Select. This is convenient so that you don’t have type another statement in order to figure something out. But we got another problem. Alex is not new to Street Fighter but Necalli is. Update will come to our rescue.
Fixed. There are 7 characters new to the series so far. 2 characters have less than 1000 health, 5 have projectile attacks and 3 have command grabs. Good to know.
How about we do a bit of organizing? Let’s get a list of all the characters by from lowest to highest health. We can use a clause called ORDER BY to do that. a clause is a word that comes after FROM to return data under more specific commands.
What if you only want characters with a projectile and a command grab? Then there’s the handy AND clause for that. Basically if both parts of the statement is true, you’ll get the results.
Huh, just Laura. I guess that makes her unique. I hope you got the gist of it. In SQL think of the data that you want returned to you and create the query that’ll allow you to do that. With a bit of practice you can get the hang of this in no time. Speaking of practice, here’s a a few links to some SQL tutorials. Best part about them is that they’re free and you don’t have to download a database in order to practice! How neat is that?
Well that’s all for now everyone. With the resources listed you can get started on SQL in no time at all. I might do a article in the future that shows you how you can access a database through R! It ought to be fun. If there’s something you’d like me to clear up please let me know.
If you liked this article hit the recommend button.