×
By the end of this chapter, you should be able to:
ORDER BY
to categorize resultsGROUP BY
to aggregate data into sub groupsCASE
statements for custom conditional outputLet's start with the following SQL in psql
:
DROP DATABASE IF EXISTS sports; -- in case you copy this whole set of commands multple times CREATE DATABASE sports;
\c sports
to connect to newly-created sports database.
Now to seed the data:
CREATE TABLE players (id SERIAL PRIMARY KEY, name TEXT, sport TEXT, team TEXT, jersey_number INTEGER, is_rookie BOOLEAN); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Martin', 'hockey', 'devils', 12, false); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('David', 'baseball', 'mets', 2, true); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('David', 'soccer', 'galaxy', 17, false); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Elie', 'baseball', 'cobras', 8, false); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Lisa', 'basketball', 'sparks', 44, false); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Seabass', 'frisbee', 'jumbos', 44, false); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Sue', 'basketball', 'lynx', 54, false); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Candace', 'sparks', 'cobras', 49, false); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Swin', 'basketball', 'shock', 1, true); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Gilbert', 'basketball', 'warriors', 0, false); INSERT INTO players (name, sport, team, jersey_number, is_rookie) VALUES ('Maria', 'tennis', 'none', 24, false);
The building block for these operators is the WHERE
clause which comes after any operation like SELECT
, UPDATE
or DELETE
SELECT * FROM players WHERE sport = 'basketball'; /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+----------+---------------+----------- 5 | Lisa | basketball | sparks | 44 | f 7 | Sue | basketball | lynx | 54 | f 9 | Swin | basketball | shock | 1 | t 10 | Gilbert | basketball | warriors | 0 | f */
To find multiple records we can search for multiple terms using IN
SELECT * FROM players WHERE jersey_number IN (0,1); /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+----------+---------------+----------- 9 | Swin | basketball | shock | 1 | t 10 | Gilbert | basketball | warriors | 0 | f */
We can do the exact inverse of IN
using NOT IN
SELECT * FROM players WHERE jersey_number IN (0,1); /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+--------+---------------+----------- 1 | Martin | hockey | devils | 12 | f 2 | David | baseball | mets | 2 | t 3 | David | soccer | galaxy | 17 | f 4 | Elie | baseball | cobras | 8 | f 5 | Lisa | basketball | sparks | 44 | f 6 | Seabass | frisbee | jumbos | 44 | f 7 | Sue | basketball | lynx | 54 | f 8 | Candace | sparks | cobras | 49 | f 11 | Maria | tennis | none | 24 | f */
To search for a field in a range, we can use BETWEEN x AND y
.
SELECT * FROM players WHERE jersey_number BETWEEN 0 AND 25; /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+----------+---------------+----------- 1 | Martin | hockey | devils | 12 | f 2 | David | baseball | mets | 2 | t 3 | David | soccer | galaxy | 17 | f 4 | Elie | baseball | cobras | 8 | f 9 | Swin | basketball | shock | 1 | t 10 | Gilbert | basketball | warriors | 0 | f 11 | Maria | tennis | none | 24 | f */
SQL supports all kinds of arithmetic operators like <
, <=
, >=
, >
and !=
.
SELECT * FROM players WHERE jersey_number > 25; /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+--------+---------------+----------- 5 | Lisa | basketball | sparks | 44 | f 6 | Seabass | frisbee | jumbos | 44 | f 7 | Sue | basketball | lynx | 54 | f 8 | Candace | sparks | cobras | 49 | f */
To check that multiple conditions are satisfied we can use the AND
command.
SELECT * FROM players WHERE jersey_number > 25 and id < 6; /* id | name | sport | team | jersey_number | is_rookie ----+------+------------+--------+---------------+----------- 5 | Lisa | basketball | sparks | 44 | f */
To check that at least one condition is satisfied we can use the OR
command.
SELECT * FROM players WHERE jersey_number > 25 or id < 6; /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+--------+---------------+----------- 1 | Martin | hockey | devils | 12 | f 2 | David | baseball | mets | 2 | t 3 | David | soccer | galaxy | 17 | f 4 | Elie | baseball | cobras | 8 | f 5 | Lisa | basketball | sparks | 44 | f 6 | Seabass | frisbee | jumbos | 44 | f 7 | Sue | basketball | lynx | 54 | f 8 | Candace | sparks | cobras | 49 | f */
To search for a term we can use the LIKE
command. The %
denotes any possible character. LIKE
is case sensitive.
--find all players whose name starts with a capital S-- SELECT * FROM players WHERE name LIKE 'S%'; /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+--------+---------------+----------- 6 | Seabass | frisbee | jumbos | 44 | f 7 | Sue | basketball | lynx | 54 | f 9 | Swin | basketball | shock | 1 | t */
ILIKE
is similar to the LIKE
command, but it is case insensitive.
SELECT * FROM players WHERE name ILIKE 's%'; /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+--------+---------------+----------- 6 | Seabass | frisbee | jumbos | 44 | f 7 | Sue | basketball | lynx | 54 | f 9 | Swin | basketball | shock | 1 | t */
If we want to order results in ascending or descending order we use the ORDER BY ASC_or_DESC
command.
SELECT * FROM players ORDER BY jersey_number DESC; /* id | name | sport | team | jersey_number | is_rookie ----+---------+------------+----------+---------------+----------- 7 | Sue | basketball | lynx | 54 | f 8 | Candace | sparks | cobras | 49 | f 5 | Lisa | basketball | sparks | 44 | f 6 | Seabass | frisbee | jumbos | 44 | f 11 | Maria | tennis | none | 24 | f 3 | David | soccer | galaxy | 17 | f 1 | Martin | hockey | devils | 12 | f 4 | Elie | baseball | cobras | 8 | f 2 | David | baseball | mets | 2 | t 9 | Swin | basketball | shock | 1 | t 10 | Gilbert | basketball | warriors | 0 | f */
Commonly in SQL, we will want to output a certain data type for an operation, to convert one data type to another we can use the CAST
command or use ::data_type
SELECT round((SUM(id) / COUNT(jersey_number))::numeric, 2)::float
Very commonly, we will want to take multiple values in a table and group them into sub-categories or a single category based on an aggregate function. Let's look at a few aggregate functions, but first - some sample data:
CREATE TABLE sales (id SERIAL PRIMARY KEY, product TEXT, customer_name TEXT, price REAL, quantity SMALLINT); INSERT INTO sales (product, customer_name, price, quantity) VALUES ('Chair', 'Elie', 99.99, 1); INSERT INTO sales (product, customer_name, price, quantity) VALUES ('Table', 'Tim', 250.00, 1); INSERT INTO sales (product, customer_name, price, quantity) VALUES ('Chair', 'Matt', 49.99, 3); INSERT INTO sales (product, customer_name, price, quantity) VALUES ('Table', 'Janey', 1000.00, 2); INSERT INTO sales (product, customer_name, price, quantity) VALUES ('Chair', 'Janey', 300.00, 2); INSERT INTO sales (product, customer_name, price, quantity) VALUES ('Table', 'Tim', 2200.00, 2); INSERT INTO sales (product, customer_name, price, quantity) VALUES ('Bookshelf', 'Elie', 1200.00, 2);
You can read more about these data types here.
Now that we have some sample data, let's examine a few common aggregate functions which collect multiple pieces of data and return a single value.
To count the number of occurances we use the COUNT
function.
SELECT COUNT(*) FROM sales; SELECT COUNT(*) FROM sales WHERE product = 'Chair';
To figure out the sum we can use the SUM
function and even round numbers using the ROUND
function as well.
SELECT SUM(price) FROM sales; SELECT ROUND(SUM(price)) FROM sales;
To find the minimum value in a data set we use the MIN
function.
SELECT MIN(price) FROM sales;
To find the maximum value in a data set we use the MAX
function.
SELECT MAX(price) FROM sales;
To find the average value in a data set we use the AVG
function. We can attach the AS
command to alias the column name.
SELECT AVG(price) AS max_count FROM sales;
Now that we have seen a couple aggregate functions, lets take some information.
SELECT product, COUNT(product) FROM sales GROUP BY product;
When using a GROUP BY
clause, we can not attach a WHERE
if we want to be more selective. Instead we use the HAVING
keyword to place condition on our GROUP BY
command.
SELECT product, COUNT(product) FROM sales GROUP BY product HAVING COUNT(product) > 2;
If we only want to find unique values in a column, we can use DISTINCT
, we can also do this for pairs of columns separated by a comma.
SELECT DISTINCT customer_name FROM sales;
In SQL, we can use conditional logic to query our data and display custom results based off of the condition.
SELECT product, price, CASE WHEN price < 50 THEN 'inexpensive' WHEN price > 50 AND price < 100 THEN 'reasonable' WHEN price < 50 AND price < 400 THEN 'expensive' ELSE 'very expensive' END AS how_expensive FROM sales;
When you're ready, move on to SQL Introduction Exercises