Introduction to SQL
1. Selecting columns
</> Onboarding | Tables
From looking at the tabs, who is the first person listed in the people table?
- Kanye West
- Biggie Smalls
- 50 Cent
- Jay Z
</> Onboarding | Query Result
Run this query in the editor and check out the resulting table in the query result tab!
SELECT name FROM people;
name
50 Cent
A. Michael Baldwin
A. Raven Cruz
...
Who is the second person listed in the query result?
- Kanye West
- A. Michael Baldwin
- 50 Cent
- Jay Z
</> Onboarding | Errors
Submit the code to the right, check out the errors, then fix them!
SELECT
'DataCamp <3 SQL'
AS result;
result
DataCamp <3 SQL
</> Onboarding | Bullet Exercises
Submit the query in the editor! Don’t worry, you’ll learn how it works soon.
SELECT 'SQL'
AS result;
result
SQL
Now change ‘SQL’ to ‘SQL is’ and click Submit!
SELECT 'SQL is'
AS result;
result
SQL is
Finally, change ‘SQL is’ to ‘SQL is cool!’ and click Submit!
SELECT 'SQL is cool!'
AS result;
result
SQL is cool!
</> Beginning your SQL journey
The table of employees might look something like this:
| id | name | age | nationality |
|---|---|---|---|
| 1 | Jessica | 22 | Ireland |
| 2 | Gabriel | 48 | France |
| 3 | Laura | 36 | USA |
How many fields does the employees table above contain?
- 1
- 2
- 3
- 4
</> SELECTing single columns
Select the title column from the films table.
SELECT title
FROM films;
title
Intolerance: Love's Struggle Throughout the Ages
Over the Hill to the Poorhouse
The Big Parade
...
Select the release_year column from the films table.
SELECT release_year
FROM films;
release_year
1916
1920
1925
...
Select the name of each person in the people table.
SELECT name
FROM people;
name
50 Cent
A. Michael Baldwin
A. Raven Cruz
...
</> SELECTing multiple columns
Get the title of every film from the films table.
SELECT title
FROM films;
title
Intolerance: Love's Struggle Throughout the Ages
Over the Hill to the Poorhouse
The Big Parade
...
Get the title and release year for every film.
SELECT title, release_year
FROM films;
title release_year
Intolerance: Love's Struggle Throughout the Ages 1916
Over the Hill to the Poorhouse 1920
The Big Parade 1925
...
Get the title, release year and country for every film.
SELECT title, release_year, country
FROM films;
title release_year country
Intolerance: Love's Struggle Throughout the Ages 1916 USA
Over the Hill to the Poorhouse 1920 USA
The Big Parade 1925 USA
...
Get all columns from the films table.
SELECT *
FROM films;
id title release_year country duration language certification gross budget
1 Intolerance: Love's Struggle Throughout the Ages 1916 USA 123 null Not Rated null 385907
2 Over the Hill to the Poorhouse 1920 USA 110 null null 3000000 100000
3 The Big Parade 1925 USA 151 null Not Rated null 245000
...
</> SELECT DISTINCT
Get all the unique countries represented in the films table.
SELECT DISTINCT country
FROM films;
country
null
Thailand
Cambodia
...
Get all the different film certifications from the films table.
SELECT DISTINCT certification
FROM films;
certification
null
Not Rated
X
...
Get the different types of film roles from the roles table.
SELECT DISTINCT role
FROM roles;
role
director
actor
</> Learning to COUNT
How many records are contained in the reviews table?
- 9,468
- 8,397
- 4,968
- 9,837
- 9,864
SELECT COUNT(*)
FROM reviews;
count
4968
</> Practice with COUNT
Count the number of rows in the people table.
SELECT COUNT(*)
FROM people;
count
8397
Count the number of (non-missing) birth dates in the people table.
SELECT COUNT(birthdate)
FROM people;
count
6152
Count the number of unique birth dates in the people table.
SELECT COUNT(DISTINCT birthdate)
FROM people;
count
5398
Count the number of unique languages in the films table.
SELECT COUNT(DISTINCT language)
FROM films;
count
47
Count the number of unique countries in the films table.
SELECT COUNT(DISTINCT country)
FROM films;
count
64
2. Filtering rows
</> Filtering results
What does the following query return?
SELECT title
FROM films
WHERE release_year > 2000;
- Films released before the year 2000
- Films released after the year 2000
- Films released after the year 2001
- Films released in 2000
</> Simple filtering of numeric values
Get all details for all films released in 2016.
SELECT *
FROM films
WHERE release_year = 2016;
id title release_year country duration language certification gross budget
4821 10 Cloverfield Lane 2016 USA 104 English PG-13 71897215 15000000
4822 13 Hours 2016 USA 144 English R 52822418 50000000
4823 A Beginner's Guide to Snuff 2016 USA 87 English null null null
...
Get the number of films released before 2000.
SELECT COUNT(title)
FROM films
WHERE release_year < 2000;
count
1337
Get the title and release year of films released after 2000.
SELECT title, release_year
FROM films
WHERE release_year > 2000;
title release_year
15 Minutes 2001
3000 Miles to Graceland 2001
A Beautiful Mind 2001
</> Simple filtering of text
Get all details for all French language films.
SELECT *
FROM films
WHERE language = 'French';
id title release_year country duration language certification gross budget
108 Une Femme Mariée 1964 France 94 French null null 120000
111 Pierrot le Fou 1965 France 110 French Not Rated null 300000
140 Mississippi Mermaid 1969 France 123 French R 26893 1600000
...
Get the name and birth date of the person born on November 11th, 1974. Remember to use ISO date format (‘1974-11-11’)!
SELECT name, birthdate
FROM people
WHERE birthdate = '1974-11-11';
name birthdate
Leonardo DiCaprio 1974-11-11
Get the number of Hindi language films.
SELECT COUNT(*)
FROM films
WHERE language = 'Hindi';
count
28
Get all details for all films with an R certification.
SELECT *
FROM films
WHERE certification = 'R';
id title release_year country duration language certification gross budget
76 Psycho 1960 USA 108 English R 32000000 806947
99 A Fistful of Dollars 1964 Italy 99 Italian R 3500000 200000
134 Rosemary's Baby 1968 USA 136 English R null 2300000
...
</> WHERE AND
Get the title and release year for all Spanish language films released before 2000.
SELECT title, release_year
FROM films
WHERE language = 'Spanish'
AND release_year < 2000;
title release_year
El Mariachi 1992
La otra conquista 1998
Tango 1998
Get all details for Spanish language films released after 2000.
SELECT *
FROM films
WHERE language = 'Spanish'
AND release_year > 2000;
id title release_year country duration language certification gross budget
1695 Y Tu Mamá También 2001 Mexico 106 Spanish R 13622333 2000000
1757 El crimen del padre Amaro 2002 Mexico 118 Spanish R 5709616 1800000
1807 Mondays in the Sun 2002 Spain 113 Spanish R 146402 4000000
...
Get all details for Spanish language films released after 2000, but before 2010.
SELECT *
FROM films
WHERE language = 'Spanish'
AND release_year > 2000
AND release_year < 2010;
id title release_year country duration language certification gross budget
1695 Y Tu Mamá También 2001 Mexico 106 Spanish R 13622333 2000000
1757 El crimen del padre Amaro 2002 Mexico 118 Spanish R 5709616 1800000
1807 Mondays in the Sun 2002 Spain 113 Spanish R 146402 4000000
...
</> WHERE AND OR
What does the OR operator do?
- Display only rows that meet at least one of the specified conditions.
- Display only rows that meet all of the specified conditions.
- Display only rows that meet none of the specified conditions.
Get the title and release year for films released in the 90s.
SELECT title, release_year
FROM films
WHERE release_year >= 1990
AND release_year < 2000;
title release_year
Arachnophobia 1990
Back to the Future Part III 1990
Child's Play 2 1990
...
Now, build on your query to filter the records to only include French or Spanish language films.
SELECT title, release_year
FROM films
WHERE (release_year >= 1990 AND release_year < 2000)
AND (language = 'French' OR language = 'Spanish');
title release_year
El Mariachi 1992
Les visiteurs 1993
The Horseman on the Roof 1995
...
Finally, restrict the query to only return films that took in more than $2M gross.
SELECT *
FROM films
WHERE (release_year >= 1990 AND release_year < 2000)
AND (language = 'French' OR language = 'Spanish')
AND gross > 2000000;
id title release_year country duration language certification gross budget
622 El Mariachi 1992 USA 81 Spanish R 2040920 7000
1149 The Red Violin 1998 Canada 130 French R 9473382 10000000
</> BETWEEN
What does the BETWEEN keyword do?
- Filter numeric values
- Filter text values
- Filter values in a specified list
- Filter values in a specified range
Get the title and release year of all films released between 1990 and 2000 (inclusive).
SELECT title, release_year
FROM films
WHERE release_year BETWEEN 1990 AND 2000;
title release_year
Arachnophobia 1990
Back to the Future Part III 1990
Child's Play 2 1990
...
Now, build on your previous query to select only films that have budgets over $100 million.
SELECT title, release_year
FROM films
WHERE (release_year BETWEEN 1990 AND 2000)
AND budget > 100000000;
title release_year
Terminator 2: Judgment Day 1991
True Lies 1994
Waterworld 1995
...
Now restrict the query to only return Spanish language films.
SELECT title, release_year
FROM films
WHERE (release_year BETWEEN 1990 AND 2000)
AND budget > 100000000
AND language = 'Spanish';
title release_year
Tango 1998
Finally, modify to your previous query to include all Spanish language or French language films with the same criteria as before. Don’t forget your parentheses!
SELECT title, release_year
FROM films
WHERE (release_year BETWEEN 1990 AND 2000)
AND budget > 100000000
AND (language = 'Spanish' OR language = 'French');
title release_year
Les couloirs du temps: Les visiteurs II 1998
Tango 1998
</> WHERE IN
Get the title and release year of all films released in 1990 or 2000 that were longer than two hours. Remember, duration is in minutes!
SELECT title, release_year
FROM films
WHERE release_year IN (1990, 2000)
AND duration > 120;
title release_year
Dances with Wolves 1990
Die Hard 2 1990
Ghost 1990
...
Get the title and language of all films which were in English, Spanish, or French.
SELECT title, language
FROM films
WHERE language IN ('English', 'Spanish', 'French');
title language
The Broadway Melody English
Hell's Angels English
A Farewell to Arms English
...
Get the title and certification of all films with an NC-17 or R certification.
SELECT title, certification
FROM films
WHERE certification IN ('NC-17', 'R');
title certification
Psycho R
A Fistful of Dollars R
Rosemary's Baby R
...
</> Introduction to NULL and IS NULL
What does NULL represent?
- A corrupt entry
- A missing value
- An empty string
- An invalid value
</> NULL and IS NULL
Get the names of people who are still alive, i.e. whose death date is missing.
SELECT name
FROM people
WHERE deathdate IS NULL;
name
50 Cent
A. Michael Baldwin
A. Raven Cruz
Get the title of every film which doesn’t have a budget associated with it.
SELECT title
FROM films
WHERE budget IS NULL;
title
Pandora's Box
The Prisoner of Zenda
The Blue Bird
Get the number of films which don’t have a language associated with them.
SELECT COUNT(title)
FROM films
WHERE language is NULL;
count
11
</> LIKE and NOT LIKE
Get the names of all people whose names begin with ‘B’. The pattern you need is ‘B%’.
SELECT name
FROM people
WHERE name LIKE 'B%';
name
B.J. Novak
Babak Najafi
Babar Ahmed
Get the names of people whose names have ‘r’ as the second letter. The pattern you need is ‘_r%’.
SELECT name
FROM people
WHERE name LIKE '_r%';
name
Ara Celi
Aramis Knight
Arben Bajraktaraj
Get the names of people whose names don’t start with A. The pattern you need is ‘A%’.
SELECT name
FROM people
WHERE name NOT LIKE 'A%';
name
50 Cent
Álex Angulo
Álex de la Iglesia
3. Aggregate Functions
</> Aggregate functions
Use the SUM function to get the total duration of all films.
SELECT SUM(duration)
FROM films;
sum
534882
Get the average duration of all films.
SELECT AVG(duration)
FROM films;
avg
107.9479313824419778
Get the duration of the shortest film.
SELECT MIN(duration)
FROM films;
min
7
Get the duration of the longest film.
SELECT MAX(duration)
FROM films;
max
334
</> Aggregate functions practice
Use the SUM function to get the total amount grossed by all films.
SELECT SUM(gross)
FROM films;
sum
202515840134
Get the average amount grossed by all films.
SELECT AVG(gross)
FROM films;
avg
48705108.257335257335
Get the amount grossed by the worst performing film.
SELECT MIN(gross)
FROM films;
min
162
Get the amount grossed by the best performing film.
SELECT MAX(gross)
FROM films;
max
936627416
</> Combining aggregate functions with WHERE
Use the SUM function to get the total amount grossed by all films made in the year 2000 or later.
SELECT SUM(gross)
FROM films
WHERE release_year >= 2000;
sum
150900926358
Get the average amount grossed by all films whose titles start with the letter ‘A’.
SELECT AVG(gross)
FROM films
WHERE title LIKE 'A%';
avg
47893236.422480620155
Get the amount grossed by the worst performing film in 1994.
SELECT MIN(gross)
FROM films
WHERE release_year = 1994;
min
125169
Get the amount grossed by the best performing film between 2000 and 2012, inclusive.
SELECT MAX(gross)
FROM films
WHERE release_year BETWEEN 2000 AND 2012;
max
760505847
</> A note on arithmetic
What is the result of SELECT (10 / 3);?
- 2.333
- 3.333
- 3
- 3.0
</> It’s AS simple AS aliasing
Get the title and net profit (the amount a film grossed, minus its budget) for all films. Alias the net profit as net_profit.
SELECT title, gross - budget AS net_profit
FROM films;
title net_profit
Intolerance: Love's Struggle Throughout the Ages null
Over the Hill to the Poorhouse 2900000
The Big Parade null
...
Get the title and duration in hours for all films. The duration is in minutes, so you’ll need to divide by 60.0 to get the duration in hours. Alias the duration in hours as duration_hours.
SELECT title, duration / 60.0 AS duration_hours
FROM films;
title duration_hours
Intolerance: Love's Struggle Throughout the Ages 2.0500000000000000
Over the Hill to the Poorhouse 1.8333333333333333
The Big Parade 2.5166666666666667
...
Get the average duration in hours for all films, aliased as avg_duration_hours.
SELECT AVG(duration) / 60.0 AS avg_duration_hours
FROM films;
avg_duration_hours
1.7991321897073663
</> Even more aliasing
Get the percentage of people who are no longer alive. Alias the result as percentage_dead. Remember to use 100.0 and not 100!
SELECT COUNT(deathdate) * 100.0 / COUNT(*) AS percentage_dead
FROM people;
percentage_dead
9.3723949029415267
Get the number of years between the newest film and oldest film. Alias the result as difference.
SELECT MAX(release_year) - MIN(release_year) AS difference
FROM films;
difference
100
Get the number of decades the films table covers. Alias the result as number_of_decades. The top half of your fraction should be enclosed in parentheses.
SELECT (MAX(release_year) - MIN(release_year)) / 10.0 AS number_of_decades
FROM films;
number_of_decades
10.0000000000000000
4. Sorting and grouping
</> ORDER BY
How do you think ORDER BY sorts a column of text values by default?
- Alphabetically (A-Z)
- Reverse alphabetically (Z-A)
- There’s no natural ordering to text data
- By number of characters (fewest to most)
</> Sorting single columns
Get the names of people from the people table, sorted alphabetically.
SELECT name
FROM people
ORDER BY name;
name
50 Cent
A. Michael Baldwin
A. Raven Cruz
...
Get the names of people, sorted by birth date.
SELECT name
FROM people
ORDER BY birthdate;
name
Robert Shaw
Lucille La Verne
Mary Carr
...
Get the birth date and name for every person, in order of when they were born.
SELECT birthdate, name
FROM people
ORDER BY birthdate;
birthdate name
1837-10-10 Robert Shaw
1872-11-07 Lucille La Verne
1874-03-14 Mary Carr
...
Get the title of films released in 2000 or 2012, in the order they were released.
SELECT title
FROM films
WHERE release_year IN (2000, 2012)
ORDER BY release_year;
title
102 Dalmatians
28 Days
3 Strikes
...
Get all details for all films except those released in 2015 and order them by duration.
SELECT *
FROM films
WHERE release_year <> 2015
ORDER BY duration;
id title release_year country duration language certification gross budget
2926 The Touch 2007 USA 7 English null null 13000
4098 Vessel 2012 USA 14 English null null null
2501 Wal-Mart: The High Cost of Low Price 2005 USA 20 English Not Rated null 1500000
...
Get the title and gross earnings for movies which begin with the letter ‘M’ and order the results alphabetically.
SELECT title, gross
FROM films
WHERE title like 'M%'
ORDER BY title;
title gross
MacGruber 8460995
Machete 26589953
Machete Kills 7268659
...
</> Sorting single columns (DESC)
Get the IMDB score and film ID for every film from the reviews table, sorted from highest to lowest score.
SELECT imdb_score, film_id
FROM reviews
ORDER BY imdb_score DESC;
imdb_score film_id
9.5 4960
9.3 742
9.2 178
...
Get the title for every film, in reverse order.
SELECT title
FROM films
ORDER BY title DESC;
title
Æon Flux
xXx: State of the Union
xXx
Get the title and duration for every film, in order of longest duration to shortest.
SELECT title, duration
FROM films
ORDER BY duration DESC;
title duration
Destiny null
Should've Been Romeo null
Hum To Mohabbat Karega null
...
</> Sorting multiple columns
Get the birth date and name of people in the people table, in order of when they were born and alphabetically by name.
SELECT birthdate, name
FROM people
ORDER BY birthdate, name;
birthdate name
1837-10-10 Robert Shaw
1872-11-07 Lucille La Verne
1874-03-14 Mary Carr
...
Get the release year, duration, and title of films ordered by their release year and duration.
SELECT release_year, duration, title
FROM films
ORDER BY release_year, duration;
release_year duration title
1916 123 Intolerance: Love's Struggle Throughout the Ages
1920 110 Over the Hill to the Poorhouse
1925 151 The Big Parade
...
Get certifications, release years, and titles of films ordered by certification (alphabetically) and release year.
SELECT certification, release_year, title
FROM films
ORDER BY certification, release_year;
certification release_year title
Approved 1933 She Done Him Wrong
Approved 1935 Top Hat
Approved 1936 The Charge of the Light Brigade
...
Get the names and birthdates of people ordered by name and birth date.
SELECT name, birthdate
FROM people
ORDER BY name, birthdate;
name birthdate
50 Cent 1975-07-06
A. Michael Baldwin 1963-04-04
A. Raven Cruz null
...
</> GROUP BY
What is GROUP BY used for?
- Performing operations by column
- Performing operations all at once
- Performing operations in a particular order
- Performing operations by group
</> GROUP BY practice
Get the release year and count of films released in each year.
SELECT release_year, COUNT(title)
FROM films
GROUP BY release_year;
release_year count
null 42
1970 12
2000 171
...
Get the release year and average duration of all films, grouped by release year.
SELECT release_year, AVG(duration)
FROM films
GROUP BY release_year;
release_year avg
null 77.4390243902439024
1970 133.1666666666666667
2000 107.5352941176470588
...
Get the release year and largest budget for all films, grouped by release year.
SELECT release_year, MAX(budget)
FROM films
GROUP BY release_year;
release_year max
null 15000000
1970 25000000
2000 140000000
...
Get the IMDB score and count of film reviews grouped by IMDB score in the reviews table.
SELECT imdb_score, COUNT(*)
FROM reviews
GROUP BY imdb_score;
imdb_score count
1.9 3
7.5 129
8.2 39
...
Get the release year and lowest gross earnings per release year.
SELECT release_year, MIN(gross)
FROM films
GROUP BY release_year;
release_year min
null 145118
1970 5000000
2000 5725
...
Get the language and total gross amount films in each language made.
SELECT language, SUM(gross)
FROM films
GROUP BY language;
language sum
null 4319281
Danish 2403857
Hebrew 5442464
...
Get the country and total budget spent making movies in each country.
SELECT country, SUM(budget)
FROM films
GROUP BY country;
country sum
null 3500000
Thailand 909000000
Cambodia null
...
Get the release year, country, and highest budget spent making a film for each year, for each country. Sort your results by release year and country.
SELECT release_year, country, MAX(budget)
FROM films
GROUP BY release_year, country
ORDER BY release_year, country;
release_year country max
1916 USA 385907
1920 USA 100000
1925 USA 245000
...
Get the country, release year, and lowest amount grossed per release year per country. Order your results by country and release year.
SELECT release_year, country, MIN(gross)
FROM films
GROUP BY release_year, country
ORDER BY country, release_year;
release_year country min
2003 Afghanistan 1127331
2000 Argentina 1221261
2004 Argentina 304124
...
</> HAVING a great time
In how many different years were more than 200 movies released?
- 2
- 13
- 44
- 63
SELECT release_year
FROM films
GROUP BY release_year
HAVING COUNT(title) > 200;
[Showing 13 out of 13 rows]
</> All together now
Get the release year, budget and gross earnings for each film in the films table.
SELECT release_year, budget, gross
FROM films;
release_year budget gross
1916 385907 null
1920 100000 3000000
1925 245000 null
...
Modify your query so that only records with a release_year after 1990 are included.
SELECT release_year, budget, gross
FROM films
WHERE release_year > 1990;
release_year budget gross
1991 6000000 869325
1991 20000000 38037513
1991 6000000 57504069
...
Remove the budget and gross columns, and group your results by release year.
SELECT release_year
FROM films
WHERE release_year > 1990
GROUP BY release_year;
release_year
2000
2013
2007
...
Modify your query to include the average budget and average gross earnings for the results you have so far. Alias the average budget as avg_budget; alias the average gross earnings as avg_gross.
SELECT release_year, AVG(budget) AS avg_budget, AVG(gross) AS avg_gross
FROM films
WHERE release_year > 1990
GROUP BY release_year;
release_year avg_budget avg_gross
2000 34931375.757575757576 42172627.580838323353
2013 40519044.915492957746 56158357.775401069519
2007 35271131.147540983607 46267501.022346368715
...
Modify your query so that only years with an average budget of greater than $60 million are included.
SELECT release_year, AVG(budget) AS avg_budget, AVG(gross) AS avg_gross
FROM films
WHERE release_year > 1990
GROUP BY release_year
HAVING AVG(budget) > 60000000;
release_year avg_budget avg_gross
2005 70323938.231527093596 41159143.290640394089
2006 93968929.577464788732 39237855.953703703704
Finally, modify your query to order the results from highest average gross earnings to lowest.
SELECT release_year, AVG(budget) AS avg_budget, AVG(gross) AS avg_gross
FROM films
WHERE release_year > 1990
GROUP BY release_year
HAVING AVG(budget) > 60000000
ORDER BY avg_gross DESC;
release_year avg_budget avg_gross
2005 70323938.231527093596 41159143.290640394089
2006 93968929.577464788732 39237855.953703703704
Get the country, average budget, and average gross take of countries that have made more than 10 films. Order the result by country name, and limit the number of results displayed to 5. You should alias the averages as avg_budget and avg_gross respectively.
SELECT country, AVG(budget) AS avg_budget, AVG(gross) AS avg_gross
FROM films
GROUP BY country
HAVING COUNT(title) > 10
ORDER BY country
LIMIT 5;
country avg_budget avg_gross
Australia 31172110.460000000000 40205909.571428571429
Canada 14798458.715596330275 22432066.680555555556
China 62219000.000000000000 14143040.736842105263
Denmark 13922222.222222222222 1418469.111111111111
France 30672034.615384615385 16350593.578512396694
</> A taste of things to come
Submit the code in the editor and inspect the results.
SELECT title, imdb_score
FROM films
JOIN reviews
ON films.id = reviews.film_id
WHERE title = 'To Kill a Mockingbird';
title imdb_score
To Kill a Mockingbird 8.4
What is the IMDB score for the film To Kill a Mockingbird?
- 8.1
- 8.4
- 7.7
- 9.3
本文从SQL的基本操作开始,包括选择列、过滤行、排序和分组等,详细介绍了如何使用SQL进行数据查询和分析。通过实际例子,读者可以学习到如何筛选特定条件的数据,计算统计数据,以及对数据进行排序和分组。
158

被折叠的 条评论
为什么被折叠?



