Music App Database
Let’s imagine we’re building a new music app, where listeners can play songs and keep track of which songs they played and when. Five beta testers have been using a prototype of the app; their data has been saved to a few CSV files, and now we want to design and use a database that can power the app.
Fork this SQL Studio template to follow along:
Looking Around
After you fork the template, click around in the files and see if you can determine what each file does and why it’s there. Make sure the sidebar on the left is open, as shown in the picture to the right.
main.sql- This is the file you’ll use to run all your queries. (You only need to run this file for the exercise!)
- We will be working in this file to practice our SQL skills.
- The SQL files:
import_albums.sqlimport_artists.sqlimport_listeners.sqlimport_plays.sqlimport_songs.sql- These files create the tables and load the data from their corresponding CSV files.
- Take some time to read through each of the
create tableand.importstatements. Pay attention to the data types and column names. - Discuss: Why do you think the
.importcomes aftercreate table? - Something new here! Some tables, such as the
albumstable, contain something called a foreign key. See the section below on what this is.
- The CSV files:
albums.csvartists.csvlisteners.csvplays.csvsongs.csv- “CSV” stands for “comma-separated values.” Each line in a CSV document represents one record in the table. The different fields are separated by commas, hence the name.
- These files contain the data that we will load into the SQLite database. Once we do that, we can run queries against the data.
- Notice that the CSV header row (the first row) contains the same column names as the columns we create for each SQL table.
Foreign Keys
A foreign key constraint is a rule that helps keep our data accurate and consistent. It ensures that a column in one table only contains values that exist in a specific column of another table.
For example, the
songs.album_id column has a foreign key constraint that requires every value in that column to also exist in the albums.album_id column.This means:
- If we try to insert a song with an
album_idthat isn’t in thealbumstable, we’ll get an error.
- If we try to delete an album from the
albumstable while there are songs referencing thatalbum_id, we’ll also get an error.
This constraint prevents data that is incomplete or doesn’t make sense and keeps relationships between tables valid.
erDiagram ALBUM ||--|{ SONG : has ALBUM { integer album_id PK text title integer artist_id FK date release_date } SONG { integer song_id PK text title integer album_id FK }
Code-Along | Music App Database
The Setup Section in main.sql
We don’t need to change any code here. Let’s look at what each step does.
Setup Step #1: The
drop table if exists commands allow you to run the same script multiple times without causing errors. When rerunning main.sql, you need the commands because the tables you created in the previous “Run” session are still stored behind the scenes. Starting with drop table if exists ensures a clean slate by deleting those old versions so your script can rebuild them from scratch every time you hit "Run."Because this database deals with foreign keys, the order of the
drop statements matter. You must drop the child tables before the parent tables because a parent table cannot be deleted if a child table is still linked to it.Setup Step #2: The commands here are called “dot commands.” They are different from queries. Instead of interacting directly with data in a table,
.read runs other SQL files. Since this database deals with foreign keys, the order of the .read statements matter. The parent tables must be read before the child tables (e.g. artists must be read before albums).Code | Purpose |
.read import_{csvname}.sql | Reads the SQL queries in the import_{csvname}.sql file and runs them. |
The import SQL files
Let’s take a look at the
import_artists.sql file that is creating a table and importing artist data:drop table if exists artists; create table artists( artist_id integer primary key, name text, country text, debut_year integer, genre text ); -- Import data using the .import command -- This will load all 14 rows from artists.csv into the table .import --skip 1 artists.csv artists -- Example of what the first 5 rows would look like if inserted manually: -- insert into artists values -- (1, 'Queen', 'United Kingdom', 1971, 'Rock'), -- (2, 'Rihanna', 'Barbados', 2005, 'R&B'), -- (3, 'Eminem', 'United States', 1996, 'Hip-hop'), -- (4, 'Taylor Swift', 'United States', 2006, 'Pop'), -- (5, 'Whitney Houston', 'United States', 1977, 'R&B');
It looks like we have:
drop table if exists artists: creates a clean slate when rerunning the file (you won’t ever need to run the import files in this lab becausemain.sqlhandles it!)
create table artists: builds the table the CSV data is read into
.import --skip 1 artists.csv artists- Isn’t everything after
.importa comment? It sure looks like one, but the syntax for dot commands is different than queries! Even though SQL Studio colors this portion of the line as if it’s a comment, it is actually a list of arguments that specifies: - To skip the first row of the CSV during import, since that’s the header row.
- To import the data from
artists.csv - To load the imported data into the table
artists
- A feature of SQL Studio! When you import CSV data, SQL Studio will add the CSV file and create an import SQL file that looks exactly like the import files in this lab!
Try-It | Explore the Database
In
main.sql, write a select * query for each of the five tables (artists, albums, songs, listeners, plays) to see what data are available. As you explore, think about how the tables connect to each other. For example, how would you figure out which artist made a particular song?🤖 AI Connection
After exploring the database structure, test your understanding. Without looking at the files, describe the tables and their relationships in your own words. Then ask an AI tool: "I have a music app database with tables for artists, albums, songs, listeners, and plays. What questions could I answer by combining data from two or more of these tables?" Pick one question the AI suggests and think about which tables and fields you'd need to answer it. Does the suggestion actually make sense given the fields available in your database?
Querying the Music App Database
Now that you've explored the database and started thinking about the kinds of questions you can ask, it's time to put your SQL skills to work! Fork the SQL Studio template and work through the code-alongs in
main.sql. Each one poses a question about the music app data. Your job is to replace the placeholder query with SQL that answers it. The questions start simple and build on each other, so by the end you'll be writing queries that combine data across multiple tables!Â
For a summary of this lesson, check out the 4. Lab: Music App Database One-Pager!
Â