6. Lab: Women on High Courts

🎯 Learning Goals

Apply SQL skills from previous lessons to explore and clean a real-world dataset

Identify and fix data anomalies using SQL queries

Calculate descriptive statistics using aggregate functions in SQL

Export cleaned data from SQL Studio to CSV format

📗 Technical Vocabulary

Data cleaning

Data anomaly

Aggregate functions (AVG, COUNT, etc.)

Data visualization
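
To make the first three vocabulary terms concrete, here is a minimal sketch that runs SQL through Python's built-in sqlite3 module. The table `courts_demo`, its columns, and every value in it are made up for illustration; they are not the lab's real data, and SQL Studio's SQL dialect may differ slightly from SQLite's.

```python
import sqlite3

# Hypothetical toy table; the real lab data has different columns and values.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courts_demo (country TEXT, year INTEGER, pct_women REAL)")
conn.executemany(
    "INSERT INTO courts_demo VALUES (?, ?, ?)",
    [
        ("A", 2020, 25.0),
        ("B", 2020, 40.0),
        ("C", 2020, -999.0),  # data anomaly: a percentage cannot be negative
        ("D", 2020, 55.0),
    ],
)

# Identify the anomaly: a percentage must fall between 0 and 100.
anomalies = conn.execute(
    "SELECT country, pct_women FROM courts_demo "
    "WHERE pct_women < 0 OR pct_women > 100"
).fetchall()

# Data cleaning: replace the impossible value with NULL (unknown).
conn.execute(
    "UPDATE courts_demo SET pct_women = NULL "
    "WHERE pct_women < 0 OR pct_women > 100"
)

# Aggregate functions: COUNT(*) counts rows; COUNT(col) and AVG(col) skip NULLs.
total_rows, known_values, avg_pct = conn.execute(
    "SELECT COUNT(*), COUNT(pct_women), AVG(pct_women) FROM courts_demo"
).fetchone()

print(anomalies)                          # [('C', -999.0)]
print(total_rows, known_values, avg_pct)  # 4 3 40.0
```

Setting the bad value to NULL is only one possible cleaning choice. It keeps the row while stopping the impossible number from dragging the average down.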

The Women on High Courts Data Set

In this lab, we will be working with the Women on High Courts dataset. This dataset was assembled by researchers interested in tracking how many women serve on judicial high courts throughout the world. There are three kinds of high courts represented in the dataset:

Appellate Courts

These courts review decisions made by lower courts (trial courts) to determine if legal errors occurred.
In the U.S., appellate courts include the U.S. Courts of Appeals (Circuit Courts) and state-level appellate courts.

Supreme Courts

The highest judicial authority in a country or state.
They have the final say on legal disputes and interpret laws, including constitutional matters.
In the U.S., the Supreme Court of the United States (SCOTUS) is the highest court, while each state also has its own supreme court.

Constitutional Courts

These courts specifically rule on constitutional matters, determining whether laws or government actions violate the constitution.
Some countries have a separate constitutional court (e.g., Germany’s Federal Constitutional Court), while others, like the U.S., handle constitutional issues within their supreme court.

It's completely normal if these legal terms are new to you—we're all learning them together today.

👥

Discuss: Why do you think these researchers felt it was important to track the proportion of women judges on these courts? Do you think it is important? Why or why not?

Exploring the Dataset

Visualizations for the dataset are available here:

Visualizations - Women On High Courts

We will talk more about data visualizations in a later lesson. For now, take a few minutes to explore the visualizations on the site.

👥

Discuss: What are some key takeaways from the data visualizations that you noticed?

Working with the Dataset

Datasets in the wild rarely come with a tidy explanation of every column. In your work as a data scientist, you’ll often need to comb through documentation to learn about the data, and that is how we will learn this dataset's column names too.

Fork this SQL Studio template and follow along:

SQL Studio

Review Files

What files are present? What purpose does each of them serve?

Open up the wohc_documentation.pdf file and scroll to page 6. Take a few minutes to read through the descriptions of the different fields in the dataset.

wohc_documentation.pdf (306.1 KiB)
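
While reading field descriptions, it can help to look at the table itself side by side with the documentation. The sketch below shows one way to list column names, preview a few rows, and see which values a text column actually contains. The table `courts_demo` and its columns are hypothetical stand-ins, not the field names from wohc_documentation.pdf, and the SQL runs through Python's built-in sqlite3 module rather than SQL Studio.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and columns, used only for illustration.
conn.execute(
    "CREATE TABLE courts_demo "
    "(country TEXT, year INTEGER, court_type TEXT, pct_women REAL)"
)
conn.executemany(
    "INSERT INTO courts_demo VALUES (?, ?, ?, ?)",
    [
        ("A", 2020, "supreme", 25.0),
        ("B", 2020, "constitutional", 40.0),
        ("C", 2021, "appellate", 10.0),
        ("D", 2021, "supreme", 55.0),
    ],
)

# Preview a few rows; cursor.description carries the column names.
cursor = conn.execute("SELECT * FROM courts_demo ORDER BY country LIMIT 3")
column_names = [col[0] for col in cursor.description]
preview = cursor.fetchall()

# DISTINCT shows which categories a text column really holds,
# which you can then compare against the documentation.
court_types = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT court_type FROM courts_demo ORDER BY court_type"
    )
]

print(column_names)
print(preview)
print(court_types)
```

If a column holds a value the documentation never mentions, that is a good candidate for a data anomaly to investigate.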

🤖 AI Connection

As you read through the documentation, you'll likely encounter terms or field descriptions that are unfamiliar. That's expected when working with real-world datasets! If you come across a term you don't understand, try prompting an AI tool: "I'm a student exploring a dataset about women on judicial high courts. Can you explain what [TERM] means in simple language?" This is how data scientists work in practice! You won't always be an expert in the subject area of your data, and using available resources to build your understanding is part of the process.

✏️

Code-Along and Try-It

Complete the code-along and Try-It exercises in the template with your ILs and your group!

Exporting Data

After you complete the exercises, follow the directions at the end of main.sql to export your data! Open up the exported .csv file. Are the anomalies you fixed gone?
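
Besides reading the exported file by eye, you could check it with a short script. This is a sketch under assumptions: the column name `pct_women`, the 0 to 100 range, and the sample rows are hypothetical, and the text stand-in below takes the place of your real exported file. It also assumes NULL values export as blank cells, which depends on the export tool.

```python
import csv
import io

# Stand-in for the exported file; for a real file you would pass
# open("<your exported file>.csv", newline="") instead.
# The column names and values here are hypothetical.
exported = io.StringIO(
    "country,year,pct_women\n"
    "A,2020,25.0\n"
    "B,2020,40.0\n"
    "C,2020,\n"
    "D,2020,55.0\n"
)

def out_of_range_rows(file_obj, column="pct_women", low=0.0, high=100.0):
    """Return rows whose value in `column` is present but outside [low, high]."""
    bad = []
    for row in csv.DictReader(file_obj):
        value = row[column]
        if value == "":  # blank cell: assumed to be a NULL from the database
            continue
        if not low <= float(value) <= high:
            bad.append(row)
    return bad

remaining = out_of_range_rows(exported)
print(len(remaining))  # 0 means no out-of-range values are left
```

A count of zero only confirms that this one kind of anomaly is gone. Other anomalies you fixed, such as misspelled categories, would each need their own check.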

For a summary of this lesson, check out the 6. Lab: Women on High Courts One-Pager!