Getting Familiar with Our Database Schema
Before we write any SQL, we need to learn how our data is structured. Any SQL tool worth its salt can tell you your database schema. For this post, we’ll use our favorite one.
First, we click on the handy schema tab.
Here, we see several tables, including a purchases table and a users table. Clicking on the purchases table reveals the price column where we find our payment data.
Doing the same for the users table reveals a source column. This is where we find our source data.
Starting Our Query
To start our query, we’ll grab the source for every user from our users table with the following:
select users.source from users
Here is the resulting table:
Now it’s time to add our revenue data. There’s just one problem: Our revenue data is in another table.
If only we had a way to grab data from two tables with a single query …
It’s time for a new and very important SQL statement: join. Join helps us match data from one table to data in another table. In our case, we want the source data for each user in the users table to match up with that user’s purchase data in the purchases table.
Looking at our schema, we see that purchases.user_id matches up with users.id. These are the columns we’ll use to join the two tables.
select users.source from users join purchases on users.id = purchases.user_id
Notice on and = in the query. on and = work together to tell the database which columns we want to join our tables on.
Now we need to add our pricing data to our query. We do so by adding purchases.price to the select statement.
select users.source, purchases.price from users join purchases on users.id = purchases.user_id
Voila! Here’s the table we have now:
Introducing the sum Function
In the last section, we created a table with all the raw data needed to figure out which traffic source has produced the most revenue.
But we’re after the total revenue for each source. This is where the sum() function comes in.
We’ll use sum() to sum up the revenue by source. We also need to add a group by 1 to our query to make sure the results are grouped by source.
Here’s the SQL:
select users.source, sum(purchases.price) from users join purchases on users.id = purchases.user_id group by 1
And here’s the pie chart we get with our data:
There you have it, folks. Organic traffic is our leading source for revenue!