Daily vs. Weekly vs. Monthly
Depending on your data, looking at the same metric with daily, weekly, or monthly aggregations can yield startlingly different results. This is why SaaS companies will look at new customers by day, week, and month. Each metric tells a unique, but important story.
Let’s take a look at an analysis staple, new customers per day:
select date(created_at), count(1) from customers group by 1
Here’s our graph:
Mostly, we see pronounced day-of-week effects. The peaks have certainly increased in recent months, but it’s hard to say about the average.
Now let’s look at the weekly version:
select date_trunc('week', created_at), count(1) from customers group by 1
Here we see:
Now we can see the average is most definitely increasing. We can also clearly make out the trough we’d expect around the holidays.
Let’s see what a monthly aggregation can tell us:
select date_trunc('month', created_at), count(1) from customers group by 1
These results are predictably much clearer about the long-term trend:
All the seasonal effects are gone, and we see clearly that new customer growth was mostly flat for several months, and has started growing rapidly quite recently.
Given that all these aggregations tell us different things about the datasets, it would be ideal if one combined chart could communicate all the information.
The simplest way to write the query is to join all the above queries together using their dates:
select daily.day, daily_count, weekly_count, monthly_count from ( select date(created_at) as day, count(1) daily_count from customers group by 1 ) daily left join ( select date_trunc('week', created_at) as week, count(1) weekly_count from customers group by 1 ) weekly on week = day left join ( select date_trunc('month', created_at) as month, count(1) monthly_count from customers group by 1 ) monthly on month = day
Remember to join the monthly table on month = day, not month = week; otherwise, you’ll only include months whose first day happens to be the start of a week.
The results are accurate if not entirely useful:
In particular, the fact that we only get a new weekly number once a week, and a new monthly number once a month, is glaring. What if we could keep the broader perspective of the longer-term metrics while maintaining granularity?
Instead of one point on the monthly line per month, let’s plot a trailing month for every single day. (And let’s do the same thing for weeks.) We can do this with window functions!
Let’s take a look:
select d, daily_customers, sum(daily_customers) over ( order by d rows between 7 preceding and current row ) weekly_avg, sum(daily_customers) over ( order by d rows between 30 preceding and current row ) monthly_avg from ( select date(created_at) d, count(1) daily_customers from customers group by 1 ) daily_new_users
This gives us much more natural lines that still preserve the relative benefits of the daily, weekly and monthly aggregations:
As a bonus, we only have to do the table scan once, and we don’t have to do any joins. The query will run much faster on large datasets.
Bonus Round: Trailing Averages
Now that we have our window function, it’s easy to modify it to get additional information about our customers! For example, by changing sum to avg, we can see trailing weekly and monthly averages of our new customer growth:
select d, daily_users, avg(daily_users) over ( order by d rows between 7 preceding and current row ) weekly_avg, avg(daily_users) over ( order by d rows between 30 preceding and current row ) monthly_avg from ( select date(created_at) d, count(1) daily_users from customers group by 1 ) daily_new_users
By playing with the window definition, we can change our trailing average into any kind of running average we like!
Showing multiple aggregations on a single chart is one of the best ways to add context to your data. We hope these techniques for pulling that data have been helpful!