More Random Events
Following up on last week’s post on normal distributions, this week we’re going to show off another common probability distribution.
Let’s say you typically sell 5 widgets per day. How likely is it that you’ll sell 5 widgets tomorrow? What about between 4 and 6 widgets tomorrow? Obviously we can’t just guess randomly. And the normal distribution won’t help either.
Fortunately, this is what the Poisson Distribution is for. Its formula is:
Our Poisson Distribution formula takes 3 inputs:
- R: Our known rate, in this case 5.
- e: Euler’s Number, 2.71828.
- k: tomorrow’s expected rate.
Breaking It Down In SQL
Let’s start by plotting the Distribution over several expected conversion rates (k) per day. As before, we’ll use generate_series for convenience, so MySQL and Redshift users should try out alternatives.
with numbers as ( select * from generate_series(0, 20) k ) select k, (power(2.71828, -5) * power(5, k) / (k !)) pr from numbers order by 1
Using our favorite visualization tool, this gives us a graph that looks like this:
In other words, there’s an 18% chance that we’ll sell exactly 5 widgets tomorrow. The chances of selling more than 5 decrease rather quickly. There is virtually no chance of selling 15 widgets tomorrow.
Asking Specific Questions
At the top of the post, we asked how likely it was that we’d sell between 4 and 6 widgets tomorrow. To ask a specific question like that, we need only add a where clause:
where k between 4 and 6
Putting it all together, we get SQL that looks like this:
with numbers as ( select * from generate_series(0, 20) k ) select sum((power(2.71828, -5) * power(5, k) / (k !)) pr from numbers where k between 4 and 6 order by 1
In this case, there’s a 50% chance that we’ll sell between 4 and 6 widgets. Not bad!