Exact row counts for all tables in MySQL and Postgres
- Blog
- Tech Talk
Keeping track of your row counts can be helpful for budgeting and capacity planning, yet accurate counts are surprisingly hard to get. Databases keep row counts in their stats tables for planning query execution, but the results are only approximate…
Keeping track of your row counts can be helpful for budgeting and capacity planning, yet accurate counts are surprisingly hard to get.
Databases keep row counts in their stats tables for planning query execution, but the results are only approximate and can be badly out of date. You can get exact counts by running a count query for every table, but this can be tedious and require external scripting.
Instead, you can get exact row counts with a single query. This query is not going to be fast: It will take as long as running select count(1) from foo on every table in your database.
Our end result will be a table of schema, table, and row counts:
Exact row counts in Postgres
To start getting our row counts, we’ll need a list of our SQL tables. We can get this easily with:
select table_schema, table_name
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE'
From here, we need a way to turn the names of tables like ‘users’ into an executable SQL statement. While we can trivially concatenate on ‘select count(1) from ‘ to each table name, we need a way to actually run this constructed query.
Fortunately, postgres has the wondrous eval which executes strings of SQL. It’s only available from stored procedures, so we’ll write a custom function that invokes eval.
We’ll want our custom function to take a table name and return the number of rows in the query. Here’s how to write a count_rows function:
create or replace function
count_rows(schema text, tablename text) returns integer
as
$body$
declare
result integer;
query varchar;
begin
query := 'SELECT count(1) FROM ' || schema || '.' || tablename;
execute query into result;
return result;
end;
$body$
language plpgsql;
We can invoke this function like any other postgres function with count_rows(‘users’). Adding our count_rows function to the original query will get us row counts for each table:
select
table_schema,
table_name,
count_rows(table_schema, table_name)
from information_schema.tables
where
table_schema not in ('pg_catalog', 'information_schema')
and table_type='BASE TABLE'
order by 3 desc
With this definition and query combined we’ll have the row count table we were aiming for.
MySQL
The same trick works in MySQL, but user defined functions aren’t an option for everyone. For those with this restriction, we’ll build a single query with a union per table and eval the entire string.
The query uses group_concat which packs multiple rows into a single string. This is needed to turn a list of table names into a string of many counts connected by unions.
Finally, we can execute a string as a prepared statement. This is the simplest way to run a string as a query in MySQL.
We’ll start with creating all of the select statements we want to run:
select
concat(
'select ',
quote(db), ' db, ',
quote(tablename), ' tablename, '
'count(1) "rowcount" ' ,
'from ', db, '.', tablename) as sql_statements
from (
select
table_schema db,
table_name tablename
from information_schema.tables
where table_schema not in
('performance_schema', 'mysql', 'information_schema')
) t;
We can expand that query to concatenate the select statements, and save the results with into @sql. Once saved, we can run it as a prepared statement.
select
-- Sort the tables by count
concat(
'select * from (',
-- Aggregate rows into a single string connected by unions
group_concat(
-- Build a "select count(1) from db.tablename" per table
concat('select ',
quote(db), ' db, ',
quote(tablename), ' tablename, '
'count(1) "rowcount" ',
'from ', db, '.', tablename)
separator ' union ')
, ') t order by 3 desc')
into @sql
from (
select
table_schema db,
table_name tablename
from information_schema.tables
where table_schema not in
('performance_schema', 'mysql', 'information_schema')
) t;
-- Execute @sql
prepare s from @sql; execute s; deallocate prepare s;
This yields the same results as the approach we demonstrated for Postgres users.
Choices
We’ve given you a couple of tools to choose from for getting exact row counts. The user defined function is more readable, extensible, and reuseable. However, the group_concat approach is helpful if you do not have permission to create user defined functions, or if your database does not support them.
Go forth and count!
You now have a couple of ways to get exact row counts in the most common databases, and a few new SQL techniques to play with. Enjoy!