As the datasets you work with get larger, queries that were once lightning-fast can slow to a crawl. When this happens, it’s time to optimize. First, you’ll need data to figure out where to focus your optimization work. The explain command is a great place to start.

Using explain before a select statement produces the query execution plan, which is a list of steps the database performs to execute your query. It shows how the table will be scanned and which join algorithms are used when there are multiple tables.

Simple Group and Count

Here, we’ve added explain to a query we want to optimize:

explain select
  date
  , source
  , sum(price)
from
  purchases JOIN users ON purchases.user_id = users.id
group by
  1
  , 2
limit 5000;

Using explain here produces the following query execution plan:

query plan
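
The exact output depends on your cluster and table statistics, but for the query above the plan has roughly this shape (the cost, rows, and width numbers here are illustrative, not real output):

```
XN Limit  (cost=12500.00..95000.00 rows=5000 width=27)
  ->  XN HashAggregate  (cost=12500.00..95000.00 rows=250000 width=27)
        ->  XN Hash Join DS_DIST_BOTH  (cost=12500.00..85000.00 rows=1000000 width=27)
              Hash Cond: ("outer".user_id = "inner".id)
              ->  XN Seq Scan on purchases  (cost=0.00..10000.00 rows=1000000 width=19)
              ->  XN Hash  (cost=0.00..12500.00 rows=1000000 width=12)
                    ->  XN Seq Scan on users  (cost=0.00..12500.00 rows=1000000 width=12)
```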

The steps in our plan are executed bottom to top. The first step is a sequential scan that reads every row in the users table. The result is used to build a hash table, which will later be used to join with purchases.

The purchases table is then read and joined to users, using the newly created hash table to look up the user_id field on each purchase record. DS_DIST_BOTH indicates that both tables are redistributed across the nodes for joining.

The HashAggregate step builds a hash table to group the rows by date and source. Finally, the limit step discards any rows beyond the first 5,000.

Each step comes with metadata attached: cost, rows, and width. Cost is a relative value split into two parts: the startup cost, to return the first row, and the total cost, for the entire operation. Each step's cost includes the cost of all steps below it, so costs are only meaningful when compared within a single query plan.
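
In the plan output, these three fields appear in parentheses after each step. For example, a scan line might read (numbers illustrative):

```
XN Seq Scan on users  (cost=0.00..12500.00 rows=1000000 width=12)
```

Here 0.00 is the startup cost, 12500.00 is the total cost, and the planner estimates 1,000,000 rows averaging 12 bytes each.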

rows is the estimated number of rows returned by the step. width is the average size of each row in bytes. If either estimate is badly off, run analyze on the tables you're querying and you may get a more efficient query plan.
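
Refreshing statistics is a one-line command per table. Assuming the tables from the example above:

```sql
analyze users;
analyze purchases;
```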

When joining, there are four ways data might get distributed. In step 3 of our example above, DS_DIST_BOTH indicates that both tables must be redistributed across all nodes. DS_DIST_INNER redistributes the inner table across all nodes for processing, DS_BCAST_INNER broadcasts a full copy of the inner table to every node, and DS_DIST_NONE indicates that all data is already available for joining on each slice.
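
One way to get DS_DIST_NONE on the example join is to give both tables the same distribution key, so that matching rows land on the same slice. A sketch, assuming simplified column definitions for the example tables:

```sql
create table users (
  id     integer,
  source varchar(32)
) diststyle key distkey (id);

create table purchases (
  user_id integer,
  date    date,
  price   decimal(12,2)
) diststyle key distkey (user_id);
```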

DS_DIST_NONE is generally the best option, unless the data distribution is heavily skewed and only a few nodes end up doing the work. That can be determined by looking in svl_query_log.
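
For example, comparing per-slice row counts in svl_query_report will show whether a few slices are handling most of the rows (12345 is a placeholder for the query id you're inspecting):

```sql
select slice, sum(rows) as rows_processed
from svl_query_report
where query = 12345  -- the query id to inspect
group by slice
order by rows_processed desc;
```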

Digging into svl_query_report

Sometimes you need more information than explain provides. This is where the svl_query_report view comes in. It combines detailed information from stl_explain with the various step tables, such as stl_hashjoin.

Each step shows how long it took on each slice, allowing detailed investigation of complicated queries. The label field indicates which table the step is working on, including internal tables created only for the query.
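
For example, to see how long each step took on each slice (again, 12345 stands in for the query id you're investigating):

```sql
select slice, segment, step, label,
  datediff(ms, start_time, end_time) as elapsed_ms,
  rows
from svl_query_report
where query = 12345  -- the query id to inspect
order by segment, step, slice;
```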

Any step with a value in the workmem field allocated that much memory to store its results. If a step needs more than the working memory available to the query, it spills to disk: is_diskbased is set to true, and the query slows down considerably.
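
Finding the steps that spilled to disk is a straightforward filter on svl_query_report (12345 is a placeholder query id):

```sql
select query, segment, step, label, workmem, rows
from svl_query_report
where query = 12345  -- the query id to inspect
  and is_diskbased = 't';
```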

Taking a Look at stl_alert_event_log

One last place to look for underperforming queries is stl_alert_event_log. It records queries for which the query planner has identified potential performance problems.

This can happen when filtering a large amount of data on a column other than the sort key, or when running gigantic cross-joins.
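
A quick way to see the most recent alerts, along with the planner's suggested fixes:

```sql
select query,
  substring(event, 1, 60) as event,
  substring(solution, 1, 60) as solution,
  event_time
from stl_alert_event_log
order by event_time desc
limit 10;
```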

For more detailed information, take a look at Amazon’s Database Developer Guide. As always, thanks for reading!