How to optimize database queries in Rails with EXPLAIN ANALYZE

This lesson is from Full Stack Rails Mastery.

EXPLAIN ANALYZE is a feature of PostgreSQL that helps you understand how a SQL query is executed by the database. It’s useful when you notice slow queries in your Rails app and want to pinpoint exactly where the bottleneck is, such as in a particular join, filter, or sort operation.
EXPLAIN ANALYZE shows you the actual steps the database takes to execute a query, along with how long each step takes. It’s like getting a detailed roadmap and timing for your query, so you can see exactly what’s happening under the hood.
While EXPLAIN ANALYZE is a feature specific to PostgreSQL, other databases have similar tools (like EXPLAIN in MySQL or execution plans in SQL Server). However, EXPLAIN ANALYZE in PostgreSQL is particularly detailed because it not only shows the plan but also executes the query to provide real timing data.
The simplest way to use it is to prefix a SQL statement with EXPLAIN ANALYZE, for example in psql:
EXPLAIN ANALYZE SELECT * FROM users WHERE id = 1;

To explain an existing ActiveRecord query, you’d need to convert it into raw SQL first. You can do that using the to_sql method on your ActiveRecord query, and then pass that SQL into EXPLAIN ANALYZE. Here’s how it works:
First, you get the SQL for your ActiveRecord query like this:
sql = YourModel.joins(:other_model).to_sql

Then, you can pass this SQL string into EXPLAIN ANALYZE either in psql or from the Rails console:
result = ActiveRecord::Base.connection.execute("EXPLAIN ANALYZE #{sql}")

This will execute your complex ActiveRecord query as raw SQL with EXPLAIN ANALYZE and return the detailed execution plan.
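If you find yourself doing this often, you could wrap both steps in a small console helper. This is just a convenience sketch, not a Rails API; the explain_analyze name is mine, and it's meant for relations you build yourself in the console, not for anything containing user input:
# Prints the EXPLAIN ANALYZE plan for an ActiveRecord relation.
# Note that ANALYZE actually executes the query, so avoid running it
# on statements with side effects.
def explain_analyze(relation)
  sql = relation.to_sql
  result = ActiveRecord::Base.connection.execute("EXPLAIN ANALYZE #{sql}")
  result.each { |row| puts row["QUERY PLAN"] }
end

explain_analyze(YourModel.joins(:other_model))
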
In Rails, you can also directly call the explain method on an ActiveRecord relation. This method runs an EXPLAIN on the query behind the scenes and gives you a summary of the execution plan.
For example, if you have a complex ActiveRecord query with joins, conditions, and ordering, you can simply chain .explain at the end of your query.
YourModel.joins(:other_model).where(condition: value).order(:some_column).explain

It’s a quick and easy way to see the execution plan without needing to convert it to SQL manually.
However, the explain method by itself doesn’t include the detailed timing information that EXPLAIN ANALYZE provides. It’s more of a high-level overview.
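If you're on Rails 7.1 or newer with PostgreSQL, explain can also take options that are passed through to the database, which gets you the ANALYZE-level detail without converting to SQL yourself (the exact options available depend on your adapter and Rails version):
# Rails 7.1+ forwards these options to PostgreSQL, effectively running
# EXPLAIN (ANALYZE, VERBOSE) behind the scenes.
YourModel.joins(:other_model)
         .where(condition: value)
         .order(:some_column)
         .explain(:analyze, :verbose)
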
With EXPLAIN ANALYZE, you can find out several key things:
  1. Execution Plan: It shows the steps the database takes to execute your query, including scans, joins, and sorts.
  2. Timing: You get detailed timing for each step, helping you see where the query might be slow.
  3. Row Counts: It provides estimates and actual counts of rows processed at each step, which can reveal inefficiencies.
Overall, it helps identify performance bottlenecks and understand how your query is being executed.

Let’s look at a specific example.
First, we'll get the SQL query from an Active Record query:
> sql = User.joins(:enrolments).to_sql
=> "SELECT \"users\".* FROM \"users\" INNER JOIN \"enrolments\" ON \"enrolments\".\"student_id\" = \"users\".\"id\""

Then run EXPLAIN ANALYZE:
> result = ActiveRecord::Base.connection.execute("EXPLAIN analyze #{sql}")
   (17.7ms)  EXPLAIN analyze SELECT "users".* FROM "users" INNER JOIN "enrolments" ON "enrolments"."student_id" = "users"."id"
=> #<PG::Result:0x00007fa20bba4e38 status=PGRES_TUPLES_OK ntuples=9 nfields=1 cmd_tuples=0>

This returns a PG::Result object (via the pg gem), so we need to extract the query plan from it:
> result.each do |row|
>   puts row['QUERY PLAN']
> end

Hash Join  (cost=739.65..1289.42 rows=19197 width=432) (actual time=5.447..15.863 rows=19306 loops=1)
  Hash Cond: (enrolments.student_id = users.id)      
  ->  Index Only Scan using index_enrolments_on_student_id on enrolments  (cost=0.29..499.65 rows=19197 width=4) (actual time=0.007..2.373 rows=19306 loops=1)                                           
        Heap Fetches: 2770                           
  ->  Hash  (cost=558.05..558.05 rows=14505 width=432) (actual time=5.409..5.411 rows=14416 loops=1)
        Buckets: 16384  Batches: 1  Memory Usage: 3218kB
        ->  Seq Scan on users  (cost=0.00..558.05 rows=14505 width=432) (actual time=0.008..1.524 rows=14416 loops=1)
Planning Time: 0.226 ms
Execution Time: 16.808 ms
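
If you just want to dump the whole plan rather than iterate over it, PG::Result also exposes the rows via its values method, so a one-liner works too:
puts result.values  # each plan line comes back as a one-element row; puts prints them one per line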


In PostgreSQL, the cost units used in EXPLAIN output are abstract and are not directly tied to physical time measurements like seconds or milliseconds. Instead, they represent a relative measure of the resource consumption needed to execute a query. The cost is typically divided into two parts: a startup cost (the estimated cost before the first row can be returned) and a total cost (the estimated cost to return all rows). In the Hash Join line above, cost=739.65..1289.42 means a startup cost of 739.65 and a total cost of 1289.42.
The cost values are used by PostgreSQL’s query planner to compare the efficiency of candidate execution plans. While the units are not explicitly defined, a higher cost generally indicates a more resource-intensive operation. The actual resource usage depends on factors such as CPU, I/O, and memory.
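Because these costs are only the planner’s estimates, you don’t have to execute the query to see them: plain EXPLAIN (without ANALYZE) returns the plan and costs but skips the actual run, which is useful when the query is slow or has side effects. A quick sketch from the Rails console, reusing the sql string from earlier:
# EXPLAIN without ANALYZE: shows estimated costs and row counts,
# but does not actually run the query.
plan = ActiveRecord::Base.connection.execute("EXPLAIN #{sql}")
plan.each { |row| puts row["QUERY PLAN"] }
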
Let’s break down the details from the EXPLAIN ANALYZE output line by line:
  1. The query planner chose a Hash Join strategy. It’s building a hash table from the users table and then probing it with rows from the enrolments table.
  2. For the enrolments table, it’s using an Index Only Scan, which is efficient. However, it had to do 2,770 heap fetches, meaning the visibility map wasn’t fully up to date and some rows still had to be checked in the table itself; a VACUUM usually brings this number down (see the sketch at the end of this lesson).
  3. For the users table, it’s doing a Sequential Scan, which means it’s reading the entire table. That can happen when there isn’t an appropriate index for the query, or, as here, when the planner expects to need most of the rows anyway and a full scan is the cheapest way to build the hash.
  4. The actual number of rows (19,306) is very close to the estimated number (19,197), which indicates that the planner’s statistics are quite accurate.
  5. The total execution time is 16.808 ms, which is relatively fast for joining two tables and returning all columns from the users table.
This is a relatively simple example, but even here EXPLAIN ANALYZE has pointed out some scope for improvement, whether that’s vacuuming enrolments to cut down the heap fetches or adding indexes for the columns your real queries filter on.
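As a follow-up, here is a sketch of what acting on those hints could look like. The enrolments table comes from the example above, but the status column and the migration name are hypothetical, and whether either step helps depends entirely on your own schema and query mix:
# 1. Reduce heap fetches on the index-only scan by refreshing the visibility map.
#    VACUUM cannot run inside a transaction, so run it from psql or a
#    non-sandboxed Rails console.
ActiveRecord::Base.connection.execute("VACUUM ANALYZE enrolments")

# 2. Hypothetical migration: if EXPLAIN ANALYZE had shown a sequential scan on a
#    column you actually filter by (say enrolments.status), an index is the usual fix.
class AddIndexToEnrolmentsStatus < ActiveRecord::Migration[7.1]
  def change
    add_index :enrolments, :status
  end
end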