EXPLAIN

Описание параметров анализа производительности запросов.

Что такое EXPLAIN и зачем он нужен?

Команда EXPLAIN — это один из инструментов оптимизации SQL-запросов в PostgreSQL. Она показывает план выполнения — пошаговую инструкцию, которую оптимизатор запросов создает для получения данных. Понимание этого плана — ключ к выявлению "узких мест" в производительности.

EXPLAIN имеет множество параметров, каждый из которых добавляет определенный тип информации в вывод.

Синтаксис EXPLAIN

Базовый синтаксис команды:

sql

EXPLAIN [ ( параметр [, ...] ) ] ваш_запрос;
-- Параметры указываются в скобках через запятую
EXPLAIN (ANALYZE, BUFFERS, VERBOSE) SELECT * FROM table;

Параметры можно комбинировать в любом порядке. Некоторые из них требуют выполнения запроса (работают только с ANALYZE), другие показывают дополнительную информацию без выполнения.

Параметры EXPLAIN

Список параметров зависит от используемой версии PostgreSQL и форка.

1. ANALYZE

Самый важный параметр. Заставляет PostgreSQL выполнить запрос и показать фактические метрики выполнения, а не только оценки.

Без ANALYZE: показывает предполагаемый план на основе статистики таблиц
С ANALYZE: выполняет запрос и показывает реальное время, количество строк и другие фактические метрики

Важно: Для запросов INSERT, UPDATE, DELETE используйте в транзакции с ROLLBACK, чтобы не получить ложные данные.

2. VERBOSE

Включает вывод дополнительной детальной информации о плане выполнения:

Показывает список выходных столбцов для каждой операции
Добавляет информацию о выражениях, вычисляемых в каждом узле плана
Показывает алиасы таблиц и их схемы

sql

explain (verbose) 
select customer_id, first_name 
from customer
where active = 1;

text

QUERY PLAN                                                       |
-----------------------------------------------------------------+
Seq Scan on public.customer  (cost=0.00..47.49 rows=584 width=10)|
  Output: customer_id, first_name                                |
  Filter: (customer.active = 1)                                  |

В выводе вы увидите: Output: customer_id, first_name для узла Seq Scan.

3. COSTS

Показывает оценочную стоимость операций. Включен по умолчанию. Можно отключить для более компактного вывода.

sql

-- с costs (по умолчанию)
explain 
select * 
from film;

text

QUERY PLAN                                              |
--------------------------------------------------------+
Seq Scan on film  (cost=0.00..98.00 rows=1000 width=391)|

sql

-- без costs
explain (costs false) 
select * 
from film;

text

QUERY PLAN      |
----------------+
Seq Scan on film|

Стоимость измеряется в условных единицах и показывает:

Начальная стоимость (cost=0.00..): Затраты на получение первой строки
Полная стоимость (..98.00): Затраты на получение всех строк

4. SETTINGS

Показывает значения нестандартных параметров конфигурации PostgreSQL, которые влияют на план запроса. Полезно при переносе запросов между серверами с разными настройками.

sql

explain (analyze, settings) 
select * 
from film 
where length > 120;

text

QUERY PLAN                                                                                          |
----------------------------------------------------------------------------------------------------+
Seq Scan on film  (cost=0.00..100.50 rows=456 width=391) (actual time=0.053..1.761 rows=457 loops=1)|
  Filter: (length > 120)                                                                            |
  Rows Removed by Filter: 543                                                                       |
Settings: search_path = 'public, public, "$user"'                                                   |
Planning Time: 0.142 ms                                                                             |
Execution Time: 1.785 ms                                                                            |

В выводе появится секция Settings: с параметрами, отличающимися от стандартных, как пример: random_page_cost = 1.1, seq_page_cost = 1.0, work_mem = 4MB.

5. GENERIC_PLAN

Показывает "обобщенный" план для параметризованных запросов (prepared statements) без привязки к конкретным значениям параметров. Другими словами: использовать операторы с шаблонами параметров в виде $1 и создать общий план, не зависящий от значений параметров. Этот параметр нельзя использовать с ANALYZE.

sql

prepare film_query (int) as 
select * 
from film 
where film_id = $1;

explain (generic_plan) execute film_query(100);

text

QUERY PLAN                                                            |
----------------------------------------------------------------------+
Index Scan using film_pkey on film  (cost=0.28..8.29 rows=1 width=391)|
  Index Cond: (film_id = 100)                                         |

Используется для анализа того, как будет выполняться подготовленный запрос с разными параметрами.

6. BUFFERS

Требует ANALYZE. Показывает статистику использования буферного кэша — критически важная информация для анализа I/O операций.

sql

explain (analyze, buffers)
select * 
from rental
where rental_date > '2005-07-01';

text

QUERY PLAN                                                                                               |
---------------------------------------------------------------------------------------------------------+
Seq Scan on rental  (cost=0.00..350.55 rows=12623 width=36) (actual time=0.778..2.093 rows=12577 loops=1)|
  Filter: (rental_date > '2005-07-01 00:00:00'::timestamp without time zone)                             |
  Rows Removed by Filter: 3467                                                                           |
  Buffers: shared read=150                                                                               |
Planning:                                                                                                |
  Buffers: shared hit=84 read=9                                                                          |
Planning Time: 5.592 ms                                                                                  |
Execution Time: 2.424 ms                                                                                 |

В выводе добавляется строка Buffers: с метриками:

shared hit: Страницы, найденные в shared buffers (в памяти)
shared read: Страницы, прочитанные с диска
shared dirtied: Страницы, измененные в кэше (грязные страницы)
shared written: Страницы, записанные из кэша на диск
local hit: Страницы временных данных в локальном кэше
local read: Страницы временных данных, прочитанные с диска
local dirtied: Измененные страницы временных данных
local written: Страницы временных данных, записанные на диск
temp read: Чтение из временных файлов на диске
temp write: Запись во временные файлы на диске

7. WAL

Требует ANALYZE. Показывает статистику по Write-Ahead Log (журналу предзаписи) для операций изменения данных (INSERT, UPDATE, DELETE).

sql

begin;
explain (analyze, wal) 
update payment set amount = amount * 1.1 where amount < 2;
rollback;  -- откатываем изменения

text

QUERY PLAN                                                                                                    |
--------------------------------------------------------------------------------------------------------------+
Update on payment  (cost=0.00..335.17 rows=0 width=0) (actual time=60.664..60.665 rows=0 loops=1)             |
  WAL: records=24912 fpi=247 bytes=3982127                                                                    |
  ->  Seq Scan on payment  (cost=0.00..335.17 rows=3566 width=18) (actual time=0.013..1.836 rows=3566 loops=1)|
        Filter: (amount < '2'::numeric)                                                                       |
        Rows Removed by Filter: 12301                                                                         |
Planning Time: 0.075 ms                                                                                       |
Execution Time: 62.824 ms                                                                                     |

Показывает:

WAL records: Количество записей WAL
WAL bytes: Объем данных, записанных в WAL
WAL fpi: Количество full page images (записей полных страниц)

8. TIMING

Показывает фактическое время выполнения для каждой операции. Включен по умолчанию при использовании ANALYZE. Можно отключить для снижения накладных расходов на замер времени.

sql

-- с замером времени (по умолчанию)
explain (analyze, timing on) 
select * 
from address;

text

QUERY PLAN                                                                                            |
------------------------------------------------------------------------------------------------------+
Seq Scan on address  (cost=0.00..12.03 rows=603 width=161) (actual time=0.011..0.036 rows=603 loops=1)|
Planning Time: 0.043 ms                                                                               |
Execution Time: 0.054 ms                                                                              |

sql

-- без замера времени (быстрее, особенно на быстрых запросах)
explain (analyze, timing off) 
select * 
from address;

text

QUERY PLAN                                                                          |
------------------------------------------------------------------------------------+
Seq Scan on address  (cost=0.00..12.03 rows=603 width=161) (actual rows=603 loops=1)|
Planning Time: 0.043 ms                                                             |
Execution Time: 0.038 ms                                                            |

Отключайте TIMING при анализе очень быстрых запросов или в средах с высокой нагрузкой, где точность замера времени может быть снижена.

9. SUMMARY

Добавляет в конец вывода сводную информацию. Включен по умолчанию при использовании ANALYZE.

sql

explain (analyze, summary on) 
select count(*) 
from payment;

text

QUERY PLAN                                                                                                     |
---------------------------------------------------------------------------------------------------------------+
Aggregate  (cost=342.34..342.35 rows=1 width=8) (actual time=1.096..1.097 rows=1 loops=1)                      |
  ->  Seq Scan on payment  (cost=0.00..302.67 rows=15867 width=0) (actual time=0.009..0.719 rows=15867 loops=1)|
Planning Time: 0.469 ms                                                                                        |
Execution Time: 1.109 ms                                                                                       |

sql

-- без сводки
explain (analyze, summary off) 
select count(*) 
from payment;

text

QUERY PLAN                                                                                                     |
---------------------------------------------------------------------------------------------------------------+
Aggregate  (cost=342.34..342.35 rows=1 width=8) (actual time=1.194..1.194 rows=1 loops=1)                      |
  ->  Seq Scan on payment  (cost=0.00..302.67 rows=15867 width=0) (actual time=0.012..0.757 rows=15867 loops=1)|

Сводка показывает:

Planning Time: Время, затраченное на построение плана
Execution Time: Время выполнения запроса
JIT compilation time: Если использовалась JIT-компиляция (PostgreSQL 11+)

10. MEMORY

Показывает использование памяти для операций, требующих значительных объемов (хеш-соединения, сортировки, агрегации).

sql

explain (analyze, memory)
select c.customer_id, count(r.rental_id)
from customer c
join rental r on c.customer_id = r.customer_id
group by c.customer_id
having count(r.rental_id) > 30;

text

QUERY PLAN                                                                                                               |
-------------------------------------------------------------------------------------------------------------------------+
HashAggregate  (cost=486.55..494.04 rows=200 width=12) (actual time=3.862..3.892 rows=134 loops=1)                       |
  Group Key: c.customer_id                                                                                               |
  Filter: (count(r.rental_id) > 30)                                                                                      |
  Batches: 1  Memory Usage: 105kB                                                                                        |
  Rows Removed by Filter: 465                                                                                            |
  ->  Hash Join  (cost=53.48..406.33 rows=16044 width=8) (actual time=0.135..2.543 rows=16044 loops=1)                   |
        Hash Cond: (r.customer_id = c.customer_id)                                                                       |
        ->  Seq Scan on rental r  (cost=0.00..310.44 rows=16044 width=6) (actual time=0.012..0.705 rows=16044 loops=1)   |
        ->  Hash  (cost=45.99..45.99 rows=599 width=4) (actual time=0.118..0.118 rows=599 loops=1)                       |
              Buckets: 1024  Batches: 1  Memory Usage: 30kB                                                              |
              ->  Seq Scan on customer c  (cost=0.00..45.99 rows=599 width=4) (actual time=0.006..0.081 rows=599 loops=1)|
Planning:                                                                                                                |
  Memory: used=42kB  allocated=64kB                                                                                      |
Planning Time: 0.218 ms                                                                                                  |
Execution Time: 3.921 ms                                                                                                 |

Показывает:

Memory Usage: Пиковое использование памяти для операции
Memory Allocated: Выделенная память
Disk Usage: Если операция использовала временные файлы на диске

11. FORMAT `{ TEXT | XML | JSON | YAML }`

Определяет формат вывода плана выполнения. По умолчанию: TEXT.

sql

-- текстовый формат (читаемый человеком)
explain (analyze, format text) 
select * 
from film;

-- json формат (удобен для программной обработки)
explain (analyze, format json) 
select * 
from film;

-- xml формат
explain (analyze, format xml) 
select * 
from film;

-- yaml формат
explain (analyze, format yaml) 
select * 
from film;

TEXT — для чтения человеком в консоли.
JSON/XML/YAML — для парсинга программами, визуализаторами (pgAdmin, DBeaver), системами мониторинга.

Практические комбинации параметров

Вот наиболее полезные комбинации параметров для разных сценариев:

Для быстрого анализа производительности:

sql

-- стандартный анализ (самый частый сценарий)
explain (analyze, buffers, timing on, summary on)
select * 
from payment 
where amount > 10;

-- в кратком формате
explain (analyze, buffers)
select * 
from payment 
where amount > 10;

text

QUERY PLAN                                                                                            |
------------------------------------------------------------------------------------------------------+
Seq Scan on payment  (cost=0.00..342.34 rows=114 width=27) (actual time=0.015..0.970 rows=114 loops=1)|
  Filter: (amount > '10'::numeric)                                                                    |
  Rows Removed by Filter: 15753                                                                       |
  Buffers: shared hit=144                                                                             |
Planning Time: 0.067 ms                                                                               |
Execution Time: 0.979 ms                                                                              |

Для глубокого анализа проблем с памятью и I/O:

sql

-- полный анализ 
explain (analyze, buffers, verbose, memory, settings)
select c.first_name, c.last_name, sum(p.amount)
from customer c
join payment p on c.customer_id = p.customer_id
group by c.customer_id
order by sum(p.amount) desc
limit 100;

text

QUERY PLAN                                                                                                                                   |
---------------------------------------------------------------------------------------------------------------------------------------------+
Limit  (cost=507.80..508.05 rows=100 width=49) (actual time=4.800..4.807 rows=100 loops=1)                                                   |
  Output: c.first_name, c.last_name, (sum(p.amount)), c.customer_id                                                                          |
  Buffers: shared hit=184                                                                                                                    |
  ->  Sort  (cost=507.80..509.30 rows=599 width=49) (actual time=4.799..4.801 rows=100 loops=1)                                              |
        Output: c.first_name, c.last_name, (sum(p.amount)), c.customer_id                                                                    |
        Sort Key: (sum(p.amount)) DESC                                                                                                       |
        Sort Method: top-N heapsort  Memory: 36kB                                                                                            |
        Buffers: shared hit=184                                                                                                              |
        ->  HashAggregate  (cost=477.42..484.91 rows=599 width=49) (actual time=4.552..4.621 rows=599 loops=1)                               |
              Output: c.first_name, c.last_name, sum(p.amount), c.customer_id                                                                |
              Group Key: c.customer_id                                                                                                       |
              Batches: 1  Memory Usage: 297kB                                                                                                |
              Buffers: shared hit=184                                                                                                        |
              ->  Hash Join  (cost=53.48..398.09 rows=15867 width=24) (actual time=0.150..2.908 rows=15867 loops=1)                          |
                    Output: c.customer_id, c.first_name, c.last_name, p.amount                                                               |
                    Inner Unique: true                                                                                                       |
                    Hash Cond: (p.customer_id = c.customer_id)                                                                               |
                    Buffers: shared hit=184                                                                                                  |
                    ->  Seq Scan on public.payment p  (cost=0.00..302.67 rows=15867 width=9) (actual time=0.009..0.863 rows=15867 loops=1)   |
                          Output: p.payment_id, p.customer_id, p.staff_id, p.rental_id, p.amount, p.payment_date                             |
                          Buffers: shared hit=144                                                                                            |
                    ->  Hash  (cost=45.99..45.99 rows=599 width=17) (actual time=0.137..0.137 rows=599 loops=1)                              |
                          Output: c.first_name, c.last_name, c.customer_id                                                                   |
                          Buckets: 1024  Batches: 1  Memory Usage: 39kB                                                                      |
                          Buffers: shared hit=40                                                                                             |
                          ->  Seq Scan on public.customer c  (cost=0.00..45.99 rows=599 width=17) (actual time=0.006..0.078 rows=599 loops=1)|
                                Output: c.first_name, c.last_name, c.customer_id                                                             |
                                Buffers: shared hit=40                                                                                       |
Settings: search_path = 'public, public, "$user"'                                                                                            |
Planning:                                                                                                                                    |
  Buffers: shared hit=12                                                                                                                     |
  Memory: used=49kB  allocated=64kB                                                                                                          |
Planning Time: 0.174 ms                                                                                                                      |
Execution Time: 4.878 ms                                                                                                                     |

Для отладки DML-операций (INSERT/UPDATE/DELETE):

sql

begin;
explain (analyze, buffers, wal, verbose)
update inventory 
set last_update = now() 
where film_id in (select film_id from film where length > 180);
rollback;  -- не забывайте откатывать в тестовой среде!

text

QUERY PLAN                                                                                                                 |
---------------------------------------------------------------------------------------------------------------------------+
Update on public.inventory  (cost=100.99..184.77 rows=0 width=0) (actual time=17.057..17.058 rows=0 loops=1)               |
  Buffers: shared hit=1682 read=56 dirtied=45 written=1                                                                    |
  WAL: records=778 fpi=43 bytes=377149                                                                                     |
  ->  Hash Join  (cost=100.99..184.77 rows=179 width=20) (actual time=0.236..1.250 rows=155 loops=1)                       |
        Output: now(), inventory.ctid, film.ctid                                                                           |
        Inner Unique: true                                                                                                 |
        Hash Cond: (inventory.film_id = film.film_id)                                                                      |
        Buffers: shared hit=89 read=24                                                                                     |
        ->  Seq Scan on public.inventory  (cost=0.00..70.81 rows=4581 width=8) (actual time=0.032..0.789 rows=4581 loops=1)|
              Output: inventory.ctid, inventory.film_id                                                                    |
              Buffers: shared hit=1 read=24                                                                                |
        ->  Hash  (cost=100.50..100.50 rows=39 width=10) (actual time=0.185..0.185 rows=39 loops=1)                        |
              Output: film.ctid, film.film_id                                                                              |
              Buckets: 1024  Batches: 1  Memory Usage: 10kB                                                                |
              Buffers: shared hit=88                                                                                       |
              ->  Seq Scan on public.film  (cost=0.00..100.50 rows=39 width=10) (actual time=0.009..0.181 rows=39 loops=1) |
                    Output: film.ctid, film.film_id                                                                        |
                    Filter: (film.length > 180)                                                                            |
                    Rows Removed by Filter: 961                                                                            |
                    Buffers: shared hit=88                                                                                 |
Planning:                                                                                                                  |
  Buffers: shared hit=53 read=5                                                                                            |
Planning Time: 2.995 ms                                                                                                    |
Trigger last_updated: time=0.551 calls=155                                                                                 |
Execution Time: 17.098 ms                                                                                                  |

Для экспорта плана в инструменты визуализации:

sql

-- экспорт в json для анализа в инструментах визуализации
explain (analyze, buffers, verbose, format json)
select f.title, count(r.rental_id) as rental_count
from film f
join inventory i on f.film_id = i.film_id
join rental r on i.inventory_id = r.inventory_id
group by f.film_id
order by rental_count desc
limit 20;

text

QUERY PLAN                                                                      
--------------------------------------------------------------------------------
[¶  {¶    "Plan": {¶      "Node Type": "Limit",¶      "Parallel Aware": false,¶ ...

Практический пример: Анализ сложного запроса

sql

-- первый вариант запроса до оптимизации:
explain (analyze, buffers, verbose)
select distinct c.customer_id, concat(c.last_name, ' ', c.first_name), 
	count(i.film_id) over (partition by c.customer_id), 
	sum(p.amount) over (partition by c.customer_id)
from customer c
join payment p on c.customer_id = p.customer_id
join rental r on r.rental_id = p.rental_id
join inventory i on r.inventory_id = i.inventory_id;

text

QUERY PLAN                                                                                                                                               |
---------------------------------------------------------------------------------------------------------------------------------------------------------+
HashAggregate  (cost=2748.31..2946.65 rows=15867 width=76) (actual time=17.780..17.844 rows=599 loops=1)                                                 |
  Output: c.customer_id, (concat(c.last_name, ' ', c.first_name)), (count(i.film_id) OVER (?)), (sum(p.amount) OVER (?))                                 |
  Group Key: c.customer_id, concat(c.last_name, ' ', c.first_name), count(i.film_id) OVER (?), sum(p.amount) OVER (?)                                    |
  Batches: 1  Memory Usage: 849kB                                                                                                                        |
  Buffers: shared hit=360                                                                                                                                |
  ->  WindowAgg  (cost=2233.13..2589.64 rows=15867 width=76) (actual time=10.377..15.233 rows=15867 loops=1)                                             |
        Output: c.customer_id, concat(c.last_name, ' ', c.first_name), count(i.film_id) OVER (?), sum(p.amount) OVER (?)                                 |
        Buffers: shared hit=360                                                                                                                          |
        ->  Sort  (cost=2232.63..2272.30 rows=15867 width=26) (actual time=10.358..10.757 rows=15867 loops=1)                                            |
              Output: c.customer_id, c.last_name, c.first_name, i.film_id, p.amount                                                                      |
              Sort Key: c.customer_id                                                                                                                    |
              Sort Method: quicksort  Memory: 1122kB                                                                                                     |
              Buffers: shared hit=360                                                                                                                    |
              ->  Hash Join  (cost=697.66..1125.61 rows=15867 width=26) (actual time=2.652..8.864 rows=15867 loops=1)                                    |
                    Output: c.customer_id, c.last_name, c.first_name, i.film_id, p.amount                                                                |
                    Inner Unique: true                                                                                                                   |
                    Hash Cond: (r.inventory_id = i.inventory_id)                                                                                         |
                    Buffers: shared hit=360                                                                                                              |
                    ->  Hash Join  (cost=564.47..950.74 rows=15867 width=28) (actual time=2.004..6.503 rows=15867 loops=1)                               |
                          Output: c.customer_id, c.last_name, c.first_name, p.amount, r.inventory_id                                                     |
                          Inner Unique: true                                                                                                             |
                          Hash Cond: (p.rental_id = r.rental_id)                                                                                         |
                          Buffers: shared hit=334                                                                                                        |
                          ->  Hash Join  (cost=53.48..398.09 rows=15867 width=28) (actual time=0.155..2.721 rows=15867 loops=1)                          |
                                Output: c.customer_id, c.last_name, c.first_name, p.amount, p.rental_id                                                  |
                                Inner Unique: true                                                                                                       |
                                Hash Cond: (p.customer_id = c.customer_id)                                                                               |
                                Buffers: shared hit=184                                                                                                  |
                                ->  Seq Scan on public.payment p  (cost=0.00..302.67 rows=15867 width=13) (actual time=0.005..0.680 rows=15867 loops=1)  |
                                      Output: p.payment_id, p.customer_id, p.staff_id, p.rental_id, p.amount, p.payment_date                             |
                                      Buffers: shared hit=144                                                                                            |
                                ->  Hash  (cost=45.99..45.99 rows=599 width=17) (actual time=0.145..0.146 rows=599 loops=1)                              |
                                      Output: c.customer_id, c.last_name, c.first_name                                                                   |
                                      Buckets: 1024  Batches: 1  Memory Usage: 38kB                                                                      |
                                      Buffers: shared hit=40                                                                                             |
                                      ->  Seq Scan on public.customer c  (cost=0.00..45.99 rows=599 width=17) (actual time=0.006..0.089 rows=599 loops=1)|
                                            Output: c.customer_id, c.last_name, c.first_name                                                             |
                                            Buffers: shared hit=40                                                                                       |
                          ->  Hash  (cost=310.44..310.44 rows=16044 width=8) (actual time=1.820..1.821 rows=16044 loops=1)                               |
                                Output: r.rental_id, r.inventory_id                                                                                      |
                                Buckets: 16384  Batches: 1  Memory Usage: 755kB                                                                          |
                                Buffers: shared hit=150                                                                                                  |
                                ->  Seq Scan on public.rental r  (cost=0.00..310.44 rows=16044 width=8) (actual time=0.005..0.927 rows=16044 loops=1)    |
                                      Output: r.rental_id, r.inventory_id                                                                                |
                                      Buffers: shared hit=150                                                                                            |
                    ->  Hash  (cost=73.64..73.64 rows=4764 width=6) (actual time=0.626..0.627 rows=4581 loops=1)                                         |
                          Output: i.film_id, i.inventory_id                                                                                              |
                          Buckets: 8192  Batches: 1  Memory Usage: 243kB                                                                                 |
                          Buffers: shared hit=26                                                                                                         |
                          ->  Seq Scan on public.inventory i  (cost=0.00..73.64 rows=4764 width=6) (actual time=0.013..0.311 rows=4581 loops=1)          |
                                Output: i.film_id, i.inventory_id                                                                                        |
                                Buffers: shared hit=26                                                                                                   |
Planning:                                                                                                                                                |
  Buffers: shared hit=32                                                                                                                                 |
Planning Time: 0.373 ms                                                                                                                                  |
Execution Time: 18.296 ms                                                                                                                                |

Какие выводы можем сделать:

Избыточное использование DISTINCT, чтобы "схлопнуть" результат из-за использования оконных функций:
HashAggregate (cost=2748.31..2946.65 rows=15867 width=76) (actual time=17.780..17.844 rows=599)
Обрабатывает 15867 строк, но возвращает только 599 (по одному на клиента).
Использует 849kB памяти для удаления дубликатов.
Лишняя сортировка для оконных функций:
Sort (actual time=10.358..10.757 rows=15867 loops=1)
Sort Method: quicksort Memory: 1122kB
Сортирует все 15867 строк только для работы оконных функций.
Занимает 1.1MB памяти.
WindowAgg — дополнительный проход по данным:
WindowAgg (actual time=10.377..15.233 rows=15867)
Еще один этап обработки всех 15867 строк.
Добавляет 4.9 мс к общему времени.
Положительным является то, что все данные в кэше: Buffers: shared hit=360.
Используется Hash Join вместо Nested Loop.

Вывод — нужно избавиться от оконных функций, DISTINCT.

sql

-- второй вариант с частичной оптимизацией
explain (analyze, buffers, verbose)
select c.customer_id, concat(c.last_name, ' ', c.first_name), count(i.film_id), sum(p.amount)
from customer c
join payment p on c.customer_id = p.customer_id
join rental r on r.rental_id = p.rental_id
join inventory i on r.inventory_id = i.inventory_id
group by c.customer_id;

text

QUERY PLAN                                                                                                                                   |
---------------------------------------------------------------------------------------------------------------------------------------------+
HashAggregate  (cost=1244.62..1253.60 rows=599 width=76) (actual time=10.671..10.791 rows=599 loops=1)                                       |
  Output: c.customer_id, concat(c.last_name, ' ', c.first_name), count(i.film_id), sum(p.amount)                                             |
  Group Key: c.customer_id                                                                                                                   |
  Batches: 1  Memory Usage: 297kB                                                                                                            |
  Buffers: shared hit=360                                                                                                                    |
  ->  Hash Join  (cost=697.66..1125.61 rows=15867 width=26) (actual time=2.686..8.917 rows=15867 loops=1)                                    |
        Output: c.customer_id, c.last_name, c.first_name, i.film_id, p.amount                                                                |
        Inner Unique: true                                                                                                                   |
        Hash Cond: (r.inventory_id = i.inventory_id)                                                                                         |
        Buffers: shared hit=360                                                                                                              |
        ->  Hash Join  (cost=564.47..950.74 rows=15867 width=28) (actual time=2.032..6.567 rows=15867 loops=1)                               |
              Output: c.customer_id, c.last_name, c.first_name, p.amount, r.inventory_id                                                     |
              Inner Unique: true                                                                                                             |
              Hash Cond: (p.rental_id = r.rental_id)                                                                                         |
              Buffers: shared hit=334                                                                                                        |
              ->  Hash Join  (cost=53.48..398.09 rows=15867 width=28) (actual time=0.151..2.723 rows=15867 loops=1)                          |
                    Output: c.customer_id, c.last_name, c.first_name, p.amount, p.rental_id                                                  |
                    Inner Unique: true                                                                                                       |
                    Hash Cond: (p.customer_id = c.customer_id)                                                                               |
                    Buffers: shared hit=184                                                                                                  |
                    ->  Seq Scan on public.payment p  (cost=0.00..302.67 rows=15867 width=13) (actual time=0.005..0.672 rows=15867 loops=1)  |
                          Output: p.payment_id, p.customer_id, p.staff_id, p.rental_id, p.amount, p.payment_date                             |
                          Buffers: shared hit=144                                                                                            |
                    ->  Hash  (cost=45.99..45.99 rows=599 width=17) (actual time=0.141..0.142 rows=599 loops=1)                              |
                          Output: c.customer_id, c.last_name, c.first_name                                                                   |
                          Buckets: 1024  Batches: 1  Memory Usage: 38kB                                                                      |
                          Buffers: shared hit=40                                                                                             |
                          ->  Seq Scan on public.customer c  (cost=0.00..45.99 rows=599 width=17) (actual time=0.005..0.084 rows=599 loops=1)|
                                Output: c.customer_id, c.last_name, c.first_name                                                             |
                                Buffers: shared hit=40                                                                                       |
              ->  Hash  (cost=310.44..310.44 rows=16044 width=8) (actual time=1.841..1.841 rows=16044 loops=1)                               |
                    Output: r.rental_id, r.inventory_id                                                                                      |
                    Buckets: 16384  Batches: 1  Memory Usage: 755kB                                                                          |
                    Buffers: shared hit=150                                                                                                  |
                    ->  Seq Scan on public.rental r  (cost=0.00..310.44 rows=16044 width=8) (actual time=0.005..0.955 rows=16044 loops=1)    |
                          Output: r.rental_id, r.inventory_id                                                                                |
                          Buffers: shared hit=150                                                                                            |
        ->  Hash  (cost=73.64..73.64 rows=4764 width=6) (actual time=0.633..0.633 rows=4581 loops=1)                                         |
              Output: i.film_id, i.inventory_id                                                                                              |
              Buckets: 8192  Batches: 1  Memory Usage: 243kB                                                                                 |
              Buffers: shared hit=26                                                                                                         |
              ->  Seq Scan on public.inventory i  (cost=0.00..73.64 rows=4764 width=6) (actual time=0.011..0.311 rows=4581 loops=1)          |
                    Output: i.film_id, i.inventory_id                                                                                        |
                    Buffers: shared hit=26                                                                                                   |
Planning:                                                                                                                                    |
  Buffers: shared hit=32                                                                                                                     |
Planning Time: 0.344 ms                                                                                                                      |
Execution Time: 10.946 ms                                                                                                                    |

Какие выводы можем сделать:

Убрали DISTINCT, таким образом HashAggregate стал эффективнее:
HashAggregate (actual time=10.671..10.791 rows=599)
Batches: 1 Memory Usage: 297kB
Время: 10.8 мс вместо 17.8 мс (ускорение на 40%)
Память: 297kB вместо 849kB (экономия 65%)
Исчезли ненужные операции:
Убрали Sort
Убрали WindowAgg
Общее время выполнения: 18.3 мс → 10.9 мс (ускорение на 40%)

Что можно улучшить? Сократить количество данных в агрегации.

sql

-- третий, оптимизированный запрос
explain (analyze, buffers, verbose)
select c.customer_id, concat(c.last_name, ' ', c.first_name), count, sum
from customer c
join (
	select customer_id, sum(amount) 
	from payment
	group by customer_id) p on c.customer_id = p.customer_id
join (
	select r.customer_id, count(i.film_id)
	from rental r 
	join inventory i on r.inventory_id = i.inventory_id
	group by r.customer_id) r on r.customer_id = c.customer_id;

text

QUERY PLAN                                                                                                                                                            |
----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Hash Join  (cost=988.44..1039.09 rows=599 width=76) (actual time=6.764..7.045 rows=599 loops=1)                                                                       |
  Output: c.customer_id, concat(c.last_name, ' ', c.first_name), r.count, p.sum                                                                                       |
  Inner Unique: true                                                                                                                                                  |
  Hash Cond: (c.customer_id = r.customer_id)                                                                                                                          |
  Buffers: shared hit=360                                                                                                                                             |
  ->  Hash Join  (cost=402.97..450.54 rows=599 width=51) (actual time=2.475..2.638 rows=599 loops=1)                                                                  |
        Output: c.customer_id, c.last_name, c.first_name, p.sum, p.customer_id                                                                                        |
        Inner Unique: true                                                                                                                                            |
        Hash Cond: (c.customer_id = p.customer_id)                                                                                                                    |
        Buffers: shared hit=184                                                                                                                                       |
        ->  Seq Scan on public.customer c  (cost=0.00..45.99 rows=599 width=17) (actual time=0.010..0.071 rows=599 loops=1)                                           |
              Output: c.customer_id, c.store_id, c.first_name, c.last_name, c.email, c.address_id, c.activebool, c.create_date, c.last_update, c.active, c.preferences|
              Buffers: shared hit=40                                                                                                                                  |
        ->  Hash  (cost=395.48..395.48 rows=599 width=34) (actual time=2.461..2.462 rows=599 loops=1)                                                                 |
              Output: p.sum, p.customer_id                                                                                                                            |
              Buckets: 1024  Batches: 1  Memory Usage: 33kB                                                                                                           |
              Buffers: shared hit=144                                                                                                                                 |
              ->  Subquery Scan on p  (cost=382.01..395.48 rows=599 width=34) (actual time=2.335..2.422 rows=599 loops=1)                                             |
                    Output: p.sum, p.customer_id                                                                                                                      |
                    Buffers: shared hit=144                                                                                                                           |
                    ->  HashAggregate  (cost=382.01..389.49 rows=599 width=34) (actual time=2.334..2.394 rows=599 loops=1)                                            |
                          Output: payment.customer_id, sum(payment.amount)                                                                                            |
                          Group Key: payment.customer_id                                                                                                              |
                          Batches: 1  Memory Usage: 297kB                                                                                                             |
                          Buffers: shared hit=144                                                                                                                     |
                          ->  Seq Scan on public.payment  (cost=0.00..302.67 rows=15867 width=9) (actual time=0.005..0.680 rows=15867 loops=1)                        |
                                Output: payment.payment_id, payment.customer_id, payment.staff_id, payment.rental_id, payment.amount, payment.payment_date            |
                                Buffers: shared hit=144                                                                                                               |
  ->  Hash  (cost=577.98..577.98 rows=599 width=10) (actual time=4.282..4.283 rows=599 loops=1)                                                                       |
        Output: r.count, r.customer_id                                                                                                                                |
        Buckets: 1024  Batches: 1  Memory Usage: 33kB                                                                                                                 |
        Buffers: shared hit=176                                                                                                                                       |
        ->  Subquery Scan on r  (cost=566.00..577.98 rows=599 width=10) (actual time=4.180..4.244 rows=599 loops=1)                                                   |
              Output: r.count, r.customer_id                                                                                                                          |
              Buffers: shared hit=176                                                                                                                                 |
              ->  HashAggregate  (cost=566.00..571.99 rows=599 width=10) (actual time=4.180..4.217 rows=599 loops=1)                                                  |
                    Output: r_1.customer_id, count(i.film_id)                                                                                                         |
                    Group Key: r_1.customer_id                                                                                                                        |
                    Batches: 1  Memory Usage: 105kB                                                                                                                   |
                    Buffers: shared hit=176                                                                                                                           |
                    ->  Hash Join  (cost=133.19..485.78 rows=16044 width=4) (actual time=0.535..2.916 rows=16044 loops=1)                                             |
                          Output: r_1.customer_id, i.film_id                                                                                                          |
                          Inner Unique: true                                                                                                                          |
                          Hash Cond: (r_1.inventory_id = i.inventory_id)                                                                                              |
                          Buffers: shared hit=176                                                                                                                     |
                          ->  Seq Scan on public.rental r_1  (cost=0.00..310.44 rows=16044 width=6) (actual time=0.010..0.633 rows=16044 loops=1)                     |
                                Output: r_1.rental_id, r_1.rental_date, r_1.inventory_id, r_1.customer_id, r_1.return_date, r_1.staff_id, r_1.last_update             |
                                Buffers: shared hit=150                                                                                                               |
                          ->  Hash  (cost=73.64..73.64 rows=4764 width=6) (actual time=0.502..0.503 rows=4581 loops=1)                                                |
                                Output: i.film_id, i.inventory_id                                                                                                     |
                                Buckets: 8192  Batches: 1  Memory Usage: 243kB                                                                                        |
                                Buffers: shared hit=26                                                                                                                |
                                ->  Seq Scan on public.inventory i  (cost=0.00..73.64 rows=4764 width=6) (actual time=0.007..0.255 rows=4581 loops=1)                 |
                                      Output: i.film_id, i.inventory_id                                                                                               |
                                      Buffers: shared hit=26                                                                                                          |
Planning:                                                                                                                                                             |
  Buffers: shared hit=14                                                                                                                                              |
Planning Time: 0.284 ms                                                                                                                                               |
Execution Time: 7.160 ms                                                                                                                                              |

Какие выводы можем сделать:

Параллелизация агрегаций:
Subquery Scan on p (actual time=2.335..2.422 rows=599 loops=1)
Subquery Scan on r (actual time=4.180..4.244 rows=599 loops=1)
Две агрегации выполняются независимо.
Меньше потребляемой памяти в финальной агрегации:
Финальный HashAggregate отсутствует.
Вместо него — два небольших HashAggregate (297kB и 105kB).
Общее использование памяти: ~435kB vs ~542kB во втором варианте.
Эффективное использование Hash Join:
Hash Join (actual time=6.764..7.045 rows=599 loops=1)
Соединяются уже агрегированные данные (по 599 строк с каждой стороны).

Таким образом время сократилось с 18.3 мс → 10.9 мс → 7.16 мс.
Затрачиваемая память сократилась с ~2MB → ~1.1MB → ~0.6MB.

При масштабировании данных данный подход всегда будет эффективнее.

Заключение и рекомендации

Основные выводы:

Всегда начинайте с EXPLAIN (ANALYZE, BUFFERS) — это дает 90% нужной информации
Используйте VERBOSE при анализе сложных запросов с множеством столбцов
SETTINGS помогает понять разницу в выполнении на разных серверах
Для DML-операций добавляйте WAL и всегда оборачивайте в транзакцию с ROLLBACK

Еще раз вспомним частые комбинации:

sql

-- для повседневного анализа
explain (analyze, buffers) ваш_запрос;

-- для глубокой отладки  
explain (analyze, buffers, verbose, settings) ваш_запрос;

-- для dml операций
begin;
explain (analyze, buffers, wal) ваш_dml_запрос;
rollback;

-- для экспорта в инструменты
explain (analyze, buffers, format json) ваш_запрос;

Помните: EXPLAIN — это не магия, а инструмент. Понимание его вывода приходит с практикой. Начните с простых запросов, постепенно усложняя их, и сравнивайте планы выполнения.

EXPLAIN ​

Что такое EXPLAIN и зачем он нужен? ​

Синтаксис EXPLAIN ​

Параметры EXPLAIN ​

1. ANALYZE ​

2. VERBOSE ​

3. COSTS ​

4. SETTINGS ​

5. GENERIC_PLAN ​

6. BUFFERS ​

7. WAL ​

8. TIMING ​

9. SUMMARY ​

10. MEMORY ​

11. FORMAT { TEXT | XML | JSON | YAML } ​

Практические комбинации параметров ​

Для быстрого анализа производительности: ​

Для глубокого анализа проблем с памятью и I/O: ​

Для отладки DML-операций (INSERT/UPDATE/DELETE): ​

Для экспорта плана в инструменты визуализации: ​

Практический пример: Анализ сложного запроса ​

Заключение и рекомендации ​

EXPLAIN

Что такое EXPLAIN и зачем он нужен?

Синтаксис EXPLAIN

Параметры EXPLAIN

1. ANALYZE

2. VERBOSE

3. COSTS

4. SETTINGS

5. GENERIC_PLAN

6. BUFFERS

7. WAL

8. TIMING

9. SUMMARY

10. MEMORY

11. FORMAT `{ TEXT | XML | JSON | YAML }`

Практические комбинации параметров

Для быстрого анализа производительности:

Для глубокого анализа проблем с памятью и I/O:

Для отладки DML-операций (INSERT/UPDATE/DELETE):

Для экспорта плана в инструменты визуализации:

Практический пример: Анализ сложного запроса

Заключение и рекомендации