Обсуждение: BUG #17544: Join order for INNER JOIN ... USING with GROUP BY on join column affects selected query plan
BUG #17544: Join order for INNER JOIN ... USING with GROUP BY on join column affects selected query plan
От
PG Bug reporting form
Дата:
The following bug has been logged on the website: Bug reference: 17544 Logged by: Eric Sheng Email address: _@ericsheng.com PostgreSQL version: 14.4 Operating system: Ubuntu 20.04 Description: It seems like the join order of a simple INNER JOIN ... USING(...) affects query plan selection even when only joining two tables, when used with GROUP BY on the join column. Schema: groups with priority, containing zero or more tasks with a boolean CREATE TABLE groups(group_id INT PRIMARY KEY NOT NULL, priority INT); CREATE TABLE tasks(task_id INT PRIMARY KEY NOT NULL, group_id INT NOT NULL REFERENCES groups(group_id), finished BOOLEAN NOT NULL); CREATE INDEX groups_priority_index ON groups(priority, group_id); CREATE INDEX tasks_group_id_index ON tasks(group_id); Data: 1000 groups (random priority) with 1000 tasks per group DO $$ BEGIN FOR i IN 0 .. 999 LOOP INSERT INTO groups(group_id, priority) VALUES(i, FLOOR(RANDOM() * 2000000000)::INT); FOR j in 0 .. 999 LOOP INSERT INTO tasks(task_id, group_id, finished) VALUES (1000 * i + j, i, RANDOM() * 100 < 99); END LOOP; END LOOP; END; $$; Queries: fetching the lowest priority groups and whether or not all tasks in them have finished (same query twice, just different join order) EXPLAIN (ANALYZE, TIMING) SELECT group_id, BOOL_AND(finished) FROM groups INNER JOIN tasks USING(group_id) GROUP BY group_id, priority ORDER BY priority ASC LIMIT 10; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=15771.99..15772.02 rows=10 width=9) (actual time=293.054..302.919 rows=10 loops=1) -> Sort (cost=15771.99..15777.64 rows=2260 width=9) (actual time=293.051..302.914 rows=10 loops=1) Sort Key: groups.priority Sort Method: top-N heapsort Memory: 25kB -> Finalize HashAggregate (cost=15700.56..15723.16 rows=2260 width=9) (actual time=292.744..302.777 rows=1000 loops=1) Group Key: groups.group_id Batches: 1 Memory Usage: 241kB -> Gather (cost=15203.36..15677.96 rows=4520 width=9) (actual time=291.308..301.725 rows=2836 loops=1) Workers Planned: 2 Workers Launched: 2 -> Partial HashAggregate (cost=14203.36..14225.96 rows=2260 width=9) (actual time=263.344..263.560 rows=945 loops=3) Group Key: groups.group_id Batches: 1 Memory Usage: 241kB Worker 0: Batches: 1 Memory Usage: 241kB Worker 1: Batches: 1 Memory Usage: 241kB -> Hash Join (cost=60.85..11725.61 rows=495550 width=9) (actual time=0.719..182.977 rows=333333 loops=3) Hash Cond: (tasks.group_id = groups.group_id) -> Parallel Seq Scan on tasks (cost=0.00..10361.50 rows=495550 width=5) (actual time=0.044..61.366 rows=333333 loops=3) -> Hash (cost=32.60..32.60 rows=2260 width=8) (actual time=0.627..0.627 rows=1000 loops=3) Buckets: 4096 Batches: 1 Memory Usage: 72kB -> Seq Scan on groups (cost=0.00..32.60 rows=2260 width=8) (actual time=0.042..0.298 rows=1000 loops=3) Planning Time: 0.310 ms Execution Time: 303.294 ms EXPLAIN (ANALYZE, TIMING) SELECT group_id, BOOL_AND(finished) FROM tasks INNER JOIN groups USING(group_id) GROUP BY group_id, priority ORDER BY priority ASC LIMIT 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.70..1.35 rows=10 width=9) (actual time=1.146..10.606 rows=10 loops=1) -> GroupAggregate (cost=0.70..64932.27 rows=1000000 width=9) (actual time=1.144..10.599 rows=10 loops=1) Group Key: groups.priority, tasks.group_id -> Nested Loop (cost=0.70..47432.27 rows=1000000 width=9) (actual time=0.055..8.028 rows=10001 loops=1) -> Index Only Scan using groups_priority_index on groups (cost=0.28..43.27 rows=1000 width=8) (actual time=0.028..0.036 rows=11 loops=1) Heap Fetches: 0 -> Index Scan using tasks_group_id_index on tasks (cost=0.42..37.39 rows=1000 width=5) (actual time=0.017..0.433 rows=909 loops=11) Index Cond: (group_id = groups.group_id) Planning Time: 0.878 ms Execution Time: 10.672 ms All join orders for inner joins give semantically equivalent results, and documentation at https://www.postgresql.org/docs/current/explicit-joins.html indicates that the planner should explore all join orders unless there are too many tables, so I would have expected the join order here to be immaterial to the query plan chosen. Digging a bit deeper, this appears to be due to the USING clause causing the GROUP BY group_id to be rewritten to `groups.group_id` in the first query and `tasks.group_id` in the second query, and the former resulting in the simpler plan not being used. The same difference can be observed running: EXPLAIN (ANALYZE, TIMING) SELECT groups.group_id, BOOL_AND(finished) FROM tasks INNER JOIN groups ON tasks.group_id = groups.group_id GROUP BY groups.group_id, priority ORDER BY priority ASC LIMIT 10; (the slow plan) EXPLAIN (ANALYZE, TIMING) SELECT tasks.group_id, BOOL_AND(finished) FROM tasks INNER JOIN groups ON tasks.group_id = groups.group_id GROUP BY tasks.group_id, priority ORDER BY priority ASC LIMIT 10; (the fast plan)
On 7/10/22 09:52, PG Bug reporting form wrote: > All join orders for inner joins give semantically equivalent results, and > documentation at https://www.postgresql.org/docs/current/explicit-joins.html > indicates that the planner should explore all join orders unless there are > too many tables, so I would have expected the join order here to be > immaterial to the query plan chosen. > > Digging a bit deeper, this appears to be due to the USING clause causing the > GROUP BY group_id to be rewritten to `groups.group_id` in the first query > and `tasks.group_id` in the second query, and the former resulting in the > simpler plan not being used. The same difference can be observed running: > > EXPLAIN (ANALYZE, TIMING) SELECT groups.group_id, BOOL_AND(finished) > FROM tasks INNER JOIN groups ON tasks.group_id = groups.group_id GROUP BY > groups.group_id, priority ORDER BY priority ASC LIMIT 10; > (the slow plan) > > EXPLAIN (ANALYZE, TIMING) SELECT tasks.group_id, BOOL_AND(finished) FROM > tasks INNER JOIN groups ON tasks.group_id = groups.group_id GROUP BY > tasks.group_id, priority ORDER BY priority ASC LIMIT 10; > (the fast plan) Documentation speaks the truth - optimizer checks all join permutations. But USING clause doesn't pull both group_id vars and implicitly chooses only one according to an algorithm, described in parse_clause.c::buildMergedJoinVar(). So, because you use group_id in upper GROUP BY, optimizer is limited to specific set of strategies, because it must use the same grouping variable, as implicitly chosen in the JOIN USING clause. -- Regards Andrey Lepikhov Postgres Professional
PG Bug reporting form <noreply@postgresql.org> writes: > It seems like the join order of a simple INNER JOIN ... USING(...) affects > query plan selection even when only joining two tables, when used with GROUP > BY on the join column. > ... > Digging a bit deeper, this appears to be due to the USING clause causing the > GROUP BY group_id to be rewritten to `groups.group_id` in the first query > and `tasks.group_id` in the second query, and the former resulting in the > simpler plan not being used. Yeah. In principle that shouldn't matter, but a confluence of peculiarities of this specific scenario and a perhaps-overly-aggressive optimization on the GROUP BY clause prevent the planner from finding the good plan when the JOIN USING variable is resolved as coming from the same table as the other GROUP BY column. Technical details at [1]. I'm not sure whether we'll end up using the quick-hack patch shown there, but if you're desperate for a fix you could apply that locally. regards, tom lane [1] https://www.postgresql.org/message-id/1657885.1657647073%40sss.pgh.pa.us