Can JoinFilter condition be pushed down into IndexScan?

Поиск

Список

Период

Сортировка

От	Bəxtiyar Neyman
Тема	Can JoinFilter condition be pushed down into IndexScan?
Дата	21 июня 2023 г. 06:37:00
Msg-id	CAObnsGr0UX2zWrBeY6hsV2eP2c6drUkTL4uAUzj8+mHwVu0w3A@mail.gmail.com обсуждение исходный текст
Ответы	Re: Can JoinFilter condition be pushed down into IndexScan? (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Список	pgsql-hackers

Дерево обсуждения

I define a table user_ranks as such:

CREATE TABLE user_ranks (
id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
rank INTEGER NOT NULL,
CONSTRAINT "by (rank, id)" UNIQUE (rank, id)
);

INSERT INTO user_ranks (user_id, rank) SELECT generate_series(1, 10000), generate_series(1, 10000);

Here's a query I'd like to optimize:

explain (analyze,verbose)

SELECT
t3_0."id" AS "id",
t3_0."rank" AS "rank"
FROM
LATERAL (
SELECT
t4_0."rank" AS "rank"
FROM
user_ranks AS t4_0
WHERE
(t4_0."id" = 4732455)
) AS t3_1
INNER JOIN user_ranks AS t3_0 ON true
WHERE
(
((t3_0."rank", t3_0."id") <= (t3_1."rank", 4732455))
AND true
)
ORDER BY
t3_0."rank" DESC,
t3_0."id" DESC
LIMIT
10

It compiles to the following plan:

Limit (cost=0.56..250.94 rows=10 width=12) (actual time=8.078..8.078 rows=1 loops=1)
Output: t3_0.id, t3_0.rank
-> Nested Loop (cost=0.56..41763.27 rows=1668 width=12) (actual time=8.075..8.076 rows=1 loops=1)
Output: t3_0.id, t3_0.rank
Inner Unique: true
Join Filter: (ROW(t3_0.rank, t3_0.id) <= ROW(t4_0.rank, 4732455))
Rows Removed by Join Filter: 5002
-> Index Only Scan Backward using "by (rank,id)" on public.user_ranks t3_0 (cost=0.28..163.33 rows=5003 width=12) (actual time=0.023..0.638 rows=5003 loops=1)
Output: t3_0.rank, t3_0.id
Heap Fetches: 0
-> Index Scan using "by id" on public.user_ranks t4_0 (cost=0.28..8.30 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=5003)
Output: t4_0.id, t4_0.rating, t4_0.rank
Index Cond: (t4_0.id = 4732455)

As you can see, there are a lot of rows returned by t3_0, which are then filtered by Join Filter. But it would have been better if instead of the filter, the t3_0 table would have an Index Cond. Similar to how it happens when a correlated subquery is used (or a CTE)

explain (analyze,verbose)
SELECT
t3_0."id" AS "id",
t3_0."rank" AS "rank"
FROM
user_ranks AS t3_0
WHERE
(
((t3_0."rank", t3_0."id") <= (
SELECT
t4_0."rank" AS "rank",
t4_0."id" AS "id"
FROM
user_ranks AS t4_0
WHERE
(t4_0."id" = 4732455)
))
AND true
)
ORDER BY
t3_0."rank" DESC,
t3_0."id" DESC
LIMIT
10

Limit (cost=8.58..8.95 rows=10 width=12) (actual time=0.062..0.064 rows=1 loops=1)
Output: t3_0.id, t3_0.rank
InitPlan 1 (returns $0,$1)
-> Index Scan using "by id" on public.user_ranks t4_0 (cost=0.28..8.30 rows=1 width=12) (actual time=0.024..0.025 rows=1 loops=1)
Output: t4_0.rank, t4_0.id
Index Cond: (t4_0.id = 4732455)
-> Index Only Scan Backward using "by (rank,id)" on public.user_ranks t3_0 (cost=0.28..61.47 rows=1668 width=12) (actual time=0.061..0.062 rows=1 loops=1)
Output: t3_0.id, t3_0.rank
Index Cond: (ROW(t3_0.rank, t3_0.id) <= ROW($0, $1))
Heap Fetches: 0

I'm an opposite of a PostgreSQL expert, but it was surprising to me to see that a correlated subquery behaves better than a join. Is this normal? Is it something worth fixing/easy to fix?

Sincerely,

Bakhtiyar

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Amit Kapila
Дата: 21 июня 2023 г., 06:24:48
Сообщение: Re: could not extend file "base/5/3501" with FileFallocate(): Interrupted system call

Следующее

От: Tom Lane
Дата: 21 июня 2023 г., 06:39:31
Сообщение: Re: [PATCH] hstore: Fix parsing on Mac OS X: isspace() is locale specific

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Can JoinFilter condition be pushed down into IndexScan?

Предыдущее

Следующее