Re: BRIN index which is much faster never chosen by planner

Поиск

Список

Период

Сортировка

От	Jeremy Finzel
Тема	Re: BRIN index which is much faster never chosen by planner
Дата	11 октября 2019 г. 17:08:05
Msg-id	CAMa1XUhdoKchMmbeSef7HhRM44-_00eMt8LPeBuwPA4UrSNc-Q@mail.gmail.com обсуждение исходный текст
Ответ на	Re: BRIN index which is much faster never chosen by planner (David Rowley <david.rowley@2ndquadrant.com>)
Ответы	Re: BRIN index which is much faster never chosen by planner (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список	pgsql-hackers

Дерево обсуждения

On Thu, Oct 10, 2019 at 7:22 PM David Rowley <david.rowley@2ndquadrant.com> wrote:

The planner might be able to get a better estimate on the number of
matching rows if the now() - interval '10 days' expression was
replaced with 'now'::timestamptz - interval '10 days'. However, care
would need to be taken to ensure the plan is never prepared since
'now' is evaluated during parse. The same care must be taken when
creating views, functions, stored procedures and the like.

You are on to something here I think with the now() function, even if above suggestion is not exactly right as you said further down. I am finding a hard-coded timestamp gives the right query plan. I also tested same with even bigger window (last 16 days) and it yet still chooses the brin index.

foo_prod=# EXPLAIN
foo_prod-# SELECT
foo_prod-# category, source, MIN(rec_insert_time) OVER (partition by source order by rec_insert_time) AS first_source_time, MAX(rec_insert_time) OVER (partition by source order by rec_insert_time) AS last_source_time
foo_prod-# FROM (SELECT DISTINCT ON (brand_id, last_change, log_id)
foo_prod(# category, source(field1) AS source, rec_insert_time
foo_prod(# FROM log_table l
foo_prod(# INNER JOIN public.small_join_table filter ON filter.category = l.category
foo_prod(# WHERE field1 IS NOT NULL AND l.category = 'music'
foo_prod(# AND l.rec_insert_time >= now() - interval '10 days'
foo_prod(# ORDER BY brand_id, last_change, log_id, rec_insert_time DESC) unique_cases;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=24436329.10..24436343.56 rows=643 width=120)
-> Sort (cost=24436329.10..24436330.70 rows=643 width=104)
Sort Key: unique_cases.source, unique_cases.rec_insert_time
-> Subquery Scan on unique_cases (cost=24436286.24..24436299.10 rows=643 width=104)
-> Unique (cost=24436286.24..24436292.67 rows=643 width=124)
-> Sort (cost=24436286.24..24436287.85 rows=643 width=124)
Sort Key: l.brand_id, l.last_change, l.log_id, l.rec_insert_time DESC
-> Nested Loop (cost=0.00..24436256.25 rows=643 width=124)
Join Filter: ((l.category)::text = filter.category)
-> Seq Scan on small_join_table filter (cost=0.00..26.99 rows=1399 width=8)
-> Materialize (cost=0.00..24420487.02 rows=643 width=99)
-> Seq Scan on log_table l (cost=0.00..24420483.80 rows=643 width=99)
Filter: ((field1 IS NOT NULL) AND (category = 'music'::name) AND (rec_insert_time >= (now() - '10 days'::interval)))
(13 rows)

foo_prod=# SELECT now() - interval '10 days';
?column?
-------------------------------
2019-10-01 08:20:38.115471-05
(1 row)

foo_prod=# EXPLAIN
SELECT
category, source, MIN(rec_insert_time) OVER (partition by source order by rec_insert_time) AS first_source_time, MAX(rec_insert_time) OVER (partition by source order by rec_insert_time) AS last_source_time
FROM (SELECT DISTINCT ON (brand_id, last_change, log_id)
category, source(field1) AS source, rec_insert_time
FROM log_table l
INNER JOIN public.small_join_table filter ON filter.category = l.category
WHERE field1 IS NOT NULL AND l.category = 'music'
AND l.rec_insert_time >= '2019-10-01 08:20:38.115471-05'
ORDER BY brand_id, last_change, log_id, rec_insert_time DESC) unique_cases;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
WindowAgg (cost=19664576.17..19664590.63 rows=643 width=120)
-> Sort (cost=19664576.17..19664577.77 rows=643 width=104)
Sort Key: unique_cases.source, unique_cases.rec_insert_time
-> Subquery Scan on unique_cases (cost=19664533.31..19664546.17 rows=643 width=104)
-> Unique (cost=19664533.31..19664539.74 rows=643 width=124)
-> Sort (cost=19664533.31..19664534.92 rows=643 width=124)
Sort Key: l.brand_id, l.last_change, l.log_id, l.rec_insert_time DESC
-> Nested Loop (cost=3181.19..19664503.32 rows=643 width=124)
-> Gather (cost=3180.91..19662574.92 rows=643 width=99)
Workers Planned: 3
-> Parallel Bitmap Heap Scan on log_table l (cost=2180.91..19661510.62 rows=207 width=99)
Recheck Cond: (rec_insert_time >= '2019-10-01 08:20:38.115471-05'::timestamp with time zone)
Filter: ((field1 IS NOT NULL) AND (category = 'music'::name))
-> Bitmap Index Scan on rec_insert_time_brin_1000 (cost=0.00..2180.75 rows=142602171 width=0)
Index Cond: (rec_insert_time >= '2019-10-01 08:20:38.115471-05'::timestamp with time zone)

Let me know if this rings any bells! I will respond to other comments with other replies.

Thanks,

Jeremy

В списке pgsql-hackers по дате отправления:

Предыдущее

От: Konstantin Knizhnik
Дата: 11 октября 2019 г., 16:50:07
Сообщение: Re: [Proposal] Global temporary tables

Следующее

От: Jeremy Finzel
Дата: 11 октября 2019 г., 17:19:33
Сообщение: Re: BRIN index which is much faster never chosen by planner

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BRIN index which is much faster never chosen by planner

Предыдущее

Следующее