Re: BRIN index which is much faster never chosen by planner

Поиск
Список
Период
Сортировка
От Jeremy Finzel
Тема Re: BRIN index which is much faster never chosen by planner
Дата
Msg-id CAMa1XUhdoKchMmbeSef7HhRM44-_00eMt8LPeBuwPA4UrSNc-Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BRIN index which is much faster never chosen by planner  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: BRIN index which is much faster never chosen by planner  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Список pgsql-hackers
On Thu, Oct 10, 2019 at 7:22 PM David Rowley <david.rowley@2ndquadrant.com> wrote:
The planner might be able to get a better estimate on the number of
matching rows if the now() - interval '10 days' expression was
replaced with 'now'::timestamptz - interval '10 days'. However, care
would need to be taken to ensure the plan is never prepared since
'now' is evaluated during parse. The same care must be taken when
creating views, functions, stored procedures and the like.

You are on to something here I think with the now() function, even if above suggestion is not exactly right as you said further down.  I am finding a hard-coded timestamp gives the right query plan.  I also tested same with even bigger window (last 16 days) and it yet still chooses the brin index.

foo_prod=# EXPLAIN
foo_prod-# SELECT
foo_prod-#  category, source, MIN(rec_insert_time) OVER (partition by source order by rec_insert_time) AS first_source_time, MAX(rec_insert_time) OVER (partition by source order by rec_insert_time) AS last_source_time
foo_prod-# FROM (SELECT DISTINCT ON (brand_id, last_change, log_id)
foo_prod(# category, source(field1) AS source, rec_insert_time
foo_prod(# FROM log_table l
foo_prod(# INNER JOIN public.small_join_table filter ON filter.category = l.category
foo_prod(# WHERE field1 IS NOT NULL AND l.category = 'music'
foo_prod(# AND l.rec_insert_time >= now() - interval '10 days'
foo_prod(# ORDER BY brand_id, last_change, log_id, rec_insert_time DESC) unique_cases;
                                                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=24436329.10..24436343.56 rows=643 width=120)
   ->  Sort  (cost=24436329.10..24436330.70 rows=643 width=104)
         Sort Key: unique_cases.source, unique_cases.rec_insert_time
         ->  Subquery Scan on unique_cases  (cost=24436286.24..24436299.10 rows=643 width=104)
               ->  Unique  (cost=24436286.24..24436292.67 rows=643 width=124)
                     ->  Sort  (cost=24436286.24..24436287.85 rows=643 width=124)
                           Sort Key: l.brand_id, l.last_change, l.log_id, l.rec_insert_time DESC
                           ->  Nested Loop  (cost=0.00..24436256.25 rows=643 width=124)
                                 Join Filter: ((l.category)::text = filter.category)
                                 ->  Seq Scan on small_join_table filter  (cost=0.00..26.99 rows=1399 width=8)
                                 ->  Materialize  (cost=0.00..24420487.02 rows=643 width=99)
                                       ->  Seq Scan on log_table l  (cost=0.00..24420483.80 rows=643 width=99)
                                             Filter: ((field1 IS NOT NULL) AND (category = 'music'::name) AND (rec_insert_time >= (now() - '10 days'::interval)))
(13 rows)

foo_prod=# SELECT now() - interval '10 days';
           ?column?
-------------------------------
 2019-10-01 08:20:38.115471-05
(1 row)

foo_prod=# EXPLAIN
SELECT
 category, source, MIN(rec_insert_time) OVER (partition by source order by rec_insert_time) AS first_source_time, MAX(rec_insert_time) OVER (partition by source order by rec_insert_time) AS last_source_time
FROM (SELECT DISTINCT ON (brand_id, last_change, log_id)
category, source(field1) AS source, rec_insert_time
FROM log_table l
INNER JOIN public.small_join_table filter ON filter.category = l.category
WHERE field1 IS NOT NULL AND l.category = 'music'
AND l.rec_insert_time >= '2019-10-01 08:20:38.115471-05'
ORDER BY brand_id, last_change, log_id, rec_insert_time DESC) unique_cases;
                                                                              QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=19664576.17..19664590.63 rows=643 width=120)
   ->  Sort  (cost=19664576.17..19664577.77 rows=643 width=104)
         Sort Key: unique_cases.source, unique_cases.rec_insert_time
         ->  Subquery Scan on unique_cases  (cost=19664533.31..19664546.17 rows=643 width=104)
               ->  Unique  (cost=19664533.31..19664539.74 rows=643 width=124)
                     ->  Sort  (cost=19664533.31..19664534.92 rows=643 width=124)
                           Sort Key: l.brand_id, l.last_change, l.log_id, l.rec_insert_time DESC
                           ->  Nested Loop  (cost=3181.19..19664503.32 rows=643 width=124)
                                 ->  Gather  (cost=3180.91..19662574.92 rows=643 width=99)
                                       Workers Planned: 3
                                       ->  Parallel Bitmap Heap Scan on log_table l  (cost=2180.91..19661510.62 rows=207 width=99)
                                             Recheck Cond: (rec_insert_time >= '2019-10-01 08:20:38.115471-05'::timestamp with time zone)
                                             Filter: ((field1 IS NOT NULL) AND (category = 'music'::name))
                                             ->  Bitmap Index Scan on rec_insert_time_brin_1000  (cost=0.00..2180.75 rows=142602171 width=0)
                                                   Index Cond: (rec_insert_time >= '2019-10-01 08:20:38.115471-05'::timestamp with time zone)


Let me know if this rings any bells!  I will respond to other comments with other replies.

Thanks,
Jeremy
 

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Konstantin Knizhnik
Дата:
Сообщение: Re: [Proposal] Global temporary tables
Следующее
От: Jeremy Finzel
Дата:
Сообщение: Re: BRIN index which is much faster never chosen by planner