Re: POC, WIP: OR-clause support for indexes

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: POC, WIP: OR-clause support for indexes
Дата
Msg-id CAH2-WzmD5u5kCZG0qMtVySz8VB1_drOiX=j0buDufK-EJc3YkQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC, WIP: OR-clause support for indexes  (Alena Rybakina <lena.ribackina@yandex.ru>)
Ответы Re: POC, WIP: OR-clause support for indexes  (Alena Rybakina <lena.ribackina@yandex.ru>)
Список pgsql-hackers
On Sun, Jun 25, 2023 at 6:48 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
> I finished writing the code patch for transformation "Or" expressions to "Any" expressions.

This seems interesting to me. I'm currently working on improving
nbtree's "native execution of ScalarArrayOpExpr quals" (see commit
9e8da0f7 for background information). That is relevant to what you're
trying to do here.

Right now nbtree's handling of ScalarArrayOpExpr is rather
inefficient. The executor does pass the index scan an array of
constants, so the whole structure already allows the nbtree code to
execute the ScalarArrayOpExpr in whatever way would be most efficient.
There is only one problem: it doesn't really try to do so. It more or
less just breaks down the large ScalarArrayOpExpr into "mini" queries
-- one per constant. Internally, query execution isn't significantly
different to executing many of these "mini" queries independently. We
just sort and deduplicate the arrays. We don't intelligently decide
which pages dynamically. This is related to skip scan.

Attached is an example query that shows the problem. Right now the
query needs to access a buffer containing an index page a total of 24
times. It's actually accessing the same 2 pages 12 times. My draft
patch only requires 2 buffer accesses -- because it "coalesces the
array constants together" dynamically at run time. That is a little
extreme, but it's certainly possible.

BTW, this project is related to skip scan. It's part of the same
family of techniques -- MDAM techniques. (I suppose that that's
already true for ScalarArrayOpExpr execution by nbtree, but without
dynamic behavior it's not nearly as valuable as it could be.)

If executing ScalarArrayOpExprs was less inefficient in these cases
then the planner could be a lot more aggressive about using them.
Seems like these executor improvements might go well together with
what you're doing in the planner. Note that I have to "set
random_page_cost=0.1" to get the planner to use all of the quals from
the query as index quals. It thinks (correctly) that the query plan is
very inefficient. That happens to match reality right now, but the
underlying reality could change significantly. Something to think
about.

--
Peter Geoghegan

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: Speeding Up Bitmapset
Следующее
От: Kirk Wolak
Дата:
Сообщение: Re: RFC: Adding \history [options] [filename] to psql (Snippets and Shared Queries)