Re: POC, WIP: OR-clause support for indexes

Поиск
Список
Период
Сортировка
От jian he
Тема Re: POC, WIP: OR-clause support for indexes
Дата
Msg-id CACJufxG0v_Fv7bnuys2mNZGsFSb4qsQFspArnYghqm0Jswa9vw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: POC, WIP: OR-clause support for indexes  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Ответы Re: POC, WIP: OR-clause support for indexes  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Список pgsql-hackers
On Mon, Feb 19, 2024 at 4:35 PM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
>
> In attachment - v17 for both patches. As I see it, the only general
> explanation of the idea is not addressed. I'm not sure how deeply we
> should explain it.


> On Tue, Nov 28, 2023 at 5:04 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Nov 27, 2023 at 3:02 AM Andrei Lepikhov
> <a.lepikhov@postgrespro.ru> wrote:
> > On 25/11/2023 08:23, Alexander Korotkov wrote:
> > > I think patch certainly gets better in this aspect.  One thing I can't
> > > understand is why do we use home-grown code for resolving
> > > hash-collisions.  You can just define custom hash and match functions
> > > in HASHCTL.  Even if we need to avoid repeated JumbleExpr calls, we
> > > still can save pre-calculated hash value into hash entry and use
> > > custom hash and match.  This doesn't imply us to write our own
> > > collision-resolving code.
> >
> > Thanks, it was an insightful suggestion.
> > I implemented it, and the code has become shorter (see attachment).
>
> Neither the code comments nor the commit message really explain the
> design idea here. That's unfortunate, principally because it makes
> review difficult.
>
> I'm very skeptical about the idea of using JumbleExpr for any part of
> this. It seems fairly expensive, and it might produce false matches.
> If expensive is OK, then why not just use equal()? If it's not, then
> this probably isn't really OK either. But in any case there should be
> comments explaining why this strategy was chosen.

The above message
(https://postgr.es/m/CA%2BTgmoZCgP6FrBQEusn4yaWm02XU8OPeoEMk91q7PRBgwaAkFw%40mail.gmail.com)
seems still not answered.
How can we evaluate whether JumbleExpr is expensive or not?
I used this naive script to test, but didn't find a big difference
when enable_or_transformation is ON or OFF.

`
create table test_1_100 as (select (random()*1000)::int x,
(random()*1000) y from generate_series(1,1_000_000) i);
explain(costs off, analyze)
select * from test
where x = 1 or x + 2= 3 or x + 3= 4 or x + 4= 5
or x + 5= 6 or x + 6= 7 or x + 7= 8 or x + 8= 9 or x + 9=10
or x + 10= 11 or x + 11= 12 or x + 12= 13 or x + 13= 14
or x + 14= 15 or x + 15= 16 or x + 16= 17 or x + 17= 18
or x + 18=19 or x + 19= 20 or x + 20= 21 or x + 21= 22
or x + 22= 23 or x + 23= 24 or x + 24= 25 or x + 25= 26
or x + 26= 27 or x + 27=28 or x + 28= 29 or x + 29= 30
or x + 30= 31 \watch i=0.1 c=10
`

`leftop operator rightop`
the operator can also be volatile.
Do we need to check (op_volatile(opno) == PROVOLATILE_VOLATILE) within
transformBoolExprOr?



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Kapila
Дата:
Сообщение: Re: Synchronizing slots from primary to standby
Следующее
От: Andrei Lepikhov
Дата:
Сообщение: Re: Optimize planner memory consumption for huge arrays