Re: btree: downlink right separator/HIKEY optimization

Поиск
Список
Период
Сортировка
От Heikki Linnakangas
Тема Re: btree: downlink right separator/HIKEY optimization
Дата
Msg-id ed1ad315-1a23-4f85-aa4b-c708e501df19@iki.fi
обсуждение исходный текст
Ответ на btree: downlink right separator/HIKEY optimization  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Ответы Re: btree: downlink right separator/HIKEY optimization  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Список pgsql-hackers
On 01/11/2023 00:08, Matthias van de Meent wrote:
> calling _bt_compare in _bt_moveright in many common cases. This patch,
> when stacked on top of the prefix truncation patch, improves INSERT
> performance by an additional 2-9%pt, with an extreme case of 45% in
> the worscase index tests at [0].
> 
> The optimization is that we now recognze that our page split algorithm
> all but guarantees that the HIKEY matches this page's downlink's right
> separator key bytewise, excluding the data stored in the
> IndexTupleData struct.

Good observation.

> By caching the right separator index tuple in _bt_search, we can
> compare the downlink's right separator and the HIKEY, and when they
> are equal (memcmp() == 0) we don't have to call _bt_compare - the
> HIKEY is known to be larger than the scan key, because our key is
> smaller than the right separator, and thus transitively also smaller
> than the HIKEY because it contains the same data. As _bt_compare can
> call expensive user-provided functions, this can be a large
> performance boon, especially when there are only a small number of
> column getting compared on each page (e.g. index tuples of many 100s
> of bytes, or dynamic prefix truncation is enabled).

What would be the worst case scenario for this? One situation where the 
memcmp() would not match is when there is a concurrent page split. I 
think it's OK to pessimize that case. Are there any other situations? 
When the memcmp() matches, I think this is almost certainly not slower 
than calling the datatype's comparison function.

>         if (offnum < PageGetMaxOffsetNumber(page))
>         {
>             ItemId    rightsepitem = PageGetItemId(page, offnum + 1);
>             IndexTuple pagerightsep = (IndexTuple) PageGetItem(page, rightsepitem);
>             memcpy(rsepbuf.data, pagerightsep, ItemIdGetLength(rightsepitem));
>             rightsep = &rsepbuf.tuple;
>         }
>         else if (!P_RIGHTMOST(opaque))
>         {
>             /*
>              * The rightmost data tuple on inner page has P_HIKEY as its right
>              * separator.
>              */
>             ItemId    rightsepitem = PageGetItemId(page, P_HIKEY);
>             IndexTuple pagerightsep = (IndexTuple) PageGetItem(page, rightsepitem);
>             memcpy(rsepbuf.data, pagerightsep, ItemIdGetLength(rightsepitem));
>             rightsep = &rsepbuf.tuple;
>         }

This could use a one-line comment above this, something like "Remember 
the right separator of the downlink we follow, to speed up the next 
_bt_moveright call".

Should there be an "else rightsep = NULL;" here? Is it possible that we 
follow the non-rightmost downlink on a higher level and rightmost 
downlink on next level? Concurrent page deletion?

Please update the comment above _bt_moveright to describe the new 
argument. Perhaps the text from README should go there, this feels like 
a detail specific to _bt_search and _bt_moveright.

-- 
Heikki Linnakangas
Neon (https://neon.tech)




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Shlok Kyal
Дата:
Сообщение: Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: Test 002_pg_upgrade fails with olddump on Windows