Re: Use generation memory context for tuplestore.c

From: David Rowley
Subject: Re: Use generation memory context for tuplestore.c
Date:
Msg-id: CAApHDvr72_Nur=3Zq3_R-V5JB8u-EFeApTQ2mFVDOtB59=aS=w@mail.gmail.com
In reply to: Re: Use generation memory context for tuplestore.c  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List: pgsql-hackers
On Sat, 4 May 2024 at 03:51, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
>
> On Fri, 3 May 2024 at 15:55, David Rowley <dgrowleyml@gmail.com> wrote:
> > master @ 8f0a97dff
> > Storage: Memory  Maximum Storage: 16577kB
> >
> > patched:
> > Storage: Memory  Maximum Storage: 8577kB
>
> Those are some impressive numbers.

This patch needed to be rebased, so updated patches are attached.

I was also reflecting on what Bruce wrote in [1] about having to parse
performance numbers out of commit messages, so I adjusted the
placeholder commit message I'd written to make the performance numbers
clearer for Bruce, or whoever writes the next major version's release
notes.  That led me to experiment with finding the best case for this
patch.  I could scale the improvement much further than I have, but
here's something I came up with that's easy to reproduce.

create table winagg (a int, b text);
insert into winagg select a, repeat('a', 1024) from generate_series(1, 10000000) a;
set work_mem = '1MB';
set jit = 0;
explain (analyze, timing off)
select sum(l1), sum(l2)
from (select length(b) l1, length(lag(b, 800) over ()) as l2
      from winagg
      limit 1600);

master:
Execution Time: 6585.685 ms

patched:
Execution Time: 4.159 ms

1583x faster.

I've effectively just exploited spool_tuples()'s behaviour when the
tuplestore goes to disk: it spools the entire remainder of the
partition, which here is 10 million rows, even though the query only
takes a tiny portion of those because of the LIMIT 1600.  I set
work_mem to a value at which the patched version's tuplestore doesn't
spill to disk, so spool_tuples() only spools what's needed in the
patched version.  So, artificial is a word you could use, but
certainly, someone could hit this performance cliff in the wild and be
prevented from falling off it by this patch.
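
To look at the cliff from the other side, here's a rough sanity check
(a sketch I haven't benchmarked for this message, and 64MB is just a
generously-sized guess): with work_mem set high enough that master's
tuplestore also stays in memory, spool_tuples() never has to switch to
spooling the whole partition, so the cliff should disappear there too.

-- assumes the winagg table and session settings from the example above
set work_mem = '64MB';
explain (analyze, timing off)
select sum(l1), sum(l2)
from (select length(b) l1, length(lag(b, 800) over ()) as l2
      from winagg
      limit 1600);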

David

[1] https://www.postgresql.org/message-id/Zk5r2XyI0BhXLF8h%40momjian.us
