Re: Compress ReorderBuffer spill files using LZ4

On 2024-Jun-06, Amit Kapila wrote:

> On Thu, Jun 6, 2024 at 4:28 PM Julien Tachoires <julmon@gmail.com> wrote:
> >
> > When the content of a large transaction (size exceeding
> > logical_decoding_work_mem) and its sub-transactions has to be
> > reordered during logical decoding, then, all the changes are written
> > on disk in temporary files located in pg_replslot/<slot_name>.
> > Decoding very large transactions by multiple replication slots can
> > lead to disk space saturation and high I/O utilization.

I like the general idea of compressing the output of logical decoding.
It's not so clear to me that we only want to do so for spilling to disk;
for instance, if the two nodes communicate over a slow network, it may
even be beneficial to compress when streaming, so to this question:

> Why can't one use 'streaming' option to send changes to the client
> once it reaches the configured limit of 'logical_decoding_work_mem'?

I would say that streaming doesn't necessarily have to mean we don't
want compression, because for some users it might be beneficial.

I think a GUC would be a good idea.  Also, what if for whatever reason
you want a different compression algorithm or different compression
parameters?  Looking at the existing compression UI we offer in
pg_basebackup, perhaps you could add something like this:

compress_logical_decoding = none
compress_logical_decoding = lz4:42
compress_logical_decoding = spill-zstd:99

"none" says to never use compression (perhaps this should be the
default), "lz4:42" says to use lz4 with parameter 42 for both spilling
and streaming, and "spill-zstd:99" says to use Zstd with parameter 99
but only for spilling to disk.
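
To make the proposed syntax concrete, here is a rough sketch of how such
a value could be parsed.  Everything in it is hypothetical: the GUC does
not exist, and CompressAlgo, DecodingCompressSpec and
parse_compress_spec() are names made up for illustration.  It also
assumes the part after the colon is a single integer, which the examples
above suggest but do not require.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types; none of these names exist in PostgreSQL. */
typedef enum
{
	COMPRESS_NONE,
	COMPRESS_LZ4,
	COMPRESS_ZSTD
} CompressAlgo;

typedef struct
{
	CompressAlgo algo;
	bool		spill_only;		/* true: compress spill files only */
	bool		has_level;		/* true if ":N" was given */
	int			level;			/* algorithm-specific parameter */
} DecodingCompressSpec;

/*
 * Parse "[spill-]algo[:N]" into *spec.  Returns false on malformed
 * input.  Assumption: N is a plain decimal integer.
 */
static bool
parse_compress_spec(const char *value, DecodingCompressSpec *spec)
{
	const char *p = value;
	const char *colon;
	size_t		namelen;

	spec->algo = COMPRESS_NONE;
	spec->spill_only = false;
	spec->has_level = false;
	spec->level = 0;

	/* Optional "spill-" prefix restricts compression to spill files. */
	if (strncmp(p, "spill-", 6) == 0)
	{
		spec->spill_only = true;
		p += 6;
	}

	colon = strchr(p, ':');
	namelen = colon ? (size_t) (colon - p) : strlen(p);

	if (namelen == 4 && strncmp(p, "none", 4) == 0)
		spec->algo = COMPRESS_NONE;
	else if (namelen == 3 && strncmp(p, "lz4", 3) == 0)
		spec->algo = COMPRESS_LZ4;
	else if (namelen == 4 && strncmp(p, "zstd", 4) == 0)
		spec->algo = COMPRESS_ZSTD;
	else
		return false;

	/* "none" takes neither a prefix nor a parameter. */
	if (spec->algo == COMPRESS_NONE && (spec->spill_only || colon))
		return false;

	if (colon)
	{
		char	   *end;
		long		lvl = strtol(colon + 1, &end, 10);

		if (end == colon + 1 || *end != '\0')
			return false;
		spec->level = (int) lvl;
		spec->has_level = true;
	}
	return true;
}
```

With this sketch, "spill-zstd:99" yields algo = COMPRESS_ZSTD,
spill_only = true, level = 99, while "gzip" or "lz4:" are rejected.
Range-checking the parameter is left to the algorithm-specific code.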

(I don't mean to say that you should implement Zstd compression with
this patch, only that you should choose the implementation so that
adding Zstd support (or whatever) later is just a matter of adding some
branches here and there.  With the current #ifdef you propose, it's hard
to do that.  Maybe move the parts that depend on the specific algorithm
behind algorithm-agnostic functions.)
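
One possible shape for that separation, as a sketch only: a small table
of function pointers that the ReorderBuffer code calls without knowing
which algorithm is behind it.  SpillCompressor, lookup_compressor() and
the callback signatures are invented for this example and are not taken
from the patch; only a pass-through "none" entry is implemented so the
sketch compiles without any compression library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical algorithm-agnostic interface; not from the patch. */
typedef struct SpillCompressor
{
	const char *name;
	/* worst-case output size for srclen input bytes */
	size_t		(*bound) (size_t srclen);
	/* both return bytes written to dst, or -1 on failure */
	long		(*compress) (const void *src, size_t srclen,
							 void *dst, size_t dstcap);
	long		(*decompress) (const void *src, size_t srclen,
							   void *dst, size_t dstcap);
} SpillCompressor;

/* "none": plain copy, used when compression is disabled. */
static size_t
none_bound(size_t srclen)
{
	return srclen;
}

static long
none_copy(const void *src, size_t srclen, void *dst, size_t dstcap)
{
	if (srclen > dstcap)
		return -1;
	memcpy(dst, src, srclen);
	return (long) srclen;
}

static const SpillCompressor compressors[] = {
	{"none", none_bound, none_copy, none_copy},
	/*
	 * An lz4 or zstd entry would be added here, each guarded by its own
	 * build-time #ifdef, without touching any caller.
	 */
};

/* Find a compressor by name; returns NULL if it is not compiled in. */
static const SpillCompressor *
lookup_compressor(const char *name)
{
	size_t		n = sizeof(compressors) / sizeof(compressors[0]);

	for (size_t i = 0; i < n; i++)
	{
		if (strcmp(compressors[i].name, name) == 0)
			return &compressors[i];
	}
	return NULL;
}
```

The spill and restore paths would then look up the compressor once and
call compress/decompress through the pointer, so the #ifdef is confined
to the table and to the per-algorithm wrapper functions.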

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/


