Re: libpq compression

From: Daniil Zakhlystov
Subject: Re: libpq compression
Date:
Msg-id: 161609580905.28624.5304095609680400810.pgcf@coridan.postgresql.org
In response to: libpq compression  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses: Re: libpq compression  (Justin Pryzby <pryzby@telsasoft.com>)
List: pgsql-hackers
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:       tested, passed
Spec compliant:           tested, passed
Documentation:            tested, passed

Hi,

I've compared the different libpq compression approaches in the streaming physical replication scenario.

Test setup
Three hosts: the first runs pg_restore, the second is the master, and the third is the standby replica.
In each test run, I restored the IMDB database
(https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/2QYZBT)
and measured the traffic received on the standby replica.

Also, I've enlarged the ZPQ_BUFFER_SIZE buffer in all versions, because the small default buffer size (8192 bytes)
led to more socket read/write system calls and to poor compression in the chunked-reset scenario.

Scenarios:

chunked
use streaming compression, wrap compressed data into CompressedData messages, and preserve the compression context
across multiple CompressedData messages.
https://github.com/usernamedt/libpq_compression/tree/chunked-compression

chunked-reset
use streaming compression, wrap compressed data into CompressedData messages, and reset the compression context on each
CompressedData message (a sketch contrasting these two variants follows this list).
https://github.com/usernamedt/libpq_compression/tree/chunked-reset

permanent
use streaming compression, send the raw compressed stream without any wrapping.
https://github.com/usernamedt/libpq_compression/tree/permanent-w-enlarged-buffer
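
To make the difference between the first two scenarios concrete, here is a minimal sketch of how a single chunk could be
produced with the ZSTD streaming API. Only the ZSTD_* calls are real; the function name, the reset_context flag, and the
buffer handling are illustrative assumptions, not code taken from the patch.

/*
 * Sketch: compress one message payload into a self-contained chunk that
 * would then be wrapped into a CompressedData protocol message.
 * reset_context == false  ->  "chunked"        (context preserved)
 * reset_context == true   ->  "chunked-reset"  (context reset per message)
 */
#include <stdbool.h>
#include <stddef.h>
#include <zstd.h>

static size_t
compress_chunk(ZSTD_CCtx *cctx, bool reset_context,
               const void *msg, size_t msg_len,
               void *out, size_t out_size)
{
    ZSTD_inBuffer  in  = { msg, msg_len, 0 };
    ZSTD_outBuffer dst = { out, out_size, 0 };

    if (reset_context)
    {
        /* Drop the streaming history, so this chunk cannot reference
         * data from earlier messages (hurts the ratio, as the numbers
         * below show). */
        ZSTD_CCtx_reset(cctx, ZSTD_reset_session_only);
    }

    /*
     * Flush so the receiver can decode the chunk as soon as it arrives.
     * With reset_context == false the window/dictionary carries over to
     * the next call.  Error handling and the loop needed when "out" is
     * too small are omitted for brevity.
     */
    ZSTD_compressStream2(cctx, &dst, &in, ZSTD_e_flush);

    return dst.pos;     /* number of compressed bytes to send */
}

The context itself would be created once per connection with ZSTD_createCCtx() and configured with
ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, ...); a boolean like reset_context above is also roughly what the
setting proposed at the end of this mail would toggle.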

Tested compression levels
ZSTD, level 1
ZSTD, level 5
ZSTD, level 9

Scenario         Replica rx, mean, MB
uncompressed     6683.6

ZSTD, level 1
Scenario         Replica rx, mean, MB
chunked-reset    2726
chunked          2694
permanent        2694.3

ZSTD, level 5
Scenario         Replica rx, mean, MB
chunked-reset    2234.3
chunked          2123
permanent        2115.3

ZSTD, level 9
Scenario         Replica rx, mean, MB
chunked-reset    2153.6
chunked          1943
permanent        1941.6
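
For reference (derived from the figures above): relative to the 6683.6 MB uncompressed baseline, the compression ratio at
level 1 is about 2.45x for chunked-reset and 2.48x for both chunked and permanent; at level 9 it is about 3.10x vs 3.44x
vs 3.44x. In other words, preserving the context saves roughly 10% of traffic at level 9, while chunked stays within
about 0.1% of permanent.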

A full report with additional data and resource usage graphs is available here:
https://docs.google.com/document/d/1a5bj0jhtFMWRKQqwu9ag1PgDF5fLo7Ayrw3Uh53VEbs

Based on these results, I suggest sticking with the chunked compression approach, which is more flexible and adds
almost no overhead compared to permanent compression.
Also, we may later introduce a setting to control whether the compression context is reset for each message, without
breaking backward compatibility.

--
Daniil Zakhlystov

The new status of this patch is: Ready for Committer
