Обсуждение: XLog size reductions: Reduced XLog record header size for PG17

Поиск

Список

Период

Сортировка

XLog size reductions: Reduced XLog record header size for PG17

От

Matthias van de Meent

Дата:

20 июня 2023 г., 23:01:00

Hi,

As I've mentioned earlier on-list [0][1] and off-list, I've been
working on reducing the volume of WAL that we write. This is one
intermediate step towards that effort.

Attached is a patchset that reduces the storage overhead of each WAL
record, with theoretical total savings in pgbench transactions
somewhere between 4-5%, depending on platform and the locality of
updates. These savings are achieved by reducing the amount of data
stored, and by not emitting bytes that don't carry data relevant to
each record's redo- and decode routines.

Patches contained:
0001 is copied essentially verbatim from [1] and reduces overhead in
the registered block's length field where possible. It is included to
improve code commonality between varcoded integer fields. See [1] for
more details.

0002 updates accesses to the rmgr-managed bits of xl_info with its own
macro returning only rmgr-managed bits, and updates XLogRecGetInfo()
to return only the xlog-managed bits.

0003 renames the rm_identify argument from 'info' to 'rmgrinfo'; and
stops providing the xlog-managed bits to the function - rmgrs have no
need to know the xlog internal info bits.

0004 continues on 0003 and moves the rmgr info bits into their own
xl_rmgrinfo of type uint8, stored in the alignment hole in the
XLogRecord struct.

0005 updates the code to only include a valid XID in the record when
the rmgr actually needs to use that XID.

0006 implements a new, variable length, WAL record header format,
previously discussed at [0] and [2]. This new WAL record header is a
minimum of 14 bytes large, but generally will be 15 to 21 bytes in
size, depending on the data contained, the type of record, and whether
the record needs an XID.

Notes:
- The patchset includes [1] for its variable-length encoding of
uint32, and this is included in the savings calculation.
- Not all records include the backend's XID anymore. XLog API users
must explicitly request the inclusion of XID in the record with the
XLOG_INCLUDE_XID record flag.
- XLog records are now always aligned to 8 bytes. This was needed to
reduce the complexity of var-coding the record length on 32-bit
systems. Savings on 32-bit systems still exist, but can be expected to
be less impactful.
- XLog length is now varlength encoded. No more records with <255
bytes of data storing 3 0-bytes - the length is now stored in 0, 1, 2
or 4 bytes.
- RMGRs now get their own uint8 info/flags field. No more sharing bits
with WAL infrastructure in xl_info. The byte is only stored if it is
non-0, and otherwise omitted (indicated by flag bits in xl_info).

Todo:
- Check for any needed documentation / code comments updates
- benchmark this

Future work:
- we could omit XLR_BLOCK_ID_DATA_[LONG,SHORT] if it is the only
"block ID" in the record (such as checkpoint records, commit/rollback
records, etc.). This would be indicated by a xl_info bit, and this
would save 2-5 bytes per applicable record.
- This patch inherits [1]'s property in which we can release the
BKPBLOCK_HAS_DATA flag bit (its value is already implied by
XLR_BLOCKID_SZCLASS), allowing us to use it for something else, like
indicating varcoded RelFileLocator/BlockId.

Kind regards,

Matthias van de Meent
Neon, Inc.

[0]
https://www.postgresql.org/message-id/flat/CAEze2Whf%3DfwAj7rosf6aDM9t%2B7MU1w-bJn28HFWYGkz%2Bics-hg%40mail.gmail.com
[1] https://www.postgresql.org/message-id/flat/CAEze2WjuJqVeB6EUZ1z75_ittk54H6Lk7WtwRskEeGtZubr4bQ%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/CA+Tgmoaa9Yc9O-FP4vS_xTKf8Wgy8TzHpjnjN56_ShKE=jrP-Q@mail.gmail.com

Вложения

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Matthias van de Meent

Дата:

30 июня 2023 г., 18:36:40

Hi,

The attached v2 patchset contains some small fixes for the failing
cfbot 32-bit tests - at least locally it does so.

I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last
patch of the set, which has now been updated to XLP_ALIGN as well.
Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of
TYPEALIGN so that we don't lose bits of the aligned value in 32-bit
systems.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

Вложения

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Matthias van de Meent

Дата:

03 июля 2023 г., 14:08:31

On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
>
> Hi,
>
> The attached v2 patchset contains some small fixes for the failing
> cfbot 32-bit tests - at least locally it does so.
>
> I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last
> patch of the set, which has now been updated to XLP_ALIGN as well.
> Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of
> TYPEALIGN so that we don't lose bits of the aligned value in 32-bit
> systems.

Apparently there was some usage of MAXALIGN() in xlogreader that I'd
missed, and which only shows up in TAP tests. In v3 I've fixed that,
together with some improved early detection of invalid record headers.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

Вложения

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Matthias van de Meent

Дата:

12 июля 2023 г., 15:50:52

On Mon, 3 Jul 2023 at 13:08, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
>
> On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> >
> > Hi,
> >
> > The attached v2 patchset contains some small fixes for the failing
> > cfbot 32-bit tests - at least locally it does so.
> >
> > I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last
> > patch of the set, which has now been updated to XLP_ALIGN as well.
> > Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of
> > TYPEALIGN so that we don't lose bits of the aligned value in 32-bit
> > systems.
>
> Apparently there was some usage of MAXALIGN() in xlogreader that I'd
> missed, and which only shows up in TAP tests. In v3 I've fixed that,
> together with some improved early detection of invalid record headers.

Another fix for CFBot - pg_waldump tests which were added in 96063e28
exposed an issue in my patchset related to RM_INVALID_ID.

v4 splits former patch 0006 into two: now 0006 adds RM_INVALID and
does the rmgr-related changes in the code, and 0007 does the WAL disk
format overhaul.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech/)

Вложения

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Matthias van de Meent

Дата:

19 сентября 2023 г., 13:07:07

On Wed, 12 Jul 2023 at 14:50, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
>
> On Mon, 3 Jul 2023 at 13:08, Matthias van de Meent
> <boekewurm+postgres@gmail.com> wrote:
> >
> > On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent
> > <boekewurm+postgres@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > The attached v2 patchset contains some small fixes for the failing
> > > cfbot 32-bit tests - at least locally it does so.
> > >
> > > I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last
> > > patch of the set, which has now been updated to XLP_ALIGN as well.
> > > Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of
> > > TYPEALIGN so that we don't lose bits of the aligned value in 32-bit
> > > systems.
> >
> > Apparently there was some usage of MAXALIGN() in xlogreader that I'd
> > missed, and which only shows up in TAP tests. In v3 I've fixed that,
> > together with some improved early detection of invalid record headers.
>
> Another fix for CFBot - pg_waldump tests which were added in 96063e28
> exposed an issue in my patchset related to RM_INVALID_ID.
>
> v4 splits former patch 0006 into two: now 0006 adds RM_INVALID and
> does the rmgr-related changes in the code, and 0007 does the WAL disk
> format overhaul.

V5 is a rebased version of v4, and includes the latest patch from
"smaller XLRec block header" [0] as 0001.

Kind regards,

Matthias van de Meent

[0] https://www.postgresql.org/message-id/CAEze2WhG_qvs0_HPCKyGLjFSSeiLZJcFhT%3DrzEUd7AzyxnSfKw%40mail.gmail.com

Вложения

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Michael Paquier

Дата:

20 сентября 2023 г., 08:06:25

On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote:
> V5 is a rebased version of v4, and includes the latest patch from
> "smaller XLRec block header" [0] as 0001.

0001 and 0007 are the meat of the changes.

-#define XLR_CHECK_CONSISTENCY  0x02
+#define XLR_CHECK_CONSISTENCY  (0x20)

I can't help but notice that there are a few stylistic choices like
this one that are part of the patch.  Using parenthesis in the case of
hexa values is inconsistent with the usual practices I've seen in the
tree.

 #define COPY_HEADER_FIELD(_dst, _size)            \
     do {                                        \
-        if (remaining < _size)                    \
+        if (remaining < (_size))                \
             goto shortdata_err;                    \

There are a couple of stylistic changes like this one, that I guess
could just use their own patch to make these macros easier to use.

-#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)
+#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & XLR_INFO_MASK)
+#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & XLR_RMGR_INFO_MASK)

This stuff in 0002 is independent of 0001, am I right?  Doing this
split with an extra macro is okay by me, reducing the presence of
XLR_INFO_MASK and bitwise operations based on it.

0003 is also mechanical, but if you begin to enforce the use of
XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR
identity callback, we should have at least a validity check to make
sure that nothing, even custom RMGRs, pass down unexpected bits?

I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and
I fear that people are going to forget to set it.  Wouldn't it be
better to use an option where the XID is excluded instead, making the
inclusing the an XID the default?

> The resource manager has ID = 0, thus requiring some special
> handling in other code. Apart from being generally useful, it is
> used in future patches to detect the end of wal in lieu of a zero-ed
> fixed-size xl_tot_len field.

Err, no, that may not be true.  See for example this thread where the
topic of improving the checks of xl_tot_len and rely on this value on
when a record header has been validated, even across page borders:
https://www.postgresql.org/message-id/17928-aa92416a70ff44a2@postgresql.org

Except that, in which cases could an invalid RMGR be useful?
--
Michael

Вложения

signature.asc

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Matthias van de Meent

Дата:

25 сентября 2023 г., 20:40:00

On Wed, 20 Sept 2023 at 07:06, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote:
> > V5 is a rebased version of v4, and includes the latest patch from
> > "smaller XLRec block header" [0] as 0001.
>
> 0001 and 0007 are the meat of the changes.

Correct.

> -#define XLR_CHECK_CONSISTENCY  0x02
> +#define XLR_CHECK_CONSISTENCY  (0x20)
>
> I can't help but notice that there are a few stylistic choices like
> this one that are part of the patch.  Using parenthesis in the case of
> hexa values is inconsistent with the usual practices I've seen in the
> tree.

Yes, I'll take another look at that.

>  #define COPY_HEADER_FIELD(_dst, _size)            \
>      do {                                        \
> -        if (remaining < _size)                    \
> +        if (remaining < (_size))                \
>              goto shortdata_err;                    \
>
> There are a couple of stylistic changes like this one, that I guess
> could just use their own patch to make these macros easier to use.

They actually fix complaints of my IDE, but are otherwise indeed stylistic.

> -#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)
> +#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & XLR_INFO_MASK)
> +#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & XLR_RMGR_INFO_MASK)
>
> This stuff in 0002 is independent of 0001, am I right?  Doing this
> split with an extra macro is okay by me, reducing the presence of
> XLR_INFO_MASK and bitwise operations based on it.

Yes, that change is to stop making use of (~XLR_INFO_MASK) where
XLR_RMGR_INFO_MASK is the correct bitmask (whilst also being quite
useful in the later patch).

> 0003 is also mechanical, but if you begin to enforce the use of
> XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR
> identity callback, we should have at least a validity check to make
> sure that nothing, even custom RMGRs, pass down unexpected bits?

I think that's already handled in XLogInsert(), but I'll make sure to
add more checks if they're not in place yet.

> I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and
> I fear that people are going to forget to set it.  Wouldn't it be
> better to use an option where the XID is excluded instead, making the
> inclusing the an XID the default?

Most rmgrs don't actually use the XID. Only XACT, MULTIXACT, HEAP,
HEAP2, and LOGICALMSG use the xid, so I thought it would be easier to
just find the places where those RMGR's records were being logged than
to update all other places.

I don't mind changing how we decide to log the XID, but I don't think
EXCLUDE_XID is a good alternative: most records just don't need the
transaction ID. There are many more index AMs with logging than table
AMs, so I don't think it is that weird to default to 'not included'.

> > The resource manager has ID = 0, thus requiring some special
> > handling in other code. Apart from being generally useful, it is
> > used in future patches to detect the end of wal in lieu of a zero-ed
> > fixed-size xl_tot_len field.
>
> Err, no, that may not be true.  See for example this thread where the
> topic of improving the checks of xl_tot_len and rely on this value on
> when a record header has been validated, even across page borders:
> https://www.postgresql.org/message-id/17928-aa92416a70ff44a2@postgresql.org

Yes, there are indeed exceptions when reusing WAL segments, but it's
still a good canary, like xl_tot_len before this patch.

> Except that, in which cases could an invalid RMGR be useful?

A sentinel value that is obviously invalid is available for several
types, e.g. BlockNumber, TransactionId, XLogRecPtr, Buffer, and this
is quite useful if you want to check if something is definitely
invalid. I think that's fine in principle, we're already "wasting"
some IDs in the gap between RM_MAX_BUILTIN_ID and RM_MIN_CUSTOM_ID.

In the current xlog infrastructure, we use xl_tot_len as that sentinel
to detect whether a new record may exist, but in this patch that can't
be used because the field may not exist and depends on other bytes. So
I used xl_rmgr_id as the field to base the 'may a next record exist'
checks on, which required the 0 rmgr ID to be invalid.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Michael Paquier

Дата:

27 сентября 2023 г., 02:51:26

On Mon, Sep 25, 2023 at 07:40:00PM +0200, Matthias van de Meent wrote:
> On Wed, 20 Sept 2023 at 07:06, Michael Paquier <michael@paquier.xyz> wrote:
>>  #define COPY_HEADER_FIELD(_dst, _size)            \
>>      do {                                        \
>> -        if (remaining < _size)                    \
>> +        if (remaining < (_size))                \
>>              goto shortdata_err;                    \
>>
>> There are a couple of stylistic changes like this one, that I guess
>> could just use their own patch to make these macros easier to use.
>
> They actually fix complaints of my IDE, but are otherwise indeed stylistic.

Oh, OK.  I just use an old-school terminal, but no objections in
changing these if they make life easier for some hackers.  Still, that
feels independant of what you are proposing here.
--
Michael

Вложения

signature.asc

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Pavel Borisov

Дата:

03 января, 13:15:05

Hi and Happy New Year!

I've looked through the patches and the change seems quite small and justified. But at the second round, some doubt arises on whether this long patchset indeed introduces enough performance gain? I may be wrong, but it saves only several bytes and the performance gain would be only in some specific artificial workload. Did you do some measurements? Do we have several percent performance-wise?

Kind regards,

Pavel Borisov

Re: XLog size reductions: Reduced XLog record header size for PG17

От

"Andrey M. Borodin"

Дата:

07 апреля, 08:37:25


> On 3 Jan 2024, at 15:15, Pavel Borisov <pashkin.elfe@gmail.com> wrote:
>
> Hi and Happy New Year!
>
> I've looked through the patches and the change seems quite small and justified. But at the second round, some doubt
ariseson whether this long patchset indeed introduces enough performance gain? I may be wrong, but it saves only
severalbytes and the performance gain would be only in some specific artificial workload.  Did you do some
measurements?Do we have several percent performance-wise? 
>
> Kind regards,
> Pavel Borisov

Hi Matthias!

This is a kind reminder that the thread is waiting for your reply. Are you interesting in CF entry [0]?

Thanks!


Best regards, Andrey Borodin.
[0] https://commitfest.postgresql.org/47/4386/

Re: XLog size reductions: Reduced XLog record header size for PG17

От

Mark Dilger

Дата:

05 июня, 18:12:47

> On Jun 20, 2023, at 1:01 PM, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>
> 0001 is copied essentially verbatim from [1] and reduces overhead in
> the registered block's length field where possible. It is included to
> improve code commonality between varcoded integer fields. See [1] for
> more details.

Hi Matthias!  I am interested in seeing this patch move forward.  We seem to be stuck.

The disagreement on the other thread seems to be about whether we can generalize and reuse variable integer encoding.
Couldyou comment on whether perhaps we just need a few versions of that?  Perhaps one version where the number of
lengthbytes is encoded in the length itself (such as is used for varlena and by Andres' patch) and one where the number
oflength bytes is stored elsewhere?  You are clearly using the "elsewhere" form, but perhaps you could pull out the
logicof that into src/common?  In struct XLogRecordBlockHeader.id <http://xlogrecordblockheader.id/>, you are reserving
twobits for the size class.  (The code comments aren't clear about this, by the way.)  Perhaps if the generalized
lengthencoding logic could take a couple arguments to represent where and how the size class bits are to be stored, and
wherethe length itself is stored?  I doubt you need to sacrifice any performance gains of this patch to make that
happen. You'd just need to restructure the patch. 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: XLog size reductions: Reduced XLog record header size for PG17

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения