Обсуждение: XLog size reductions: Reduced XLog record header size for PG17
Hi, As I've mentioned earlier on-list [0][1] and off-list, I've been working on reducing the volume of WAL that we write. This is one intermediate step towards that effort. Attached is a patchset that reduces the storage overhead of each WAL record, with theoretical total savings in pgbench transactions somewhere between 4-5%, depending on platform and the locality of updates. These savings are achieved by reducing the amount of data stored, and by not emitting bytes that don't carry data relevant to each record's redo- and decode routines. Patches contained: 0001 is copied essentially verbatim from [1] and reduces overhead in the registered block's length field where possible. It is included to improve code commonality between varcoded integer fields. See [1] for more details. 0002 updates accesses to the rmgr-managed bits of xl_info with its own macro returning only rmgr-managed bits, and updates XLogRecGetInfo() to return only the xlog-managed bits. 0003 renames the rm_identify argument from 'info' to 'rmgrinfo'; and stops providing the xlog-managed bits to the function - rmgrs have no need to know the xlog internal info bits. 0004 continues on 0003 and moves the rmgr info bits into their own xl_rmgrinfo of type uint8, stored in the alignment hole in the XLogRecord struct. 0005 updates the code to only include a valid XID in the record when the rmgr actually needs to use that XID. 0006 implements a new, variable length, WAL record header format, previously discussed at [0] and [2]. This new WAL record header is a minimum of 14 bytes large, but generally will be 15 to 21 bytes in size, depending on the data contained, the type of record, and whether the record needs an XID. Notes: - The patchset includes [1] for its variable-length encoding of uint32, and this is included in the savings calculation. - Not all records include the backend's XID anymore. XLog API users must explicitly request the inclusion of XID in the record with the XLOG_INCLUDE_XID record flag. - XLog records are now always aligned to 8 bytes. This was needed to reduce the complexity of var-coding the record length on 32-bit systems. Savings on 32-bit systems still exist, but can be expected to be less impactful. - XLog length is now varlength encoded. No more records with <255 bytes of data storing 3 0-bytes - the length is now stored in 0, 1, 2 or 4 bytes. - RMGRs now get their own uint8 info/flags field. No more sharing bits with WAL infrastructure in xl_info. The byte is only stored if it is non-0, and otherwise omitted (indicated by flag bits in xl_info). Todo: - Check for any needed documentation / code comments updates - benchmark this Future work: - we could omit XLR_BLOCK_ID_DATA_[LONG,SHORT] if it is the only "block ID" in the record (such as checkpoint records, commit/rollback records, etc.). This would be indicated by a xl_info bit, and this would save 2-5 bytes per applicable record. - This patch inherits [1]'s property in which we can release the BKPBLOCK_HAS_DATA flag bit (its value is already implied by XLR_BLOCKID_SZCLASS), allowing us to use it for something else, like indicating varcoded RelFileLocator/BlockId. Kind regards, Matthias van de Meent Neon, Inc. [0] https://www.postgresql.org/message-id/flat/CAEze2Whf%3DfwAj7rosf6aDM9t%2B7MU1w-bJn28HFWYGkz%2Bics-hg%40mail.gmail.com [1] https://www.postgresql.org/message-id/flat/CAEze2WjuJqVeB6EUZ1z75_ittk54H6Lk7WtwRskEeGtZubr4bQ%40mail.gmail.com [2] https://www.postgresql.org/message-id/flat/CA+Tgmoaa9Yc9O-FP4vS_xTKf8Wgy8TzHpjnjN56_ShKE=jrP-Q@mail.gmail.com
Вложения
- v1-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patch
- v1-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patch
- v1-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patch
- v1-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patch
- v1-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patch
- v1-0006-Reformat-the-XLog-record-header.patch
Hi, The attached v2 patchset contains some small fixes for the failing cfbot 32-bit tests - at least locally it does so. I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last patch of the set, which has now been updated to XLP_ALIGN as well. Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of TYPEALIGN so that we don't lose bits of the aligned value in 32-bit systems. Kind regards, Matthias van de Meent Neon (https://neon.tech/)
Вложения
- v2-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patch
- v2-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patch
- v2-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patch
- v2-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patch
- v2-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patch
- v2-0006-Reformat-the-XLog-record-header.patch
On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > Hi, > > The attached v2 patchset contains some small fixes for the failing > cfbot 32-bit tests - at least locally it does so. > > I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last > patch of the set, which has now been updated to XLP_ALIGN as well. > Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of > TYPEALIGN so that we don't lose bits of the aligned value in 32-bit > systems. Apparently there was some usage of MAXALIGN() in xlogreader that I'd missed, and which only shows up in TAP tests. In v3 I've fixed that, together with some improved early detection of invalid record headers. Kind regards, Matthias van de Meent Neon (https://neon.tech/)
Вложения
- v3-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patch
- v3-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patch
- v3-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patch
- v3-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patch
- v3-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patch
- v3-0006-Reformat-the-XLog-record-header.patch
On Mon, 3 Jul 2023 at 13:08, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: > > > > Hi, > > > > The attached v2 patchset contains some small fixes for the failing > > cfbot 32-bit tests - at least locally it does so. > > > > I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last > > patch of the set, which has now been updated to XLP_ALIGN as well. > > Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of > > TYPEALIGN so that we don't lose bits of the aligned value in 32-bit > > systems. > > Apparently there was some usage of MAXALIGN() in xlogreader that I'd > missed, and which only shows up in TAP tests. In v3 I've fixed that, > together with some improved early detection of invalid record headers. Another fix for CFBot - pg_waldump tests which were added in 96063e28 exposed an issue in my patchset related to RM_INVALID_ID. v4 splits former patch 0006 into two: now 0006 adds RM_INVALID and does the rmgr-related changes in the code, and 0007 does the WAL disk format overhaul. Kind regards, Matthias van de Meent Neon (https://neon.tech/)
Вложения
- v4-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patch
- v4-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patch
- v4-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patch
- v4-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patch
- v4-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patch
- v4-0006-Add-RM_INVALID-an-invalid-resource-manager.patch
- v4-0007-Reformat-the-XLog-record-header.patch
On Wed, 12 Jul 2023 at 14:50, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > On Mon, 3 Jul 2023 at 13:08, Matthias van de Meent > <boekewurm+postgres@gmail.com> wrote: > > > > On Fri, 30 Jun 2023 at 17:36, Matthias van de Meent > > <boekewurm+postgres@gmail.com> wrote: > > > > > > Hi, > > > > > > The attached v2 patchset contains some small fixes for the failing > > > cfbot 32-bit tests - at least locally it does so. > > > > > > I'd overlooked one remaining use of MAXALIGN64() in xlog.c in the last > > > patch of the set, which has now been updated to XLP_ALIGN as well. > > > Additionally, XLP_ALIGN has been updated to use TYPEALIGN64 instead of > > > TYPEALIGN so that we don't lose bits of the aligned value in 32-bit > > > systems. > > > > Apparently there was some usage of MAXALIGN() in xlogreader that I'd > > missed, and which only shows up in TAP tests. In v3 I've fixed that, > > together with some improved early detection of invalid record headers. > > Another fix for CFBot - pg_waldump tests which were added in 96063e28 > exposed an issue in my patchset related to RM_INVALID_ID. > > v4 splits former patch 0006 into two: now 0006 adds RM_INVALID and > does the rmgr-related changes in the code, and 0007 does the WAL disk > format overhaul. V5 is a rebased version of v4, and includes the latest patch from "smaller XLRec block header" [0] as 0001. Kind regards, Matthias van de Meent [0] https://www.postgresql.org/message-id/CAEze2WhG_qvs0_HPCKyGLjFSSeiLZJcFhT%3DrzEUd7AzyxnSfKw%40mail.gmail.com
Вложения
- v5-0003-Rename-rmgr_identify-info-bits-argument-to-rmgrin.patch
- v5-0004-Move-rmgr-info-bits-into-their-own-field-in-the-x.patch
- v5-0002-Replace-accesses-to-rmgr-s-XLogRecord-info-bit-ac.patch
- v5-0001-Reduce-overhead-of-small-block-data-in-xlog-recor.patch
- v5-0005-Stop-the-logging-of-XIDs-in-records-where-the-XID.patch
- v5-0006-Add-RM_INVALID-an-invalid-resource-manager.patch
- v5-0007-Reformat-the-XLog-record-header.patch
On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote: > V5 is a rebased version of v4, and includes the latest patch from > "smaller XLRec block header" [0] as 0001. 0001 and 0007 are the meat of the changes. -#define XLR_CHECK_CONSISTENCY 0x02 +#define XLR_CHECK_CONSISTENCY (0x20) I can't help but notice that there are a few stylistic choices like this one that are part of the patch. Using parenthesis in the case of hexa values is inconsistent with the usual practices I've seen in the tree. #define COPY_HEADER_FIELD(_dst, _size) \ do { \ - if (remaining < _size) \ + if (remaining < (_size)) \ goto shortdata_err; \ There are a couple of stylistic changes like this one, that I guess could just use their own patch to make these macros easier to use. -#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info) +#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & XLR_INFO_MASK) +#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & XLR_RMGR_INFO_MASK) This stuff in 0002 is independent of 0001, am I right? Doing this split with an extra macro is okay by me, reducing the presence of XLR_INFO_MASK and bitwise operations based on it. 0003 is also mechanical, but if you begin to enforce the use of XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR identity callback, we should have at least a validity check to make sure that nothing, even custom RMGRs, pass down unexpected bits? I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and I fear that people are going to forget to set it. Wouldn't it be better to use an option where the XID is excluded instead, making the inclusing the an XID the default? > The resource manager has ID = 0, thus requiring some special > handling in other code. Apart from being generally useful, it is > used in future patches to detect the end of wal in lieu of a zero-ed > fixed-size xl_tot_len field. Err, no, that may not be true. See for example this thread where the topic of improving the checks of xl_tot_len and rely on this value on when a record header has been validated, even across page borders: https://www.postgresql.org/message-id/17928-aa92416a70ff44a2@postgresql.org Except that, in which cases could an invalid RMGR be useful? -- Michael
Вложения
On Wed, 20 Sept 2023 at 07:06, Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Sep 19, 2023 at 12:07:07PM +0200, Matthias van de Meent wrote: > > V5 is a rebased version of v4, and includes the latest patch from > > "smaller XLRec block header" [0] as 0001. > > 0001 and 0007 are the meat of the changes. Correct. > -#define XLR_CHECK_CONSISTENCY 0x02 > +#define XLR_CHECK_CONSISTENCY (0x20) > > I can't help but notice that there are a few stylistic choices like > this one that are part of the patch. Using parenthesis in the case of > hexa values is inconsistent with the usual practices I've seen in the > tree. Yes, I'll take another look at that. > #define COPY_HEADER_FIELD(_dst, _size) \ > do { \ > - if (remaining < _size) \ > + if (remaining < (_size)) \ > goto shortdata_err; \ > > There are a couple of stylistic changes like this one, that I guess > could just use their own patch to make these macros easier to use. They actually fix complaints of my IDE, but are otherwise indeed stylistic. > -#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info) > +#define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info & XLR_INFO_MASK) > +#define XLogRecGetRmgrInfo(decoder) (((decoder)->record->header.xl_info) & XLR_RMGR_INFO_MASK) > > This stuff in 0002 is independent of 0001, am I right? Doing this > split with an extra macro is okay by me, reducing the presence of > XLR_INFO_MASK and bitwise operations based on it. Yes, that change is to stop making use of (~XLR_INFO_MASK) where XLR_RMGR_INFO_MASK is the correct bitmask (whilst also being quite useful in the later patch). > 0003 is also mechanical, but if you begin to enforce the use of > XLR_RMGR_INFO_MASK as the bits allowed to be passed down to the RMGR > identity callback, we should have at least a validity check to make > sure that nothing, even custom RMGRs, pass down unexpected bits? I think that's already handled in XLogInsert(), but I'll make sure to add more checks if they're not in place yet. > I am not convinced that XLOG_INCLUDE_XID is a good interface, TBH, and > I fear that people are going to forget to set it. Wouldn't it be > better to use an option where the XID is excluded instead, making the > inclusing the an XID the default? Most rmgrs don't actually use the XID. Only XACT, MULTIXACT, HEAP, HEAP2, and LOGICALMSG use the xid, so I thought it would be easier to just find the places where those RMGR's records were being logged than to update all other places. I don't mind changing how we decide to log the XID, but I don't think EXCLUDE_XID is a good alternative: most records just don't need the transaction ID. There are many more index AMs with logging than table AMs, so I don't think it is that weird to default to 'not included'. > > The resource manager has ID = 0, thus requiring some special > > handling in other code. Apart from being generally useful, it is > > used in future patches to detect the end of wal in lieu of a zero-ed > > fixed-size xl_tot_len field. > > Err, no, that may not be true. See for example this thread where the > topic of improving the checks of xl_tot_len and rely on this value on > when a record header has been validated, even across page borders: > https://www.postgresql.org/message-id/17928-aa92416a70ff44a2@postgresql.org Yes, there are indeed exceptions when reusing WAL segments, but it's still a good canary, like xl_tot_len before this patch. > Except that, in which cases could an invalid RMGR be useful? A sentinel value that is obviously invalid is available for several types, e.g. BlockNumber, TransactionId, XLogRecPtr, Buffer, and this is quite useful if you want to check if something is definitely invalid. I think that's fine in principle, we're already "wasting" some IDs in the gap between RM_MAX_BUILTIN_ID and RM_MIN_CUSTOM_ID. In the current xlog infrastructure, we use xl_tot_len as that sentinel to detect whether a new record may exist, but in this patch that can't be used because the field may not exist and depends on other bytes. So I used xl_rmgr_id as the field to base the 'may a next record exist' checks on, which required the 0 rmgr ID to be invalid. Kind regards, Matthias van de Meent Neon (https://neon.tech)
On Mon, Sep 25, 2023 at 07:40:00PM +0200, Matthias van de Meent wrote: > On Wed, 20 Sept 2023 at 07:06, Michael Paquier <michael@paquier.xyz> wrote: >> #define COPY_HEADER_FIELD(_dst, _size) \ >> do { \ >> - if (remaining < _size) \ >> + if (remaining < (_size)) \ >> goto shortdata_err; \ >> >> There are a couple of stylistic changes like this one, that I guess >> could just use their own patch to make these macros easier to use. > > They actually fix complaints of my IDE, but are otherwise indeed stylistic. Oh, OK. I just use an old-school terminal, but no objections in changing these if they make life easier for some hackers. Still, that feels independant of what you are proposing here. -- Michael
Вложения
Hi and Happy New Year!
I've looked through the patches and the change seems quite small and justified. But at the second round, some doubt arises on whether this long patchset indeed introduces enough performance gain? I may be wrong, but it saves only several bytes and the performance gain would be only in some specific artificial workload. Did you do some measurements? Do we have several percent performance-wise?
Kind regards,
Pavel Borisov
> On 3 Jan 2024, at 15:15, Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > Hi and Happy New Year! > > I've looked through the patches and the change seems quite small and justified. But at the second round, some doubt ariseson whether this long patchset indeed introduces enough performance gain? I may be wrong, but it saves only severalbytes and the performance gain would be only in some specific artificial workload. Did you do some measurements?Do we have several percent performance-wise? > > Kind regards, > Pavel Borisov Hi Matthias! This is a kind reminder that the thread is waiting for your reply. Are you interesting in CF entry [0]? Thanks! Best regards, Andrey Borodin. [0] https://commitfest.postgresql.org/47/4386/
> On Jun 20, 2023, at 1:01 PM, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: > > 0001 is copied essentially verbatim from [1] and reduces overhead in > the registered block's length field where possible. It is included to > improve code commonality between varcoded integer fields. See [1] for > more details. Hi Matthias! I am interested in seeing this patch move forward. We seem to be stuck. The disagreement on the other thread seems to be about whether we can generalize and reuse variable integer encoding. Couldyou comment on whether perhaps we just need a few versions of that? Perhaps one version where the number of lengthbytes is encoded in the length itself (such as is used for varlena and by Andres' patch) and one where the number oflength bytes is stored elsewhere? You are clearly using the "elsewhere" form, but perhaps you could pull out the logicof that into src/common? In struct XLogRecordBlockHeader.id <http://xlogrecordblockheader.id/>, you are reserving twobits for the size class. (The code comments aren't clear about this, by the way.) Perhaps if the generalized lengthencoding logic could take a couple arguments to represent where and how the size class bits are to be stored, and wherethe length itself is stored? I doubt you need to sacrifice any performance gains of this patch to make that happen. You'd just need to restructure the patch. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company