Обсуждение: bgwriter doesn't flush WAL stats
Hi,
Example output before patch applied:
┌─────────────┬─────────────────┐
│ view_name │ total_wal_write │
├─────────────┼─────────────────┤
│ pg_stat_wal │ 10318 │
│ pg_stat_io │ 10321 │
└─────────────┴─────────────────┘
┌─────────────────────┬────────┬────────┐
│ backend_type │ object │ writes │
├─────────────────────┼────────┼────────┤
│ autovacuum launcher │ wal │ 0 │
│ autovacuum worker │ wal │ 691 │
│ client backend │ wal │ 8170 │
│ background worker │ wal │ 0 │
│ background writer │ wal │ 3 │
│ checkpointer │ wal │ 1 │
│ standalone backend │ wal │ 737 │
│ startup │ wal │ 0 │
│ walsender │ wal │ 0 │
│ walwriter │ wal │ 719 │
I was trying to add WAL stats to pg_stat_io. While doing that I was comparing pg_stat_wal and pg_stat_io's WAL stats and there was some inequality between the total number of WALs. I found that the difference comes from bgwriter's WALs. bgwriter generates WAL but it doesn't flush them because the pgstat_report_wal() function isn't called in bgwriter. I attached a small patch for calling the pgstat_report_wal() function in bgwriter.
bgwriter generates WAL by calling functions in this order:
bgwriter.c -> BackgroundWriterMain() -> BgBufferSync() -> SyncOneBuffer() -> FlushBuffer() -> XLogFlush() -> XLogWrite()
bgwriter.c -> BackgroundWriterMain() -> BgBufferSync() -> SyncOneBuffer() -> FlushBuffer() -> XLogFlush() -> XLogWrite()
I used a query like BEGIN; followed by lots of(3000 in my case) INSERT, DELETE, or UPDATE, followed by a COMMIT while testing.
┌─────────────┬─────────────────┐
│ view_name │ total_wal_write │
├─────────────┼─────────────────┤
│ pg_stat_wal │ 10318 │
│ pg_stat_io │ 10321 │
└─────────────┴─────────────────┘
┌─────────────────────┬────────┬────────┐
│ backend_type │ object │ writes │
├─────────────────────┼────────┼────────┤
│ autovacuum launcher │ wal │ 0 │
│ autovacuum worker │ wal │ 691 │
│ client backend │ wal │ 8170 │
│ background worker │ wal │ 0 │
│ background writer │ wal │ 3 │
│ checkpointer │ wal │ 1 │
│ standalone backend │ wal │ 737 │
│ startup │ wal │ 0 │
│ walsender │ wal │ 0 │
│ walwriter │ wal │ 719 │
└─────────────────────┴────────┴────────┘
After the patch has been applied, there are no differences between pg_stat_wal and pg_stat_io.
I appreciate any comment/feedback on this patch.
Regards,
Nazir Bilal Yavuz
Microsoft
Regards,
Nazir Bilal Yavuz
Microsoft
Вложения
On Wed, 21 Jun 2023 at 13:04, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote: > > Hi, > > I was trying to add WAL stats to pg_stat_io. While doing that I was comparing pg_stat_wal and pg_stat_io's WAL stats andthere was some inequality between the total number of WALs. I found that the difference comes from bgwriter's WALs. bgwritergenerates WAL but it doesn't flush them because the pgstat_report_wal() function isn't called in bgwriter. I attacheda small patch for calling the pgstat_report_wal() function in bgwriter. > > bgwriter generates WAL by calling functions in this order: > bgwriter.c -> BackgroundWriterMain() -> BgBufferSync() -> SyncOneBuffer() -> FlushBuffer() -> XLogFlush() -> XLogWrite() I was quite confused here, as XLogWrite() does not generate any WAL; it only writes existing WAL from buffers to disk. In a running PostgreSQL instance, WAL is only generated through XLogInsert(xloginsert.c) and serialized / written to buffers in its call to XLogInsertRecord(xlog.c); XLogFlush and XLogWrite are only responsible for writing those buffers to disk. The only path that I see in XLogWrite() that could potentially put anything into WAL is through RequestCheckpoint(), but that only writes out a checkpoint when it is not in a postmaster environment - in all other cases it will wake up the checkpointer and wait for that checkpoint to finish. I also got confused with your included views; they're not included in the patch and the current master branch doesn't emit object=wal, so I can't really check that the patch works as intended. But on the topic of reporting the WAL stats in bgwriter; that seems like a good idea to fix, yes. +1 Kind regards, Matthias van de Meent Neon, Inc.
Hi,
Thanks for the explanation.
On Wed, 21 Jun 2023 at 18:03, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>
> On Wed, 21 Jun 2023 at 13:04, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> > I was trying to add WAL stats to pg_stat_io. While doing that I was comparing pg_stat_wal and pg_stat_io's WAL stats and there was some inequality between the total number of WALs. I found that the difference comes from bgwriter's WALs. bgwriter generates WAL but it doesn't flush them because the pgstat_report_wal() function isn't called in bgwriter. I attached a small patch for calling the pgstat_report_wal() function in bgwriter.
> >
> > bgwriter generates WAL by calling functions in this order:
> > bgwriter.c -> BackgroundWriterMain() -> BgBufferSync() -> SyncOneBuffer() -> FlushBuffer() -> XLogFlush() -> XLogWrite()
>
> I was quite confused here, as XLogWrite() does not generate any WAL;
> it only writes existing WAL from buffers to disk.
> In a running PostgreSQL instance, WAL is only generated through
> XLogInsert(xloginsert.c) and serialized / written to buffers in its
> call to XLogInsertRecord(xlog.c); XLogFlush and XLogWrite are only
> responsible for writing those buffers to disk.
Yes, you are right. Correct explanation should be "bgwriter writes existing WAL from buffers to disk but pg_stat_wal doesn't count them because bgwriter doesn't call pgstat_report_wal() to update WAL statistics".
> I also got confused with your included views; they're not included in
> the patch and the current master branch doesn't emit object=wal, so I
> can't really check that the patch works as intended.
I attached a WIP patch for showing WAL stats in pg_stat_io.
Thanks for the explanation.
On Wed, 21 Jun 2023 at 18:03, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>
> On Wed, 21 Jun 2023 at 13:04, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote:
> > I was trying to add WAL stats to pg_stat_io. While doing that I was comparing pg_stat_wal and pg_stat_io's WAL stats and there was some inequality between the total number of WALs. I found that the difference comes from bgwriter's WALs. bgwriter generates WAL but it doesn't flush them because the pgstat_report_wal() function isn't called in bgwriter. I attached a small patch for calling the pgstat_report_wal() function in bgwriter.
> >
> > bgwriter generates WAL by calling functions in this order:
> > bgwriter.c -> BackgroundWriterMain() -> BgBufferSync() -> SyncOneBuffer() -> FlushBuffer() -> XLogFlush() -> XLogWrite()
>
> I was quite confused here, as XLogWrite() does not generate any WAL;
> it only writes existing WAL from buffers to disk.
> In a running PostgreSQL instance, WAL is only generated through
> XLogInsert(xloginsert.c) and serialized / written to buffers in its
> call to XLogInsertRecord(xlog.c); XLogFlush and XLogWrite are only
> responsible for writing those buffers to disk.
Yes, you are right. Correct explanation should be "bgwriter writes existing WAL from buffers to disk but pg_stat_wal doesn't count them because bgwriter doesn't call pgstat_report_wal() to update WAL statistics".
> I also got confused with your included views; they're not included in
> the patch and the current master branch doesn't emit object=wal, so I
> can't really check that the patch works as intended.
I attached a WIP patch for showing WAL stats in pg_stat_io.
After applying patch, I used these queries for the getting views I shared in the first mail;
Query for the first view:SELECT
'pg_stat_wal' AS view_name,
SUM(wal_write) AS total_wal_write
FROM
pg_stat_wal
UNION ALL
SELECT
'pg_stat_io' AS view_name,
SUM(writes) AS total_wal_write
FROM
pg_stat_io
WHERE
object = 'wal';
SELECT backend_type, object, writes FROM pg_stat_io where object = 'wal';
I also changed the description on the patch file and attached it.
Regards,
Nazir Bilal Yavuz
Microsoft
Вложения
At Wed, 21 Jun 2023 18:52:26 +0300, Nazir Bilal Yavuz <byavuz81@gmail.com> wrote in > I attached a WIP patch for showing WAL stats in pg_stat_io. Yeah, your diagnosis appears accurate. I managed to trigger an assertion failure quite easily when I added "Assert(!pgstat_have_pending_wal()) just after the call to pgstat_report_bgwriter(). Good find! I slightly inclined to place the added call after smgrcloseall() but it doesn't seem to cause any io-stats updates so the proposed first patch as-is looks good to me. Regarding the second patch, it introduces WAL IO time as a IOCONTEXT_NORMAL/IOOBJECT_WAL, but it doesn't seem to follow the convention or design of the pgstat_io component, which primarily focuses on shared buffer IOs. There was a brief mention about WAL IO during the development of pgstat_io [1]. >> It'd be different if we tracked WAL fsyncs more granularly - which would be >> quite interesting - but that's something for another day^Wpatch. >> >> > I do have a question about this. > So, if we were to start tracking WAL IO would it fit within this > paradigm to have a new IOPATH_WAL for WAL or would it add a separate > dimension? [1] https://www.postgresql.org/message-id/CAAKRu_bM55pj3pPRW0nd_-paWHLRkOU69r816AeztBBa-N1HLA%40mail.gmail.com regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Jun 21, 2023 at 9:49 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > Regarding the second patch, it introduces WAL IO time as a > IOCONTEXT_NORMAL/IOOBJECT_WAL, but it doesn't seem to follow the > convention or design of the pgstat_io component, which primarily > focuses on shared buffer IOs. I haven't reviewed the patch yet, but in my opinion having an IOOBJECT_WAL makes sense. I imagined that we would add WAL as an IOObject along with others such as an IOOBJECT_BYPASS for "bypass" IO (IO done through the smgr API directly) and an IOOBJECT_SPILL or something like it for spill files from joins/aggregates/etc. > > I do have a question about this. > > So, if we were to start tracking WAL IO would it fit within this > > paradigm to have a new IOPATH_WAL for WAL or would it add a separate > > dimension? Personally, I think WAL fits well as an IOObject. Then we can add IOCONTEXT_INIT and use that for WAL file initialization and IOCONTEXT_NORMAL for normal WAL writes/fysncs/etc. I don't think we need a new dimension for it as it feels like an IO target just like shared buffers and temporary buffers do. I think we should save adding new dimensions for relationships that we can't express in the existing paradigm. - Melanie
Hi,
Created a commitfest entry for this.
Link: https://commitfest.postgresql.org/43/4405/
Regards,
Nazir Bilal Yavuz
Microsoft
Created a commitfest entry for this.
Link: https://commitfest.postgresql.org/43/4405/
Regards,
Nazir Bilal Yavuz
Microsoft
On Thu, 22 Jun 2023 at 17:03, Melanie Plageman <melanieplageman@gmail.com> wrote:
On Wed, Jun 21, 2023 at 9:49 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> Regarding the second patch, it introduces WAL IO time as a
> IOCONTEXT_NORMAL/IOOBJECT_WAL, but it doesn't seem to follow the
> convention or design of the pgstat_io component, which primarily
> focuses on shared buffer IOs.
I haven't reviewed the patch yet, but in my opinion having an
IOOBJECT_WAL makes sense. I imagined that we would add WAL as an
IOObject along with others such as an IOOBJECT_BYPASS for "bypass" IO
(IO done through the smgr API directly) and an IOOBJECT_SPILL or
something like it for spill files from joins/aggregates/etc.
> > I do have a question about this.
> > So, if we were to start tracking WAL IO would it fit within this
> > paradigm to have a new IOPATH_WAL for WAL or would it add a separate
> > dimension?
Personally, I think WAL fits well as an IOObject. Then we can add
IOCONTEXT_INIT and use that for WAL file initialization and
IOCONTEXT_NORMAL for normal WAL writes/fysncs/etc. I don't think we
need a new dimension for it as it feels like an IO target just like
shared buffers and temporary buffers do. I think we should save adding
new dimensions for relationships that we can't express in the existing
paradigm.
- Melanie
The first patch, to flush the bgwriter's WAL stats to the stats collector, seems like a straightforward bug fix, so committed and backpatched that. Thank you! I didn't look at the second patch. -- Heikki Linnakangas Neon (https://neon.tech)
Hi, On Mon, 2 Oct 2023 at 13:08, Heikki Linnakangas <hlinnaka@iki.fi> wrote: > > The first patch, to flush the bgwriter's WAL stats to the stats > collector, seems like a straightforward bug fix, so committed and > backpatched that. Thank you! > > I didn't look at the second patch. Thanks for the push! Actual commitfest entry for the second patch is: https://commitfest.postgresql.org/45/4416/. I sent a second patch to this thread just to show how I found this bug. There is no need to review it, this commitfest entry could be closed as committed. Regards, Nazir Bilal Yavuz Microsoft