Re: Improve WALRead() to suck data directly from WAL buffers when possible

Поиск
Список
Период
Сортировка
От Andres Freund
Тема Re: Improve WALRead() to suck data directly from WAL buffers when possible
Дата
Msg-id 20240212233648.4ephivkh7ukkiv2k@awork3.anarazel.de
обсуждение исходный текст
Ответ на Re: Improve WALRead() to suck data directly from WAL buffers when possible  (Jeff Davis <pgsql@j-davis.com>)
Ответы Re: Improve WALRead() to suck data directly from WAL buffers when possible  (Jeff Davis <pgsql@j-davis.com>)
Re: Improve WALRead() to suck data directly from WAL buffers when possible  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Список pgsql-hackers
Hi,

On 2024-02-12 12:46:00 -0800, Jeff Davis wrote:
> On Mon, 2024-02-12 at 12:18 -0800, Andres Freund wrote:
> > +    upto = Min(startptr + count, LogwrtResult.Write);
> > +    nbytes = upto - startptr;
> >
> > Shouldn't it pretty much be a bug to ever encounter this?
>
> In the current code it's impossible, though Bharath hinted at an
> extension which could reach that path.
>
> What I committed was a bit of a compromise -- earlier versions of the
> patch supported reading right up to the Insert pointer (which requires
> a call to WaitXLogInsertionsToFinish()). I wasn't ready to commit that
> code without seeing a more about how that would be used, but I thought
> it was reasonable to have some simple code in there to allow reading up
> to the Write pointer.

I doubt there's a sane way to use WALRead() without *first* ensuring that the
range of data is valid. I think we're better of moving that responsibility
explicitly to the caller and adding an assertion verifying that.


> It seems closer to the structure that we will ultimately need to
> replicate unflushed data, right?

It doesn't really seem like a necessary, or even particularly useful,
part. You couldn't just call WALRead() for that, since the caller would need
to know the range up to which WAL is valid but not yet flushed as well. Thus
the caller would need to first use WaitXLogInsertionsToFinish() or something
like it anyway - and then there's no point in doing the WALRead() anymore.

Note that for replicating unflushed data, we *still* might need to fall back
to reading WAL data from disk. In which case not asserting in WALRead() would
just make it hard to find bugs, because not using WaitXLogInsertionsToFinish()
would appear to work as long as data is in wal buffers, but as soon as we'd
fall back to on-disk (but unflushed) data, we'd send bogus WAL.

Greetings,

Andres Freund



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Matthias van de Meent
Дата:
Сообщение: Re: Reducing output size of nodeToString
Следующее
От: Andres Freund
Дата:
Сообщение: Re: glibc qsort() vulnerability