Thread: Possible to go without page headers?

Possible to go without page headers?

From
Chris Cleveland
Date:
I'm writing an index access method with its own unique file format. It involves storing large blobs that break across pages.

The file format itself doesn't need or use page headers. There's no need for a checksum or to manage free space within the page.

Can I treat pages as just a flat, open 8k buffer and fill them with arbitrary data?

The reason I ask is that I see some reference to an LSN, used to determine when to dump a dirty buffer to disk, and don't know whether that is actually required. I plan to write a large number of pages all at once and I'm not yet quite sure how WAL logging will work. I also see some suggestion that the vacuum process uses page headers, but I haven't quite figured that out either.

Re: Possible to go without page headers?

From
Tom Lane
Date:
Chris Cleveland <ccleve+github@dieselpoint.com> writes:
> Can I treat pages as just a flat, open 8k buffer and fill them with
> arbitrary data?

No, at least not unless you plan to reimplement much of the WAL
mechanism.  You do need at least an LSN in the right place.
I kinda doubt that you can get away with ignoring checksumming,
either.  On the whole, I think you'd be best off to use a standard
page header; the amount you're saving by avoiding that will be
minuscule, and the amount of work you cause for yourself probably
not so much.
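
To make "an LSN in the right place" concrete, here is a minimal sketch in plain C. It is an illustration, not PostgreSQL source: DemoPageHeader and the demo_* functions are hypothetical names, and the struct only mirrors the fixed fields of the standard page header as I understand them (24 bytes, with the LSN in the first 8). The write check models the WAL-before-data rule, under which a dirty page should not reach disk before WAL has been flushed at least up to that page's LSN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLCKSZ 8192            /* default PostgreSQL block size */

/* Hypothetical mirror of the fixed fields of the standard page header. */
typedef struct DemoPageHeader
{
    uint32_t lsn_hi;                /* pd_lsn, high 32 bits */
    uint32_t lsn_lo;                /* pd_lsn, low 32 bits */
    uint16_t checksum;              /* pd_checksum */
    uint16_t flags;                 /* pd_flags */
    uint16_t lower;                 /* pd_lower: start of free space */
    uint16_t upper;                 /* pd_upper: end of free space */
    uint16_t special;               /* pd_special: start of special space */
    uint16_t pagesize_version;      /* pd_pagesize_version */
    uint32_t prune_xid;             /* pd_prune_xid */
} DemoPageHeader;

/* Read the LSN stored in the first 8 bytes of a page image. */
uint64_t demo_page_lsn(const unsigned char *page)
{
    DemoPageHeader hdr;

    memcpy(&hdr, page, sizeof hdr);
    return ((uint64_t) hdr.lsn_hi << 32) | hdr.lsn_lo;
}

/* Stamp the page with the LSN of the WAL record that last changed it. */
void demo_page_set_lsn(unsigned char *page, uint64_t lsn)
{
    DemoPageHeader hdr;

    memcpy(&hdr, page, sizeof hdr);
    hdr.lsn_hi = (uint32_t) (lsn >> 32);
    hdr.lsn_lo = (uint32_t) lsn;
    memcpy(page, &hdr, sizeof hdr);
}

/* WAL-before-data: the page may be written only if WAL is flushed up to its LSN. */
bool demo_page_may_be_written(const unsigned char *page, uint64_t wal_flushed_upto)
{
    return demo_page_lsn(page) <= wal_flushed_upto;
}

/* Share of an 8 kB page taken by the header, in percent. */
double demo_header_overhead_percent(void)
{
    assert(sizeof(DemoPageHeader) == 24);
    return 100.0 * sizeof(DemoPageHeader) / DEMO_BLCKSZ;
}
```

Under these assumptions the header costs 24 of 8192 bytes, roughly 0.3% of each page, which is the "minuscule" saving referred to above. A flat buffer filled with arbitrary bytes would instead present its first 8 bytes as an LSN to whatever code applies this rule.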

BTW, there are also tools such as pg_filedump that expect that index
pages can be identified by some sort of magic number kept in the
"special space" at the page tail.  You're not absolutely bound to make
that work, but you'll be cutting yourself off from some potentially
handy support.
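
The "magic number in the special space" idea can be sketched as follows. Everything named Demo* or DEMO_* is hypothetical, including the ID value 0xFE42 and the fields of the special struct; the only convention assumed is that the identifier sits in the last two bytes of the page, which is where I believe several built-in index AMs keep theirs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLCKSZ  8192
#define DEMO_PAGE_ID 0xFE42         /* hypothetical magic number for this AM */

/* Hypothetical per-page trailer stored in the special space at the page tail. */
typedef struct DemoSpecial
{
    uint32_t next_blkno;            /* hypothetical: next page of the same blob */
    uint16_t flags;                 /* hypothetical: page-level flags */
    uint16_t page_id;               /* magic number, the last two bytes of the page */
} DemoSpecial;

/* Byte offset at which the special space begins (the value pd_special would hold). */
uint16_t demo_special_offset(void)
{
    return (uint16_t) (DEMO_BLCKSZ - sizeof(DemoSpecial));
}

/* Write the trailer, including the magic number, at the end of the page. */
void demo_write_special(unsigned char *page, uint32_t next_blkno)
{
    DemoSpecial sp = { next_blkno, 0, DEMO_PAGE_ID };

    /* Assumption: the special space is kept 8-byte aligned. */
    assert(sizeof(DemoSpecial) % 8 == 0);
    memcpy(page + demo_special_offset(), &sp, sizeof sp);
}

/* What a dump tool might do: look at the last two bytes to recognise the page. */
bool demo_is_our_page(const unsigned char *page)
{
    uint16_t id;

    memcpy(&id, page + DEMO_BLCKSZ - sizeof id, sizeof id);
    return id == DEMO_PAGE_ID;
}
```

In a real access method the offset returned by demo_special_offset() would also be recorded in the header's pd_special field, so that generic code can locate the trailer without knowing its type.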

            regards, tom lane



Re: Possible to go without page headers?

From
David Steele
Date:
On 2/14/22 16:19, Tom Lane wrote:
> Chris Cleveland <ccleve+github@dieselpoint.com> writes:
>> Can I treat pages as just a flat, open 8k buffer and fill them with
>> arbitrary data?
> 
> No, at least not unless you plan to reimplement much of the WAL
> mechanism.  You do need at least an LSN in the right place.
> I kinda doubt that you can get away with ignoring checksumming,
> either.  On the whole, I think you'd be best off to use a standard
> page header; the amount you're saving by avoiding that will be
> minuscule, and the amount of work you cause for yourself probably
> not so much.
> 
> BTW, there are also tools such as pg_filedump that expect that index
> pages can be identified by some sort of magic number kept in the
> "special space" at the page tail.  You're not absolutely bound to make
> that work, but you'll be cutting yourself off from some potentially
> handy support.

You'll also get errors from external tools (like pgBackRest) that 
validate checksums and headers.

Regards,
-David



Re: Possible to go without page headers?

From
Peter Geoghegan
Date:
On Mon, Feb 14, 2022 at 2:19 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> No, at least not unless you plan to reimplement much of the WAL
> mechanism.  You do need at least an LSN in the right place.
> I kinda doubt that you can get away with ignoring checksumming,
> either.  On the whole, I think you'd be best off to use a standard
> page header; the amount you're saving by avoiding that will be
> minuscule, and the amount of work you cause for yourself probably
> not so much.

It isn't actually necessary for an index AM to use the standard
slotted page format to get the benefits that you mention, of course --
whether or not an index AM that uses standard page headers *also* uses
slotted pages with standard line pointers is a separate question. For
example, GIN posting tree pages don't use standard line pointers, but
still have a standard page header (and a generic GIN special area in
the opaque space).
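
Applied to the original use case (large blobs that break across pages), a page can carry a standard header while the rest is treated as a flat payload with no line pointers. The sketch below is a standalone illustration under stated assumptions: the demo_* names are hypothetical, the header is taken to be 24 bytes with pd_lower, pd_upper and pd_special at byte offsets 12, 14 and 16, there is no special space, and pd_lower is used to mark where the payload ends (one possible convention, chosen here so that generic code does not mistake payload for unused space).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BLCKSZ         8192
#define DEMO_HEADER_SIZE    24      /* assumed size of the standard page header */
#define DEMO_LOWER_OFFSET   12      /* assumed byte offset of pd_lower */
#define DEMO_UPPER_OFFSET   14      /* assumed byte offset of pd_upper */
#define DEMO_SPECIAL_OFFSET 16      /* assumed byte offset of pd_special */
#define DEMO_PAYLOAD_MAX    ((size_t) (DEMO_BLCKSZ - DEMO_HEADER_SIZE))

/* Number of pages needed to hold a blob of the given length. */
size_t demo_pages_needed(size_t blob_len)
{
    return (blob_len + DEMO_PAYLOAD_MAX - 1) / DEMO_PAYLOAD_MAX;
}

/* Split the blob across pages; each page keeps its header and a flat payload. */
size_t demo_store_blob(unsigned char (*pages)[DEMO_BLCKSZ], size_t npages,
                       const unsigned char *blob, size_t blob_len)
{
    size_t needed = demo_pages_needed(blob_len);

    assert(needed <= npages);
    for (size_t i = 0; i < needed; i++)
    {
        size_t   off = i * DEMO_PAYLOAD_MAX;
        size_t   n = blob_len - off < DEMO_PAYLOAD_MAX ? blob_len - off
                                                       : DEMO_PAYLOAD_MAX;
        uint16_t lower = (uint16_t) (DEMO_HEADER_SIZE + n);
        uint16_t end = DEMO_BLCKSZ;

        memset(pages[i], 0, DEMO_BLCKSZ);
        /* pd_lower marks the end of the payload; no special space in this sketch. */
        memcpy(pages[i] + DEMO_LOWER_OFFSET, &lower, sizeof lower);
        memcpy(pages[i] + DEMO_UPPER_OFFSET, &end, sizeof end);
        memcpy(pages[i] + DEMO_SPECIAL_OFFSET, &end, sizeof end);
        /* The payload starts right after the header, with no line pointer array. */
        memcpy(pages[i] + DEMO_HEADER_SIZE, blob + off, n);
    }
    return needed;
}

/* Recover the payload length of one page from its pd_lower value. */
size_t demo_page_payload_len(const unsigned char *page)
{
    uint16_t lower;

    memcpy(&lower, page + DEMO_LOWER_OFFSET, sizeof lower);
    return (size_t) lower - DEMO_HEADER_SIZE;
}
```

With these numbers a 10000-byte blob occupies two pages, 8168 bytes on the first and 1832 on the second. The sketch leaves the LSN and checksum fields zeroed; in a real access method those would be maintained through the usual buffer and WAL routines.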

I agree that it's hard to imagine that opting out of using the
standard page header format could ever make much sense. Principally
because the restrictions imposed on an index AM that uses the standard
page header format are very minimal, while the benefits are
substantial.

-- 
Peter Geoghegan



Re: Possible to go without page headers?

From
Tom Lane
Date:
Peter Geoghegan <pg@bowt.ie> writes:
> It isn't actually necessary for an index AM to use the standard
> slotted page format to get the benefits that you mention, of course --
> whether or not an index AM that uses standard page headers *also* uses
> slotted pages with standard line pointers is a separate question.

Right, you don't need to use a line pointer array if you don't want
to.  (IIRC, hash also opts out of that in some pages.)  I took the
question to be just about the page header proper.

            regards, tom lane