On 31.05.2013 06:02, Robert Haas wrote:
> On Thu, May 30, 2013 at 2:39 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Random thought: Could you compute the reference XID based on the page
>> LSN? That would eliminate the storage overhead.
>
> After mulling this over a bit, I think this is definitely possible.
> We begin a new "half-epoch" every 2 billion transactions. We remember
> the LSN at which the current half-epoch began and the LSN at which the
> previous half-epoch began. When a new half-epoch begins, the first
> backend that wants to stamp a tuple with an XID from the new
> half-epoch must first emit a "new half-epoch" WAL record, which
> becomes the starting LSN for the new half-epoch.

Clever! Pages in unlogged tables need some extra treatment, as they
don't normally have a valid LSN, but that shouldn't be too hard.

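
To make the bookkeeping concrete, here is a rough sketch of how a page LSN
could be compared against the two remembered half-epoch start LSNs. All of
the names (Lsn, HalfEpochState, half_epochs_behind) are made up for
illustration and are not from any existing patch. It also assumes that a
page LSN equal to a half-epoch's start LSN counts as belonging to that
half-epoch, which a real patch would have to decide.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit WAL position. */
typedef uint64_t Lsn;

/* Hypothetical shared state: start LSNs of the two newest half-epochs. */
typedef struct HalfEpochState
{
    Lsn cur_start;   /* LSN of the latest "new half-epoch" WAL record */
    Lsn prev_start;  /* LSN of the "new half-epoch" record before that */
} HalfEpochState;

/*
 * Count the half-epoch boundaries that lie after page_lsn.
 * Returns 0 (page last stamped in the current half-epoch),
 * 1 (in the previous one), or 2 (two or more boundaries back;
 * only two start LSNs are remembered, so that is all we can tell).
 */
int
half_epochs_behind(const HalfEpochState *st, Lsn page_lsn)
{
    if (page_lsn >= st->cur_start)
        return 0;
    if (page_lsn >= st->prev_start)
        return 1;
    return 2;
}
```

With cur_start = 2000 and prev_start = 1000, a page LSN of 1500 is one
half-epoch behind and a page LSN of 999 is at least two behind. Unlogged
pages would need a separate path, since they have no valid LSN to compare.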
> We define a new page-level bit, something like PD_RECENTLY_FROZEN.
> When this bit is set, it means there are no unfrozen tuples on the
> page with XIDs that predate the current half-epoch. Whenever we know
> this to be true, we set the bit. If the page LSN crosses more than
> one half-epoch boundary at a time, we freeze the page and set the bit.
> If the page LSN crosses exactly one half-epoch boundary, then (1) if
> the bit is set, we clear it and (2) if the bit is not set, we freeze
> the page and set the bit.

Yep, I think that would work. Want to write the patch, or should I? ;-)

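
For reference, the rule quoted above boils down to a small state
transition. This is only a sketch: update_recently_frozen and its
arguments are hypothetical, the flag is modelled as a plain bool rather
than a bit in the page header, and the actual tuple freezing is left to
the caller.

```c
#include <stdbool.h>

/*
 * Apply the PD_RECENTLY_FROZEN rule when a page's LSN is about to advance.
 *
 * boundaries_crossed is the number of half-epoch boundaries between the
 * page's old LSN and its new one (0, 1, or 2 meaning "more than one").
 * *recently_frozen is the page's flag and is updated in place.
 *
 * Returns true if the caller must freeze the page before stamping it.
 */
bool
update_recently_frozen(bool *recently_frozen, int boundaries_crossed)
{
    if (boundaries_crossed >= 2)
    {
        /* More than one boundary: freeze the page and set the bit. */
        *recently_frozen = true;
        return true;
    }

    if (boundaries_crossed == 1)
    {
        if (*recently_frozen)
        {
            /* (1) Bit was set: clear it, no freeze needed yet. */
            *recently_frozen = false;
            return false;
        }

        /* (2) Bit was not set: freeze the page and set the bit. */
        *recently_frozen = true;
        return true;
    }

    /* No boundary crossed: nothing to do, the bit keeps its value. */
    return false;
}
```

The case with no boundary crossed is not spelled out in the quoted text;
leaving the bit untouched there is my reading of it.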
- Heikki