Re: MaxOffsetNumber for Table AMs

From: Robert Haas
Subject: Re: MaxOffsetNumber for Table AMs
Msg-id: CA+TgmoZ0S5zU4OpBxQvJ_ifu1LDcvc1z6i=XAXnnM29GvB6Hfw@mail.gmail.com
In response to: Re: MaxOffsetNumber for Table AMs  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses: Re: MaxOffsetNumber for Table AMs  (Hannu Krosing <hannuk@google.com>)
List: pgsql-hackers

On Wed, May 5, 2021 at 3:43 PM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
> I believe that it cannot be "just" an additive thing, at least not
> through a normal INCLUDEd column, as you'd get duplicate TIDs in the
> index, with its related problems. You also cannot add it as a key
> column, as this would disable UNIQUE indexes; one of the largest use
> cases of global indexes. So, you must create specialized
> infrastructure for this identifier.
>
> And when we're already adding specialized infrastructure, then this
> should probably be part of a new TID infrastructure.
>
> And if we're going to change TID infrastructure to allow for more
> sizes (as we'd need normal TableAM TIDs, and global index
> partition-identifying TIDs), I'd argue that it should not be too much
> more difficult to create an infrastructure for 'new TID' in which the
> table AM supplies type, size and strict ordering information for these
> 'new TID's.
>
> And if this 'new TID' size is not going to be defined by the index AM
> but by the indexed object (be it a table or a 'global' or whatever
> we'll build indexes on), I see no reason why this 'new TID'
> infrastructure couldn't eventually support variable length TIDs; or
> constant sized usertype TIDs (e.g. the 3 int columns of the primary
> key of a clustered table).
>
> The only requirements that I believe to be fundamental for any kind of TID are
>
> 1.) Uniqueness during the lifecycle of the tuple, from creation to
> life to dead to fully dereferenced from all indexes;
> 2.) There exists a strict ordering of all TIDs of that type;
>
> And maybe to supply some form of efficiency to the underlying tableAM:
>
> 3.) There should be an equivalent of bitmap for that TID type.
>
> For the nbtree deduplication subsystem, and for gin posting lists to
> be able to work efficiently, the following must also hold:
>
> 4.) The TID type has a fixed size, preferably efficiently packable.
>
> Only the last requirement cannot be met with varlena TID types. But,
> as I also believe that not all indexes can be expected to work (well)
> for all kinds of TableAM, I don't see how this would be a blocking
> issue.

+1 to all of that.
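
To make requirements 1, 2 and 4 from the quoted list concrete, here is a
minimal sketch in Python. It is not PostgreSQL code. The 6-byte layout
(4-byte block number plus 2-byte offset) mirrors the size of the heap's
current TID, but the names TID_FORMAT, pack_tid, build_posting_list and
posting_list_contains are hypothetical and exist only for this
illustration. The point is that a fixed-size, strictly ordered TID can be
packed densely and binary-searched, which is roughly what deduplication
and posting lists rely on.

```python
import struct

# Hypothetical fixed-size TID layout: 4-byte block number + 2-byte offset.
# Big-endian packing makes byte-wise comparison agree with (block, offset)
# ordering, which gives us a strict total order (requirement 2).
TID_FORMAT = ">IH"
TID_SIZE = struct.calcsize(TID_FORMAT)  # fixed size (requirement 4)


def pack_tid(block, offset):
    """Encode one TID into its fixed-size byte representation."""
    return struct.pack(TID_FORMAT, block, offset)


def unpack_tid(raw):
    """Decode a fixed-size byte string back into (block, offset)."""
    return struct.unpack(TID_FORMAT, raw)


def build_posting_list(tids):
    """Pack TIDs into one contiguous, sorted byte string.

    Duplicates are dropped, since a TID must be unique for the whole
    lifecycle of its tuple (requirement 1).
    """
    unique_sorted = sorted(set(tids))
    return b"".join(pack_tid(b, o) for b, o in unique_sorted)


def posting_list_contains(posting, tid):
    """Binary search over fixed-size slots in the packed posting list."""
    n = len(posting) // TID_SIZE
    target = pack_tid(*tid)
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        slot = posting[mid * TID_SIZE:(mid + 1) * TID_SIZE]
        if slot < target:
            lo = mid + 1
        else:
            hi = mid
    return lo < n and posting[lo * TID_SIZE:(lo + 1) * TID_SIZE] == target


posting = build_posting_list([(2, 1), (1, 5), (1, 5), (0, 7)])
print(len(posting) // TID_SIZE, posting_list_contains(posting, (1, 5)))
```

With a varlena TID the slot arithmetic above (mid * TID_SIZE) no longer
works, which is one way to see why only requirement 4 is at odds with
variable-length TIDs.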

> Storage gains for index-oriented tables can become as large as the
> size of the primary key by not having to store all primary key values
> in both the index and the table; which can thus be around 100% of a
> table in the least efficient cases of having a PK over all columns.
>
> Yes, this might be indeed only a 'small gain' for access latency, but
> not needing to store another copy of your data (and keeping it in
> cache, etc.) is a significant win in my book.

This is a really good point. Also, if the table is ordered by a
synthetic logical TID, range scans on the primary key will be less
efficient than if the primary key is itself the TID. We have the
ability to CLUSTER on an index for good reasons, and "Automatically
maintain clustering on a table" has been on the todo list forever.
It's hard to imagine this will ever be achieved with the current heap,
though: the way to get there is to have a table AM for which this is
an explicit goal.
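
A toy model of the range-scan point, again in Python and only as a
sketch. It assumes 100 rows per page and assumes that a synthetic TID
order is unrelated to primary key order (modelled here as a random
shuffle). Both are assumptions made for the illustration, not
measurements of any real table AM, and the function names are
hypothetical.

```python
import random

ROWS_PER_PAGE = 100  # assumed page capacity, purely illustrative


def clustered_layout(n_rows):
    """Rows stored in primary key order: key k lives on page k // capacity."""
    return {k: k // ROWS_PER_PAGE for k in range(n_rows)}


def synthetic_tid_layout(n_rows, seed=0):
    """Rows stored in an order unrelated to the key (assumed random here)."""
    keys = list(range(n_rows))
    random.Random(seed).shuffle(keys)
    return {k: pos // ROWS_PER_PAGE for pos, k in enumerate(keys)}


def pages_touched(key_to_page, lo, hi):
    """Count distinct pages visited by a scan over keys in [lo, hi)."""
    return len({key_to_page[k] for k in range(lo, hi)})


n_rows = 10_000
clustered = pages_touched(clustered_layout(n_rows), 1000, 1200)
scattered = pages_touched(synthetic_tid_layout(n_rows), 1000, 1200)
print(clustered, scattered)
```

In this model the clustered layout reads the minimum possible number of
pages for the range, while the unrelated ordering spreads the same keys
over many more pages.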

-- 
Robert Haas
EDB: http://www.enterprisedb.com


