Re: MaxOffsetNumber for Table AMs

From Robert Haas
Subject Re: MaxOffsetNumber for Table AMs
Date
Msg-id CA+TgmoZPiH3b3HtSiOvDn8S4tSknmyp8F6oWqbSzFJSv2zXGqA@mail.gmail.com
In response to Re: MaxOffsetNumber for Table AMs  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: MaxOffsetNumber for Table AMs  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Fri, Apr 30, 2021 at 5:22 PM Peter Geoghegan <pg@bowt.ie> wrote:
> I strongly suspect that index-organized tables (or indirect indexes,
> or anything else that assumes that TID-like identifiers map directly
> to logical rows as opposed to physical versions) are going to break
> too many assumptions to ever be tractable. Assuming I have that right,
> it would advance the discussion if we could all agree on that being a
> non-goal for the tableam interface in general.

I *emphatically* disagree with the idea of ruling such things out
categorically. This is just as naive as the TODO's statement that we
do not want "All backends running as threads in a single process".
Does anyone really believe that we don't want that any more? I
believed it 10 years ago, but not any more. It's costing us very
substantially, not only in that it makes parallel query more
complicated and fragile, but more importantly in that we can't scale
up to connection counts that other databases can handle because we use
up too many operating system resources. Supporting threading in
PostgreSQL isn't a project that someone will pull off over a long
weekend and it's not something that has to be done tomorrow, but it's
pretty clearly the future.

So here. The complexity of getting a table AM that does anything
non-trivial working is formidable, and I don't expect it to happen
right away. Picking one that is essentially block-based and can use
48-bit TIDs is very likely the right initial target because that's the
closest to what we have now, and there's no sense attacking the hardest
variant of the problem first. However, as with the
threads-vs-processes example, I strongly suspect that having only one
table AM is leaving vast amounts of performance on the table. To say
that we're never going to pursue the parts of that space that require
a different kind of tuple identifier is to permanently write off tons
of ideas that have produced promising results in other systems. Let's
not do that.
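
For readers less familiar with the term, a "48-bit TID" here means a tuple identifier made of a 32-bit block number plus a 16-bit offset within that block, which is why it suits a block-based table AM. The following is only a minimal sketch of that layout; `DemoTid`, `demo_tid_pack` and `demo_tid_unpack` are hypothetical names for illustration and are not PostgreSQL's actual `ItemPointerData` definition or API.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a block-based tuple identifier:
 * 32 bits of block number + 16 bits of offset = 48 bits total.
 */
typedef struct DemoTid
{
    uint32_t block;   /* which block of the relation */
    uint16_t offset;  /* which slot inside that block */
} DemoTid;

/* Pack the identifier into the low 48 bits of a 64-bit integer. */
static uint64_t
demo_tid_pack(DemoTid tid)
{
    return ((uint64_t) tid.block << 16) | (uint64_t) tid.offset;
}

/* Recover the block number and offset from the packed form. */
static DemoTid
demo_tid_unpack(uint64_t packed)
{
    DemoTid tid;

    tid.block = (uint32_t) (packed >> 16);
    tid.offset = (uint16_t) (packed & 0xFFFF);
    return tid;
}
```

A table AM whose row identifiers do not decompose into "block, then slot" (for example one keyed by a logical row key) has nothing natural to put in these two fields, which is the kind of design the paragraph above argues should not be ruled out.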

-- 
Robert Haas
EDB: http://www.enterprisedb.com


