Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

From: Michail Nikolaev
Subject: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Msg-id: CANtu0oipL3e8fLnejbH4HnByMW6G_auR4v+ns8j-UHhuPW=9og@mail.gmail.com
In response to: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Responses: Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Michail Nikolaev <michail.nikolaev@gmail.com>)
Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
List: pgsql-hackers
Hello!

> I think the best way for this to work would be an index method that
> exclusively stores TIDs, and of which we can quickly determine new
> tuples, too. I was thinking about something like GIN's format, but
> using (generation number, tid) instead of ([colno, colvalue], tid) as
> key data for the internal trees, and would be unlogged (because the
> data wouldn't have to survive a crash)

Yeah, this seems to be a reasonable approach, but I have some
doubts about it: it requires both a new index type and unlogged
indexes to be introduced, which may make the patch too invasive to be
merged. Also, some way to remove the index from the catalog after
a crash may be required.
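
To make the quoted idea a bit more concrete, here is a toy model in Python
(not PostgreSQL code, and not from any patch). It assumes the auxiliary
structure only needs two operations: record a TID under the current
generation, and return every TID recorded since a given generation. The
names TidLog, insert, advance_generation and new_since are all hypothetical.

```python
import bisect


class TidLog:
    """Toy stand-in for a TID-only auxiliary index keyed by (generation, tid)."""

    def __init__(self):
        self.generation = 0
        self._keys = []  # kept sorted: (generation, (block, offset))

    def insert(self, tid):
        # A concurrent inserter records only the TID, tagged with the
        # generation that is current at insert time.
        bisect.insort(self._keys, (self.generation, tid))

    def advance_generation(self):
        # Assumption: the index builder bumps the generation each time it
        # takes a new snapshot, so later inserts sort after earlier ones.
        self.generation += 1
        return self.generation

    def new_since(self, generation):
        # Binary search for the first key with key.generation >= generation;
        # the one-element tuple sorts before any (generation, tid) pair.
        start = bisect.bisect_left(self._keys, (generation,))
        return [tid for _, tid in self._keys[start:]]
```

Because the generation is the leading key column, "which tuples are new"
becomes a single range scan rather than a pass over the whole structure.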

A few more thoughts:
* it is possible to go without a generation number - we could provide a
way to do some kind of fast index lookup (by TID) directly during the
second table scan phase.
* one more option is to maintain a Tuplesort (instead of an index)
with TIDs as a changelog, and merge it with the index snapshot after
taking a new visibility snapshot. But it is not clear how to share the
same Tuplesort between multiple inserting backends.
* crazy idea - what about doing the scan in the index we are
building? We have the tuple, so we have all the data that is indexed in
the index. We could do an index scan using that data to get all matching
tuples and find the one with our TID :) Yes, in some cases it may
perform too badly because of the huge number of TIDs we need to scan,
and also btree copies the whole page even though we need a single item.
But some additional index method may help - it feels like something
related to uniqueness (but that is only in btree anyway).
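
The last idea can be sketched with a toy model in Python (again, not
PostgreSQL code). It assumes the index maps an indexed key to the list of
heap TIDs carrying that key; ToyIndex, contains and second_scan are
hypothetical names. The lookup also returns how many entries it examined,
to show where the cost comes from when a key has many duplicates.

```python
from collections import defaultdict


class ToyIndex:
    """Toy index: maps an indexed key to the heap TIDs that carry it."""

    def __init__(self):
        self._entries = defaultdict(list)

    def insert(self, key, tid):
        self._entries[key].append(tid)

    def contains(self, key, tid):
        # Scan every entry with this key, looking for our TID.
        # Returns (found, entries_examined).
        examined = 0
        for candidate in self._entries.get(key, ()):
            examined += 1
            if candidate == tid:
                return True, examined
        return False, examined


def second_scan(heap, index):
    # heap: iterable of (tid, key) pairs visible to the new snapshot.
    # Insert only the tuples that the index does not reference yet.
    added = []
    for tid, key in heap:
        found, _ = index.contains(key, tid)
        if not found:
            index.insert(key, tid)
            added.append(tid)
    return added
```

With a heap of four tuples, two of which are already indexed, second_scan
adds only the two missing TIDs. A lookup for a key with N duplicates
examines up to N entries, which is the worst case mentioned above.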

Thanks,
Mikhail.


