Re: Support for REINDEX CONCURRENTLY

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: Support for REINDEX CONCURRENTLY
Дата
Msg-id CAB7nPqTWX3T8y4gR9K+BWfQT-o-uJTD0TrVV-O0=NanR7G=DpA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Support for REINDEX CONCURRENTLY  (Andres Freund <andres@2ndquadrant.com>)
Ответы Re: Support for REINDEX CONCURRENTLY  (Andres Freund <andres@2ndquadrant.com>)
Список pgsql-hackers
On Tue, Aug 27, 2013 at 11:09 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-08-27 15:34:22 +0900, Michael Paquier wrote:
>> I have been working a little bit more on this patch for the next
>> commit fest. Compared to the previous version, I have removed the part
>> of the code where process running REINDEX CONCURRENTLY was waiting for
>> transactions holding a snapshot older than the snapshot xmin of
>> process running REINDEX CONCURRENTLY at the validation and swap phase.
>> At the validation phase, there was a risk that the newly-validated
>> index might not contain deleted tuples before the snapshot used for
>> validation was taken. I tried to break the code in this area by
>> playing with multiple sessions but couldn't. Feel free to try the code
>> and break it if you can!
>
> Hm. Do you have any justifications for removing those waits besides "I
> couldn't break it"? The logic for the concurrent indexing is pretty
> intricate and we've got it wrong a couple of times without noticing bugs
> for a long while, so I am really uncomfortable with statements like this.
Note that the waits on relation locks are not removed, only the wait
phases involving old snapshots.

During swap phase, process was waiting for transactions with older
snapshots than the one taken by transaction doing the swap as they
might hold the old index information. I think that we can get rid of
it thanks to the MVCC snapshots as other backends are now able to see
what is the correct index information to fetch.
After doing the new index validation, index has all the tuples
necessary, however it might not have taken into account tuples that
have been deleted before the reference snapshot was taken. But, in the
case of REINDEX CONCURRENTLY the index validated is not marked as
valid as it is the case in CREATE INDEX CONCURRENTLY, the transaction
doing the validation is directly committed. This index is thought as
valid only after doing the swap phase, when relfilenodes are changed.

I am sure you will find some flaws in this reasoning though :). Of
course not having being able to break this code now with my picky
tests taking targeted breakpoints does not mean that this code will
not fail in a given scenario, just that I could not break it yet.

Note also that removing those wait phases has the advantage to remove
risks of deadlocks when an ANALYZE is run in parallel to REINDEX
CONCURRENTLY as it was the case in the previous versions of the patch
(reproducible when waiting for the old snapshots if a session takes
ShareUpdateExclusiveLock on the same relation in parallel).
-- 
Michael



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Noah Misch
Дата:
Сообщение: Re: Clarification on materialized view restriction needed
Следующее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Clarification on materialized view restriction needed