Обсуждение: Why does execReplication.c lock tuples?

Поиск

Список

Период

Сортировка

Why does execReplication.c lock tuples?

От

Andres Freund

Дата:

20 января 2019 г., 23:03:02

Hi,

Currently RelationFindReplTupleByIndex(), RelationFindReplTupleSeq()
lock the found tuple. I don't quite get what that achieves - why isn't
dealing with concurrency in the table_update/delete calls at the
callsites sufficient? As far as I can tell there's no meaningful
concurrency handling in the heap_lock_tuple() paths, so it's not like we
follow update chains or anything. As far as I can tell this just nearly
doubles the number of WAL records when replaying updates/deletes?

Greetings,

Andres Freund

Re: Why does execReplication.c lock tuples?

От

Petr Jelinek

Дата:

29 января 2019 г., 15:19:52

Hi,

On 20/01/2019 21:03, Andres Freund wrote:
> Hi,
> 
> Currently RelationFindReplTupleByIndex(), RelationFindReplTupleSeq()
> lock the found tuple. I don't quite get what that achieves - why isn't
> dealing with concurrency in the table_update/delete calls at the
> callsites sufficient? As far as I can tell there's no meaningful
> concurrency handling in the heap_lock_tuple() paths, so it's not like we
> follow update chains or anything. 
> 

Yeah that's leftover from the conflict detection/handling code that I
stripped away to keep the patched manageable size-wise. As things stand
now we could remove that and use normal heap_update instead of simple
variant. It'll be likely be needed again if we add conflict handling in
the future, but perhaps we could be smarter about it then (i.e. I can
imagine that it will be per table anyway, not necessarily default behavior).

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

Re: Why does execReplication.c lock tuples?

От

Andres Freund

Дата:

29 января 2019 г., 18:28:28

Hi,

On January 29, 2019 4:19:52 AM PST, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>Hi,
>
>On 20/01/2019 21:03, Andres Freund wrote:
>> Hi,
>>
>> Currently RelationFindReplTupleByIndex(), RelationFindReplTupleSeq()
>> lock the found tuple. I don't quite get what that achieves - why
>isn't
>> dealing with concurrency in the table_update/delete calls at the
>> callsites sufficient? As far as I can tell there's no meaningful
>> concurrency handling in the heap_lock_tuple() paths, so it's not like
>we
>> follow update chains or anything.
>>
>
>Yeah that's leftover from the conflict detection/handling code that I
>stripped away to keep the patched manageable size-wise. As things stand
>now we could remove that and use normal heap_update instead of simple
>variant. It'll be likely be needed again if we add conflict handling in
>the future, but perhaps we could be smarter about it then (i.e. I can
>imagine that it will be per table anyway, not necessarily default
>behavior).

Why does conflict handling need the unconditional lock? Wouldn't just doing that after an initial heap_update returned
HeapTupleUpdatedmake more sense?  And wouldn't it need to reckeck the row afterwards as well? 

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: Why does execReplication.c lock tuples?

От

Petr Jelinek

Дата:

29 января 2019 г., 19:04:55

On 29/01/2019 16:28, Andres Freund wrote:
> Hi,
> 
> On January 29, 2019 4:19:52 AM PST, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>> Hi,
>>
>> On 20/01/2019 21:03, Andres Freund wrote:
>>> Hi,
>>>
>>> Currently RelationFindReplTupleByIndex(), RelationFindReplTupleSeq()
>>> lock the found tuple. I don't quite get what that achieves - why
>> isn't
>>> dealing with concurrency in the table_update/delete calls at the
>>> callsites sufficient? As far as I can tell there's no meaningful
>>> concurrency handling in the heap_lock_tuple() paths, so it's not like
>> we
>>> follow update chains or anything. 
>>>
>>
>> Yeah that's leftover from the conflict detection/handling code that I
>> stripped away to keep the patched manageable size-wise. As things stand
>> now we could remove that and use normal heap_update instead of simple
>> variant. It'll be likely be needed again if we add conflict handling in
>> the future, but perhaps we could be smarter about it then (i.e. I can
>> imagine that it will be per table anyway, not necessarily default
>> behavior).
> 
> Why does conflict handling need the unconditional lock? Wouldn't just doing that after an initial heap_update
returnedHeapTupleUpdated make more sense?  And wouldn't it need to reckeck the row afterwards as well?
 
> 

To prevent tuple changing from under the conflict resolution logic - we
need to make sure that whatever tuple the conflict resolution logic gets
is the current one. If the tuple is locked you won't get the
HeapTupleUpdated in heap_update anymore (which is why we can use
simple_heap_update).

In any case we don't have conflict resolution at the moment so it's
probably moot right now.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

Re: Why does execReplication.c lock tuples?

От

Andres Freund

Дата:

29 января 2019 г., 19:14:32

Hi

On January 29, 2019 8:04:55 AM PST, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>On 29/01/2019 16:28, Andres Freund wrote:
>> Hi,
>>
>> On January 29, 2019 4:19:52 AM PST, Petr Jelinek
><petr.jelinek@2ndquadrant.com> wrote:
>>> Hi,
>>>
>>> On 20/01/2019 21:03, Andres Freund wrote:
>>>> Hi,
>>>>
>>>> Currently RelationFindReplTupleByIndex(),
>RelationFindReplTupleSeq()
>>>> lock the found tuple. I don't quite get what that achieves - why
>>> isn't
>>>> dealing with concurrency in the table_update/delete calls at the
>>>> callsites sufficient? As far as I can tell there's no meaningful
>>>> concurrency handling in the heap_lock_tuple() paths, so it's not
>like
>>> we
>>>> follow update chains or anything.
>>>>
>>>
>>> Yeah that's leftover from the conflict detection/handling code that
>I
>>> stripped away to keep the patched manageable size-wise. As things
>stand
>>> now we could remove that and use normal heap_update instead of
>simple
>>> variant. It'll be likely be needed again if we add conflict handling
>in
>>> the future, but perhaps we could be smarter about it then (i.e. I
>can
>>> imagine that it will be per table anyway, not necessarily default
>>> behavior).
>>
>> Why does conflict handling need the unconditional lock? Wouldn't just
>doing that after an initial heap_update returned HeapTupleUpdated make
>more sense?  And wouldn't it need to reckeck the row afterwards as
>well?
>>
>
>To prevent tuple changing from under the conflict resolution logic - we
>need to make sure that whatever tuple the conflict resolution logic
>gets
>is the current one. If the tuple is locked you won't get the
>HeapTupleUpdated in heap_update anymore (which is why we can use
>simple_heap_update).
>
>In any case we don't have conflict resolution at the moment so it's
>probably moot right now.

Not sure why that's relevant - what you'd do is to attempt the update, if it succeeds: great. If not, then you'd do the
locktuple and after that do the conflict resolution. Less WAL and cpu for the common case, same WAL as now for the
conflict,with just a small bit of additional CPU costs for the latter. 

Either way, right now it's definitely superfluous...

Andres
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Re: Why does execReplication.c lock tuples?

От

Petr Jelinek

Дата:

29 января 2019 г., 19:22:43

On 29/01/2019 17:14, Andres Freund wrote:
> Hi
> 
> On January 29, 2019 8:04:55 AM PST, Petr Jelinek <petr.jelinek@2ndquadrant.com> wrote:
>> On 29/01/2019 16:28, Andres Freund wrote:
>>> Hi,
>>>
>>> On January 29, 2019 4:19:52 AM PST, Petr Jelinek
>> <petr.jelinek@2ndquadrant.com> wrote:
>>>> Hi,
>>>>
>>>> On 20/01/2019 21:03, Andres Freund wrote:
>>>>> Hi,
>>>>>
>>>>> Currently RelationFindReplTupleByIndex(),
>> RelationFindReplTupleSeq()
>>>>> lock the found tuple. I don't quite get what that achieves - why
>>>> isn't
>>>>> dealing with concurrency in the table_update/delete calls at the
>>>>> callsites sufficient? As far as I can tell there's no meaningful
>>>>> concurrency handling in the heap_lock_tuple() paths, so it's not
>> like
>>>> we
>>>>> follow update chains or anything. 
>>>>>
>>>>
>>>> Yeah that's leftover from the conflict detection/handling code that
>> I
>>>> stripped away to keep the patched manageable size-wise. As things
>> stand
>>>> now we could remove that and use normal heap_update instead of
>> simple
>>>> variant. It'll be likely be needed again if we add conflict handling
>> in
>>>> the future, but perhaps we could be smarter about it then (i.e. I
>> can
>>>> imagine that it will be per table anyway, not necessarily default
>>>> behavior).
>>>
>>> Why does conflict handling need the unconditional lock? Wouldn't just
>> doing that after an initial heap_update returned HeapTupleUpdated make
>> more sense?  And wouldn't it need to reckeck the row afterwards as
>> well?
>>>
>>
>> To prevent tuple changing from under the conflict resolution logic - we
>> need to make sure that whatever tuple the conflict resolution logic
>> gets
>> is the current one. If the tuple is locked you won't get the
>> HeapTupleUpdated in heap_update anymore (which is why we can use
>> simple_heap_update).
>>
>> In any case we don't have conflict resolution at the moment so it's
>> probably moot right now.
> 
> Not sure why that's relevant - what you'd do is to attempt the update, if it succeeds: great. If not, then you'd do
thelock tuple and after that do the conflict resolution. Less WAL and cpu for the common case, same WAL as now for the
conflict,with just a small bit of additional CPU costs for the latter.
 
> 

But the conflict is not about local node concurrency, rather multi-node
one (ie, merging data from multiple nodes) so you can't depend on
heap_update return value alone. You usually need to figure out if there
is conflict before you even do update (based on tuple metadata/data)
which is why the lock is necessary.

> Either way, right now it's definitely superfluous...
> 

Agreed.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Why does execReplication.c lock tuples?

Why does execReplication.c lock tuples?

Re: Why does execReplication.c lock tuples?

Re: Why does execReplication.c lock tuples?

Re: Why does execReplication.c lock tuples?

Re: Why does execReplication.c lock tuples?

Re: Why does execReplication.c lock tuples?