Re: Conflict Detection and Resolution

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Conflict Detection and Resolution
Дата
Msg-id CAA4eK1JwF2rrh2TsDjo2=Qn981MP18aK5470MXTDdWe_0fsVUg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Conflict Detection and Resolution  (shveta malik <shveta.malik@gmail.com>)
Ответы Re: Conflict Detection and Resolution
Re: Conflict Detection and Resolution
Список pgsql-hackers
On Mon, May 27, 2024 at 11:19 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Sat, May 25, 2024 at 2:39 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> > >
> > > Conflict Resolution
> > > ----------------
> > > a) latest_timestamp_wins:    The change with later commit timestamp wins.
> > > b) earliest_timestamp_wins:   The change with earlier commit timestamp wins.

Can you share the use case of "earliest_timestamp_wins" resolution
method? It seems after the initial update on the local node, it will
never allow remote update to succeed which sounds a bit odd. Jan has
shared this and similar concerns about this resolution method, so I
have added him to the email as well.

> > >
> > > Conflict Types:
> > > ----------------
> > > a) update_differ: The origin of an incoming update's key row differs
> > > from the local row i.e.; the row has already been updated locally or
> > > by different nodes.
> > > b) update_missing: The row with the same value as that incoming
> > > update's key does not exist. Remote is trying to update a row which
> > > does not exist locally.
> > > c) update_deleted: The row with the same value as that incoming
> > > update's key does not exist. The row is already deleted. This conflict
> > > type is generated only if the deleted row is still detectable i.e., it
> > > is not removed by VACUUM yet. If the row is removed by VACUUM already,
> > > it cannot detect this conflict. It will detect it as update_missing
> > > and will follow the default or configured resolver of update_missing
> > > itself.
> > >
> >
> > I don't understand the why should update_missing or update_deleted be
> > different, especially considering it's not detected reliably. And also
> > that even if we happen to find the row the associated TOAST data may
> > have already been removed. So why would this matter?
>
> Here, we are trying to tackle the case where the row is 'recently'
> deleted i.e. concurrent UPDATE and DELETE on pub and sub. User may
> want to opt for a different resolution in such a case as against the
> one where the corresponding row was not even present in the first
> place. The case where the row was deleted long back may not fall into
> this category as there are higher chances that they have been removed
> by vacuum and can be considered equivalent to the update_ missing
> case.
>

I think to make 'update_deleted' work, we need another scan with a
different snapshot type to find the recently deleted row. I don't know
if it is a good idea to scan the index twice with different snapshots,
so for the sake of simplicity, can we consider 'updated_deleted' same
as 'update_missing'? If we think it is an important case to consider
then we can try to accomplish this once we finalize the
design/implementation of other resolution methods.

> > >
> > > To implement the above, subscription commands will be changed to have
> > > one more parameter 'conflict_resolution=on/off', default will be OFF.
> > >
> > > To configure global resolvers, new DDL command will be introduced:
> > >
> > > CONFLICT RESOLVER ON <conflict_type> IS <conflict_resolver>
> > >
> >
> > I very much doubt we want a single global conflict resolver, or even one
> > resolver per subscription. It seems like a very table-specific thing.
>

+1 to make it a table-level configuration but we probably need
something at the global level as well such that by default if users
don't define anything at table-level global-level configuration will
be used.

>
> >
> > Also, doesn't all this whole design ignore the concurrency between
> > publishers? Isn't this problematic considering the commit timestamps may
> > go backwards (for a given publisher), which means the conflict
> > resolution is not deterministic (as it depends on how exactly it
> > interleaves)?
> >

I am not able to imagine the cases you are worried about. Can you
please be specific? Is it similar to the case I described in
yesterday's email [1]?

[1] - https://www.postgresql.org/message-id/CAA4eK1JTMiBOoGqkt%3DaLPLU8Rs45ihbLhXaGHsz8XC76%2BOG3%2BQ%40mail.gmail.com

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Corey Huinker
Дата:
Сообщение: Re: Statistics Import and Export
Следующее
От: Kyotaro Horiguchi
Дата:
Сообщение: Re: pg_parse_json() should not leak token copies on failure