Re: Conflict Detection and Resolution

From Amit Kapila
Subject Re: Conflict Detection and Resolution
Date
Msg-id CAA4eK1+d5M0pnwsWa1FWakrwi6dMoeNfQx_4eGn+C7C5wh6UEw@mail.gmail.com
In response to Re: Conflict Detection and Resolution  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Conflict Detection and Resolution
List pgsql-hackers
On Thu, Jun 13, 2024 at 11:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 5, 2024 at 3:32 PM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > Hi,
> >
> > This time at PGconf.dev[1], we had some discussions regarding this
> > project. The proposed approach is to split the work into two main
> > components. The first part focuses on conflict detection, which aims to
> > identify and report conflicts in logical replication. This feature will
> > enable users to monitor the unexpected conflicts that may occur. The
> > second part involves the actual conflict resolution. Here, we will provide
> > built-in resolutions for each conflict and allow users to choose which
> > resolution will be used for which conflict (as described in the initial
> > email of this thread).
>
> I agree with this direction that we focus on conflict detection (and
> logging) first and then develop conflict resolution on top of that.
>
> >
> > Of course, we are open to alternative ideas and suggestions, and the
> > strategy above can be changed based on ongoing discussions and feedback
> > received.
> >
> > Here is the patch of the first part work, which adds a new parameter
> > detect_conflict for CREATE and ALTER subscription commands. This new
> > parameter will decide if subscription will go for conflict detection. By
> > default, conflict detection will be off for a subscription.
> >
> > When conflict detection is enabled, additional logging is triggered in the
> > following conflict scenarios:
> >
> > * The row to be updated was previously modified by another origin.
> > * The tuple to be updated is not found.
> > * The tuple to be deleted is not found.
> >
> > While there exist other conflict types in logical replication, such as an
> > incoming insert conflicting with an existing row due to a primary key or
> > unique index, these cases already result in constraint violation errors.
>
> What does detect_conflict being true actually mean to users? I
> understand that detect_conflict being true could introduce some
> overhead to detect conflicts. But in terms of conflict detection, even
> if detect_conflict is false, we detect some conflicts such as
> concurrent inserts with the same key. Once we introduce the complete
> conflict detection feature, I'm not sure there is a case where a user
> wants to detect only some particular types of conflict.
>

You are right that users would want to detect conflicts. The extra
effort is probably limited to the 'update_differ' case, where we need
to consult the commit timestamp (commit_ts) module, and we will only do
that when 'track_commit_timestamp' is true. BTW, I think for inserts
that hit a primary/unique key violation, we should catch the ERROR and
log it. If we want to log the conflicts in a separate table, do we want
to do that in the catch block after getting the pk violation, or do an
extra scan before the 'INSERT' to find the conflict? I think logging
would have an extra cost, especially if we want to LOG it in some table
as you are suggesting below, so that may need an option.
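
To make the two options for insert conflicts concrete, here is a toy
sketch in Python. It is not PostgreSQL code: a dict stands in for a table
with a unique key, and every name in it (UniqueViolation, the
"insert_conflict" label, both apply functions) is hypothetical. It only
illustrates where the logging would happen in each approach.

```python
# Toy model only: a dict stands in for a table with a unique key.
# None of these names are PostgreSQL APIs; all are hypothetical.


class UniqueViolation(Exception):
    """Stand-in for a primary/unique key violation error."""


def insert(table, key, row):
    """Insert a row, raising if the key already exists."""
    if key in table:
        raise UniqueViolation(key)
    table[key] = row


def apply_insert_catch(table, key, row, conflict_log):
    """Option 1: attempt the insert and log the conflict in the error path."""
    try:
        insert(table, key, row)
        return True
    except UniqueViolation:
        conflict_log.append(("insert_conflict", key))
        return False


def apply_insert_prescan(table, key, row, conflict_log):
    """Option 2: look the key up first, paying one extra lookup per insert."""
    if key in table:
        conflict_log.append(("insert_conflict", key))
        return False
    insert(table, key, row)
    return True
```

In this model the first variant pays nothing extra when there is no
conflict, while the second pays a lookup on every insert but never
enters the error path.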

> > Therefore, additional conflict detection for these cases is currently
> > omitted to minimize potential overhead. However, the pre-detection for
> > conflict in these error cases is still essential to support automatic
> > conflict resolution in the future.
>
> I feel that we should log all types of conflict in a uniform way. For
> example, with detect_conflict being true, the update_differ conflict
> is reported as "conflict %s detected on relation "%s"", whereas
> concurrent inserts with the same key is reported as "duplicate key
> value violates unique constraint "%s"", which could confuse users.
> Ideally, I think that we log such conflict detection details (table
> name, column name, conflict type, etc) to somewhere (e.g. a table or
> server logs) so that the users can resolve them manually.
>

It is worth considering whether there is value in providing a
pg_conflicts_history kind of table, which would hold details of the
conflicts that occurred and could later be extended to record
resolutions. I feel we can LOG the conflicts by default anyway. Whether
updating a separate table with conflicts should be done by default or
behind a knob is a point to consider.
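
As a rough illustration of that split (always LOG, write to a history
table only behind a knob), here is a small Python sketch. The record
fields follow the details suggested earlier in the thread (table name,
column name, conflict type); the class names, the log_to_table knob and
the in-memory list standing in for a pg_conflicts_history kind of table
are all assumptions made for the example, not a proposed design.

```python
import logging
from dataclasses import dataclass, field
from typing import List, Optional

logger = logging.getLogger("conflicts")


@dataclass
class ConflictRecord:
    """One detected conflict; the fields are illustrative only."""
    relation: str
    conflict_type: str
    column: Optional[str] = None
    resolution: Optional[str] = None  # left empty until resolutions exist


@dataclass
class ConflictReporter:
    """Always logs; additionally stores the record if the knob is on."""
    log_to_table: bool = False  # hypothetical knob
    history: List[ConflictRecord] = field(default_factory=list)

    def report(self, rec: ConflictRecord) -> None:
        # Server-log style message, always emitted.
        logger.warning('conflict %s detected on relation "%s"',
                       rec.conflict_type, rec.relation)
        # The history "table" is only updated when explicitly enabled.
        if self.log_to_table:
            self.history.append(rec)
```

With the knob off, report() only writes the log line; with it on, the
same record is also appended to the history.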

--
With Regards,
Amit Kapila.


