RE: Forget close an open relation in ReorderBufferProcessTXN()

Поиск
Список
Период
Сортировка
От osumi.takamichi@fujitsu.com
Тема RE: Forget close an open relation in ReorderBufferProcessTXN()
Дата
Msg-id OSBPR01MB48884DDB2C2FA4AAB1BCA0D3ED509@OSBPR01MB4888.jpnprd01.prod.outlook.com
обсуждение исходный текст
Ответ на Re: Forget close an open relation in ReorderBufferProcessTXN()  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: Forget close an open relation in ReorderBufferProcessTXN()  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Thursday, May 13, 2021 7:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Thu, May 13, 2021 at 11:15 AM osumi.takamichi@fujitsu.com
> <osumi.takamichi@fujitsu.com> wrote:
> >
> > I tried the following scenarios for trying to reproduce this.
> > Scenario2:
> > (1) set up 1 publisher and 1 subscriber
> > (2) create table with user_catalog_table = true on the pub
> > (3) insert some data to this table
> > (4) create publication for the table on the pub
> > (5) create table with user_catalog_table = true on the sub
> > (6) create subscription on the sub
> > (7) add synchronous_standby_names to publisher's configuration and
> > restart the pub
> > (8) have a session to truncate the user_catalog_table on the pub
> >
> > Scenario 2 was successful.
> 
> Yeah, because pgoutput or for that matter even test_decoding doesn't
> acquire a lock on user catalog tables.
> 
> > Are these the scenario you have in mind, if not please let me know for
> > the missing steps.
> > I would like to reproduce the scenario and write a patch to fix this.
> 
> I don't think we can reproduce it with core plugins as they don't lock user
> catalog tables. 
OK. My current understanding about how the deadlock happens is below.

1. TRUNCATE command is performed on user_catalog_table.
2. TRUNCATE command locks the table and index with ACCESS EXCLUSIVE LOCK.
3. TRUNCATE waits for the subscriber's synchronization
    when synchronous_standby_names is set.
4. Here, the walsender stops, *if* it tries to acquire a lock on the user_catalog_table
    because the table where it wants to see is locked by the TRUNCATE already. 

If this is right, we need to go back to a little bit higher level discussion,
since whether we should hack any plugin to simulate the deadlock caused by user_catalog_table reference
with locking depends on the assumption if the plugin takes a lock on the user_catalog_table or not.
In other words, if the plugin does read only access to that type of table with no lock
(by RelationIdGetRelation for example ?), the deadlock concern disappears and we don't
need to add anything to plugin sides, IIUC.

Here, we haven't gotten any response about whether output plugin takes (should take)
the lock on the user_catalog_table. But, I would like to make a consensus
about this point before the implementation.

By the way, Amit-san already mentioned the main reason of this
is that we allow decoding of TRUNCATE operation for user_catalog_table in synchronous mode.
The choices are provided by Amit-san already in the past email in [1].
(1) disallow decoding of TRUNCATE operation for user_catalog_tables
(2) disallow decoding of any operation for user_catalog_tables like system catalog tables

Yet, I'm not sure if either option solves the deadlock concern completely.
If application takes an ACCESS EXCLUSIVE lock by LOCK command (not by TRUNCATE !)
on the user_catalog_table in a transaction, and if the plugin tries to take a lock on it,
I think the deadlock happens. Of course, having a consensus that the plugin takes no lock at all
would remove this concern, though. 

Like this, I'd like to discuss those two items in question together at first.
* the plugin should take a lock on user_catalog_table or not
* the range of decoding related to user_catalog_table

To me, taking no lock on the user_catalog_table from plugin is fine because
we have historical snapshot mechanism, which doesn't produce deadlock in any combination
even when application executes a LOCK command for ACCESS EXCLUSIVE.
In addition, I agree with the idea that we don't decode any operation on user_catalog_table
and have better alignment with usual system catalogs.

Thoughts ?

[1] - https://www.postgresql.org/message-id/CAA4eK1LP8xTysPEMEBYAZ%3D6KoMWfjyf0gzF-9Bp%3DSgVFvYQDVw%40mail.gmail.com



Best Regards,
    Takamichi Osumi


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Greg Nancarrow
Дата:
Сообщение: Re: Parallel scan with SubTransGetTopmostTransaction assert coredump
Следующее
От: Noah Misch
Дата:
Сообщение: Re: dump cannot be restored if schema permissions revoked