Discussion: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.

BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      18267
Logged by:          song yutao
Email address:      sytoptimisprime@163.com
PostgreSQL version: 15.5
Operating system:   Linux
Description:

Hi hackers, I found that when inserting plenty of data into a table while
adding the table to a publication (through ALTER PUBLICATION), it is likely
that the incremental data will not be synchronized to the subscriber. Here is
my test method:

1. On the publisher and the subscriber, create a table for the test:
CREATE TABLE tab_1 (a int);

2. Set up logical replication:
on the publisher:
     SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false);
     CREATE PUBLICATION tap_pub;
on the subscriber:
     CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub
         WITH (enabled = true, create_slot = false, slot_name = 'slot1');

3. Perform inserts:
     for (my $i = 1; $i <= 1000; $i++) {
         $node_publisher->safe_psql('postgres',
             "INSERT INTO tab_1 SELECT generate_series(1, 1000)");
     }
     Each transaction contains 1000 insertions, and there are 1000 transactions in total.

4. While step 3 is running, add table tab_1 to the publication:
     ALTER PUBLICATION tap_pub ADD TABLE tab_1;
     ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION;
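
To observe the failure, compare the row counts once the workload has finished
(a minimal check, assuming the object names above; srsubstate in
pg_subscription_rel is the subscriber's per-table sync state):

```
-- On the publisher:
SELECT count(*) FROM tab_1;

-- On the subscriber: the count stays below the publisher's total even
-- though the table is reported as synchronized ('s') or ready ('r'):
SELECT count(*) FROM tab_1;
SELECT srrelid::regclass AS tbl, srsubstate FROM pg_subscription_rel;
```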

The root cause of the problem is as follows:
pgoutput relies on the invalidation mechanism to validate publications. When
the walsender decodes an ALTER PUBLICATION transaction, the catalog caches are
invalidated at once. Furthermore, since pg_publication_rel is modified,
snapshot changes are added to all transactions currently being decoded. For
the other transactions, the catalog caches have been invalidated, but it is
likely that the snapshot changes have not yet been decoded. In the pgoutput
implementation, these transactions query the system catalog pg_publication_rel
to determine whether to publish the changes made in the transactions. In this
case, the catalog tuples are not found because the snapshot has not been
updated. As a result, the changes in those transactions are considered not to
be published, and the subsequent data cannot be synchronized.
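
For reference, the publication membership that pgoutput consults here can be
inspected on the publisher (object names taken from the repro above):

```
SELECT p.pubname, pr.prrelid::regclass AS tablename
FROM pg_publication_rel pr
     JOIN pg_publication p ON p.oid = pr.prpubid;

-- Or, via the system view:
SELECT * FROM pg_publication_tables WHERE pubname = 'tap_pub';
```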

I think it is necessary to add the invalidations to the other transactions
after adding a snapshot change to them.
Therefore, I have submitted a patch for this bug.


On Wed, Jan 3, 2024 at 9:51 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      18267
> Logged by:          song yutao
> Email address:      sytoptimisprime@163.com
> PostgreSQL version: 15.5
> Operating system:   Linux
> Description:
>
> Hi hackers, I found that when inserting plenty of data into a table while
> adding the table to a publication (through ALTER PUBLICATION), it is likely
> that the incremental data will not be synchronized to the subscriber. Here is
> my test method:
>
> 1. On the publisher and the subscriber, create a table for the test:
> CREATE TABLE tab_1 (a int);
>
> 2. Set up logical replication:
> on the publisher:
>      SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false);
>      CREATE PUBLICATION tap_pub;
> on the subscriber:
>      CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub
>          WITH (enabled = true, create_slot = false, slot_name = 'slot1');
>
> 3. Perform inserts:
>      for (my $i = 1; $i <= 1000; $i++) {
>          $node_publisher->safe_psql('postgres',
>              "INSERT INTO tab_1 SELECT generate_series(1, 1000)");
>      }
>      Each transaction contains 1000 insertions, and there are 1000 transactions in total.
>
> 4. While step 3 is running, add table tab_1 to the publication:
>      ALTER PUBLICATION tap_pub ADD TABLE tab_1;
>      ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION;
>
> The root cause of the problem is as follows:
> pgoutput relies on the invalidation mechanism to validate publications. When
> the walsender decodes an ALTER PUBLICATION transaction, the catalog caches are
> invalidated at once. Furthermore, since pg_publication_rel is modified,
> snapshot changes are added to all transactions currently being decoded. For
> the other transactions, the catalog caches have been invalidated, but it is
> likely that the snapshot changes have not yet been decoded. In the pgoutput
> implementation, these transactions query the system catalog pg_publication_rel
> to determine whether to publish the changes made in the transactions. In this
> case, the catalog tuples are not found because the snapshot has not been
> updated. As a result, the changes in those transactions are considered not to
> be published, and the subsequent data cannot be synchronized.
>

As per my understanding, we distribute the snapshot to other transactions at
the commit LSN, which in your case means at the time of the commit of "ALTER
PUBLICATION tap_pub ADD TABLE tab_1". So any changes after that should see the
changes in pg_publication_rel.

> I think it is necessary to add the invalidations to the other transactions
> after adding a snapshot change to them.
> Therefore, I have submitted a patch for this bug.
>

Sorry, I didn't understand your proposal, and I don't see any patch attached,
as you claim in the last sentence.

--
With Regards,
Amit Kapila.



On Wed, Jan 3, 2024 at 9:51 PM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference:      18267
> Logged by:          song yutao
> Email address:      sytoptimisprime@163.com
> PostgreSQL version: 15.5
> Operating system:   Linux
> Description:
>
> Hi hackers, I found that when inserting plenty of data into a table while
> adding the table to a publication (through ALTER PUBLICATION), it is likely
> that the incremental data will not be synchronized to the subscriber. Here is
> my test method:
>
> 1. On the publisher and the subscriber, create a table for the test:
> CREATE TABLE tab_1 (a int);
>
> 2. Set up logical replication:
> on the publisher:
>      SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false);
>      CREATE PUBLICATION tap_pub;
> on the subscriber:
>      CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub
>          WITH (enabled = true, create_slot = false, slot_name = 'slot1');
>
> 3. Perform inserts:
>      for (my $i = 1; $i <= 1000; $i++) {
>          $node_publisher->safe_psql('postgres',
>              "INSERT INTO tab_1 SELECT generate_series(1, 1000)");
>      }
>      Each transaction contains 1000 insertions, and there are 1000 transactions in total.
>
> 4. While step 3 is running, add table tab_1 to the publication:
>      ALTER PUBLICATION tap_pub ADD TABLE tab_1;
>      ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION;

I could not reproduce this issue.  Can you tell me exactly which data were
missing for you?  When you add a table to the publication and refresh the
subscription, then as soon as the first commit containing changes for the
table is sent, the subscriber will see that the table's state is not yet SYNC
READY and will launch a tablesync worker, which should be able to copy all
the previous data for that table.
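
For illustration, this can be watched from the outside: while a table is being
synced, pg_stat_subscription on the subscriber shows the tablesync worker with
relid set, alongside the main apply worker, whose relid is NULL:

```
SELECT subname, pid, relid::regclass AS syncing_table, received_lsn
FROM pg_stat_subscription;
```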

> The root cause of the problem is as follows:
> pgoutput relies on the invalidation mechanism to validate publications. When
> the walsender decodes an ALTER PUBLICATION transaction, the catalog caches are
> invalidated at once. Furthermore, since pg_publication_rel is modified,
> snapshot changes are added to all transactions currently being decoded. For
> the other transactions, the catalog caches have been invalidated, but it is
> likely that the snapshot changes have not yet been decoded. In the pgoutput
> implementation, these transactions query the system catalog pg_publication_rel
> to determine whether to publish the changes made in the transactions. In this
> case, the catalog tuples are not found because the snapshot has not been
> updated. As a result, the changes in those transactions are considered not to
> be published, and the subsequent data cannot be synchronized.
>
> I think it is necessary to add the invalidations to the other transactions
> after adding a snapshot change to them.
> Therefore, I have submitted a patch for this bug.

I think you missed attaching the patch, as Amit also pointed out.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



RE: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Song,

> 
> Hi hackers, I found that when inserting plenty of data into a table while
> adding the table to a publication (through ALTER PUBLICATION), it is likely
> that the incremental data will not be synchronized to the subscriber. Here is
> my test method:

Good catch.

> 1. On the publisher and the subscriber, create a table for the test:
> CREATE TABLE tab_1 (a int);
>
> 2. Set up logical replication:
> on the publisher:
>      SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false);
>      CREATE PUBLICATION tap_pub;
> on the subscriber:
>      CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub
>          WITH (enabled = true, create_slot = false, slot_name = 'slot1');
>
> 3. Perform inserts:
>      for (my $i = 1; $i <= 1000; $i++) {
>          $node_publisher->safe_psql('postgres',
>              "INSERT INTO tab_1 SELECT generate_series(1, 1000)");
>      }
>      Each transaction contains 1000 insertions, and there are 1000 transactions in total.
>
> 4. While step 3 is running, add table tab_1 to the publication:
>      ALTER PUBLICATION tap_pub ADD TABLE tab_1;
>      ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION;

I could reproduce the failure. PSA the script.

In the script, ALTER PUBLICATION was executed while the initial data sync was in progress.
(The workload is almost the same as what you posted, but the number of rows is reduced.)

In total, 40000 tuples were inserted on the publisher, but after some time only 25000 tuples had been replicated:

```
publisher=# SELECT count(*) FROM tab_1 ;
 count 
-------
 40000
(1 row)

subscriber=# SELECT count(*) FROM tab_1 ;
 count 
-------
 25000
(1 row)
```

Is it the same failure you saw?

> The root cause of the problem is as follows:
> pgoutput relies on the invalidation mechanism to validate publications. When
> the walsender decodes an ALTER PUBLICATION transaction, the catalog caches are
> invalidated at once. Furthermore, since pg_publication_rel is modified,
> snapshot changes are added to all transactions currently being decoded. For
> the other transactions, the catalog caches have been invalidated, but it is
> likely that the snapshot changes have not yet been decoded. In the pgoutput
> implementation, these transactions query the system catalog pg_publication_rel
> to determine whether to publish the changes made in the transactions. In this
> case, the catalog tuples are not found because the snapshot has not been
> updated. As a result, the changes in those transactions are considered not to
> be published, and the subsequent data cannot be synchronized.
>
> I think it is necessary to add the invalidations to the other transactions
> after adding a snapshot change to them.
> Therefore, I have submitted a patch for this bug.

I cannot see your attachment, but I found that the patch proposed in [1] can
solve the issue. After applying 0001 + 0002 + 0003 (open relations with
ShareRowExclusiveLock in OpenTableList), the data gap was gone. Thoughts?

[1]: https://www.postgresql.org/message-id/de52b282-1166-1180-45a2-8d8917ca74c6%40enterprisedb.com
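
For context, the current lock level can be checked directly: on unpatched
PG15, ALTER PUBLICATION ... ADD TABLE opens the target table with
ShareUpdateExclusiveLock (a quick sketch, run on the publisher before the
table has been added to the publication):

```
BEGIN;
ALTER PUBLICATION tap_pub ADD TABLE tab_1;
-- Shows ShareUpdateExclusiveLock on unpatched PG15; patch 0003 above
-- upgrades OpenTableList to take ShareRowExclusiveLock instead.
SELECT mode, granted FROM pg_locks WHERE relation = 'tab_1'::regclass;
ROLLBACK;
```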

Best Regards,
Hayato Kuroda
FUJITSU LIMITED


Attachments
On Fri, Jan 5, 2024 at 9:25 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Song,
>
> >
> > Hi hackers, I found that when inserting plenty of data into a table while
> > adding the table to a publication (through ALTER PUBLICATION), it is likely
> > that the incremental data will not be synchronized to the subscriber. Here is
> > my test method:
>
> Good catch.
>
> > 1. On the publisher and the subscriber, create a table for the test:
> > CREATE TABLE tab_1 (a int);
> >
> > 2. Set up logical replication:
> > on the publisher:
> >      SELECT pg_create_logical_replication_slot('slot1', 'pgoutput', false, false);
> >      CREATE PUBLICATION tap_pub;
> > on the subscriber:
> >      CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr' PUBLICATION tap_pub
> >          WITH (enabled = true, create_slot = false, slot_name = 'slot1');
> >
> > 3. Perform inserts:
> >      for (my $i = 1; $i <= 1000; $i++) {
> >          $node_publisher->safe_psql('postgres',
> >              "INSERT INTO tab_1 SELECT generate_series(1, 1000)");
> >      }
> >      Each transaction contains 1000 insertions, and there are 1000 transactions in total.
> >
> > 4. While step 3 is running, add table tab_1 to the publication:
> >      ALTER PUBLICATION tap_pub ADD TABLE tab_1;
> >      ALTER SUBSCRIPTION tap_sub REFRESH PUBLICATION;
>
> I could reproduce the failure. PSA the script.
>
> In the script, ALTER PUBLICATION was executed while the initial data sync was in progress.
> (The workload is almost the same as what you posted, but the number of rows is reduced.)
>
> In total, 40000 tuples were inserted on the publisher, but after some time only 25000 tuples had been replicated:
>
> ```
> publisher=# SELECT count(*) FROM tab_1 ;
>  count
> -------
>  40000
> (1 row)
>
> subscriber=# SELECT count(*) FROM tab_1 ;
>  count
> -------
>  25000
> (1 row)
> ```
>
> Is it the same failure you saw?

With your attached script I was able to see this gap.  I didn't dig deeper,
but in an initial investigation I could see that even after ALTER PUBLICATION,
pgoutput_change continues to see 'relentry->pubactions.pubinsert' as false,
even after re-fetching the relation entry after the invalidation.  That
suggests the invalidation framework is working fine but we are using an older
snapshot to fetch the entry.  I did not debug further why it does not get an
updated snapshot that can see the change to the publication, because I assume
Yutao Song has already analyzed that, as per his first email, so I will wait
for his patch.

> > The root cause of the problem is as follows:
> > pgoutput relies on the invalidation mechanism to validate publications. When
> > the walsender decodes an ALTER PUBLICATION transaction, the catalog caches are
> > invalidated at once. Furthermore, since pg_publication_rel is modified,
> > snapshot changes are added to all transactions currently being decoded. For
> > the other transactions, the catalog caches have been invalidated, but it is
> > likely that the snapshot changes have not yet been decoded. In the pgoutput
> > implementation, these transactions query the system catalog pg_publication_rel
> > to determine whether to publish the changes made in the transactions. In this
> > case, the catalog tuples are not found because the snapshot has not been
> > updated. As a result, the changes in those transactions are considered not to
> > be published, and the subsequent data cannot be synchronized.
> >
> > I think it is necessary to add the invalidations to the other transactions
> > after adding a snapshot change to them.
> > Therefore, I have submitted a patch for this bug.
>
> I cannot see your attachment, but I found that the patch proposed in [1] can
> solve the issue. After applying 0001 + 0002 + 0003 (open relations with
> ShareRowExclusiveLock in OpenTableList), the data gap was gone. Thoughts?

I am not sure why 'open relations with ShareRowExclusiveLock' would help in
this case.  Have you investigated that?

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



RE: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Dilip,

> > > The root cause of the problem is as follows:
> > > pgoutput relies on the invalidation mechanism to validate publications. When
> > > the walsender decodes an ALTER PUBLICATION transaction, the catalog caches are
> > > invalidated at once. Furthermore, since pg_publication_rel is modified,
> > > snapshot changes are added to all transactions currently being decoded. For
> > > the other transactions, the catalog caches have been invalidated, but it is
> > > likely that the snapshot changes have not yet been decoded. In the pgoutput
> > > implementation, these transactions query the system catalog pg_publication_rel
> > > to determine whether to publish the changes made in the transactions. In this
> > > case, the catalog tuples are not found because the snapshot has not been
> > > updated. As a result, the changes in those transactions are considered not to
> > > be published, and the subsequent data cannot be synchronized.
> > >
> > > I think it is necessary to add the invalidations to the other transactions
> > > after adding a snapshot change to them.
> > > Therefore, I have submitted a patch for this bug.
> >
> > I cannot see your attachment, but I found that the patch proposed in [1] can
> > solve the issue. After applying 0001 + 0002 + 0003 (open relations with
> > ShareRowExclusiveLock in OpenTableList), the data gap was gone. Thoughts?
> 
> I am not sure why 'open relations with ShareRowExclusiveLock' would help in
> this case.  Have you investigated that?
>

Sorry for the confusion; I did not analyze it because I thought Song would
tell us the reason along with his patch. I have just reported what I tested.
I will spend more time on it later, as needed.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

RE: BUG #18267: Logical replication bug: data is not synchronized after Alter Publication.

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Song,

> I see that the snapshot change is added to the tail of the txn->changes list
> in SnapBuildDistributeNewCatalogSnapshot. When the DML transaction is
> committed, the corresponding entry in RelationSyncCache has already been
> invalidated. However, the snapshot has not changed at this time (the newly
> added snapshot change is still in the list and has not been processed yet),
> so, as you say, "pgoutput_change continues to see
> 'relentry->pubactions.pubinsert' as false, even after re-fetching the
> relation entry after the invalidation."
>
> Therefore, there is one solution, as Andres Freund states in [1]: if a
> transaction is found to have modified the catalog during logical decoding,
> invalidations are forcibly distributed to all concurrent in-progress
> transactions, just as my patch does.
>
> [1] https://www.postgresql.org/message-id/20231118025445.crhaeeuvoe2g5dv6%40awork3.anarazel.de

Thanks for making a patch, but your patch cannot be built:

```
reorderbuffer.c: In function ‘ReorderBufferDistributeInvalidation’:
../../../../src/include/lib/ilist.h:594:2: error: expected identifier before ‘(’ token
  (AssertVariableIsOfTypeMacro(ptr, dlist_node *),      \
  ^
reorderbuffer.c:2648:8: note: in expansion of macro ‘dlist_container’
   txn->dlist_container(ReorderBufferTXN, node, txn_i.cur);
        ^~~~~~~~~~~~~~~
```

Based on your arguments, I revised your patch: dlist_container() is a
standalone macro, so the line must read
txn = dlist_container(ReorderBufferTXN, node, txn_i.cur) rather than
referencing it as a struct member. Does it match your expectations?
The revision also fixes some coding conventions.

> Although my approach seems to be effective,

I ran my reproducer and confirmed that the issue was fixed.

> it may bring additional overhead to logical decoding under special
> circumstances. I think upgrading the lock, as mentioned in [1], is also an
> effective solution, and it seems to have less impact. But I don't quite
> understand why it may cause the deadlocks mentioned in [1].

IIUC, no one has pointed out a concrete deadlock case, but no one could say it
is safe, either. We must understand that thread more and decide how to fix
this: modify the lock level, or send invalidation messages to all the
reorderbuffer transactions. What do you think?

Best Regards,
Hayato Kuroda
FUJITSU LIMITED




Attachments