Обсуждение: BUG #17670: Logical Replication data may be lost on the subscription under certain scenarios
BUG #17670: Logical Replication data may be lost on the subscription under certain scenarios
От
PG Bug reporting form
Дата:
The following bug has been logged on the website: Bug reference: 17670 Logged by: Yunhe Xu Email address: xyh@nvn.xyz PostgreSQL version: 14.4 Operating system: rhel 7 Description: Logical Replication data may be lost on the subscription under certain scenarios. The following is review process. * Logical Replication Information t1=# select * from pg_publication_tables ; pubname | schemaname | tablename ------------+------------+---------------- pub1 | public | t_test1 t2=# select srrelid::regclass,* from pg_subscription_rel ; srrelid | srsubid | srrelid | srsubstate | srsublsn ---------+---------+---------+------------+----------- t_test1 | 57551 | 41170 | r | 0/696E418 t1=# select application_name,state from pg_stat_replication where application_name='test2_sub'; application_name | state ------------------+----------- test2_sub | streaming * Verify the status is normal t1=# insert into t_test1 values (1); INSERT 0 1 t1=# select * from t_test1; id ---- 1 t2=# select * from t_test1; id ---- 1 * Then delete this table on subscription : t2=# alter table t_test1 rename TO t_test2; ALTER TABLE * Now,do DMLs t1=# insert into t_test1 values (2); INSERT 0 1 t1=# delete from t_test1 where id=1; DELETE 1 t1=# * The log gives some errors 2022-10-28 15:14:07.919 CST,,,2600,,635b813f.a28,2,,2022-10-28 15:14:07 CST,4/12,0,ERROR,55000,"logical replication target relation ""public.t_test1"" does not exist",,,,,,,,,"","logical replication worker",,0 * OK,Let me create this table t2=# create table t_test1 (id int PRIMARY KEY); CREATE TABLE * At this point, the log is no longer showing errors.But the incremental data is lost. t2=# select * from t_test1; id ---- (0 rows) t1=# insert into t_test1 values (3); INSERT 0 1 t1=# insert into t_test1 values (4); INSERT 0 1 t2=# select * from t_test1; id ---- (0 rows) t2=# select * from t_test2 ; id ---- 1 (1 row) * Still no error. * Now modify the previous table back t2=# drop table t_test1 ; DROP TABLE t2=# alter table t_test2 rename TO t_test1; ALTER TABLE t1=# insert into t_test1 values (5); INSERT 0 1 t1=# select * from t_test1; id ---- 2 3 4 5 (4 rows) t2=# select * from t_test1; id ---- 1 5 (2 rows) * The middle operation-delete id=1 and insert id=3,4-is loss. *Please confirm if this is a BUG. Thanks. Yunhe Xu
On Fri, 28 Oct 2022 at 15:48, PG Bug reporting form <noreply@postgresql.org> wrote: > The following bug has been logged on the website: > > Bug reference: 17670 > Logged by: Yunhe Xu > Email address: xyh@nvn.xyz > PostgreSQL version: 14.4 > Operating system: rhel 7 > Description: > > Logical Replication data may be lost on the subscription under certain > scenarios. > The following is review process. > > * Logical Replication Information > t1=# select * from pg_publication_tables ; > pubname | schemaname | tablename > ------------+------------+---------------- > pub1 | public | t_test1 > t2=# select srrelid::regclass,* from pg_subscription_rel ; > srrelid | srsubid | srrelid | srsubstate | srsublsn > ---------+---------+---------+------------+----------- > t_test1 | 57551 | 41170 | r | 0/696E418 > t1=# select application_name,state from pg_stat_replication where > application_name='test2_sub'; > application_name | state > ------------------+----------- > test2_sub | streaming > > * Verify the status is normal > t1=# insert into t_test1 values (1); > INSERT 0 1 > t1=# select * from t_test1; > id > ---- > 1 > t2=# select * from t_test1; > id > ---- > 1 > > * Then delete this table on subscription : > t2=# alter table t_test1 rename TO t_test2; > ALTER TABLE > > * Now,do DMLs > t1=# insert into t_test1 values (2); > INSERT 0 1 > t1=# delete from t_test1 where id=1; > DELETE 1 > t1=# > * The log gives some errors > 2022-10-28 15:14:07.919 CST,,,2600,,635b813f.a28,2,,2022-10-28 15:14:07 > CST,4/12,0,ERROR,55000,"logical replication target relation > ""public.t_test1"" does not exist",,,,,,,,,"","logical replication > worker",,0 > > * OK,Let me create this table > t2=# create table t_test1 (id int PRIMARY KEY); > CREATE TABLE > > * At this point, the log is no longer showing errors.But the incremental > data is lost. > t2=# select * from t_test1; > id > ---- > (0 rows) > > t1=# insert into t_test1 values (3); > INSERT 0 1 > t1=# insert into t_test1 values (4); > INSERT 0 1 > > t2=# select * from t_test1; > id > ---- > (0 rows) > t2=# select * from t_test2 ; > id > ---- > 1 > (1 row) > > * Still no error. > * Now modify the previous table back > > t2=# drop table t_test1 ; > DROP TABLE > t2=# alter table t_test2 rename TO t_test1; > ALTER TABLE > > t1=# insert into t_test1 values (5); > INSERT 0 1 > t1=# select * from t_test1; > id > ---- > 2 > 3 > 4 > 5 > (4 rows) > > t2=# select * from t_test1; > id > ---- > 1 > 5 > (2 rows) > > * The middle operation-delete id=1 and insert id=3,4-is loss. > *Please confirm if this is a BUG. > I can reproduce it on HEAD. Here is my analysis: When we rename the t_test1 to t_test2, the subscriber doesn't have a table matched publication table name, so the logical replication throw an error. Then, we create a new t_test1 on subscriber, the logical replication worker can find the table that matches the published name. However, the pg_subscription_rel hasn't updated, and when we try to get the subscription state through GetSubscriptionRelState(), it cannot find a matched subscription relation mapping, so the WAL cannot apply to the new table t_test1. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
On Fri, Oct 28, 2022 at 8:07 PM Japin Li <japinli@hotmail.com> wrote: > I can reproduce it on HEAD. Here is my analysis: > > When we rename the t_test1 to t_test2, the subscriber doesn't have a > table matched publication table name, so the logical replication throw > an error. Then, we create a new t_test1 on subscriber, the logical > replication worker can find the table that matches the published name. > However, the pg_subscription_rel hasn't updated, and when we try to get > the subscription state through GetSubscriptionRelState(), it cannot find > a matched subscription relation mapping, so the WAL cannot apply to the > new table t_test1. I am just wondering if it is correct behavior to allow renaming the table used by a subscription, or should there be some dependency? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Sun, 30 Oct 2022 at 14:39, Dilip Kumar <dilipbalaut@gmail.com> wrote: > On Fri, Oct 28, 2022 at 8:07 PM Japin Li <japinli@hotmail.com> wrote: > >> I can reproduce it on HEAD. Here is my analysis: >> >> When we rename the t_test1 to t_test2, the subscriber doesn't have a >> table matched publication table name, so the logical replication throw >> an error. Then, we create a new t_test1 on subscriber, the logical >> replication worker can find the table that matches the published name. >> However, the pg_subscription_rel hasn't updated, and when we try to get >> the subscription state through GetSubscriptionRelState(), it cannot find >> a matched subscription relation mapping, so the WAL cannot apply to the >> new table t_test1. > > I am just wondering if it is correct behavior to allow renaming the > table used by a subscription, or should there be some dependency? Maybe we can add a dependency to make the user know what they are doing. I also want to know when we should add a dependency? -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
On Sun, Oct 30, 2022 at 7:22 PM Japin Li <japinli@hotmail.com> wrote: > > I am just wondering if it is correct behavior to allow renaming the > > table used by a subscription, or should there be some dependency? > > Maybe we can add a dependency to make the user know what they are doing. > I also want to know when we should add a dependency? I haven't thought about it in deep, but I think whenever we add a new entry to pg_subscription_rel, that time maybe we could add a dependency on the respective subscription entry IMHO. But I am just wondering is there any reason why we are not already having such dependency? -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
On Fri, Oct 28, 2022 at 2:51 PM PG Bug reporting form <noreply@postgresql.org> wrote: > > The following bug has been logged on the website: > > Bug reference: 17670 > Logged by: Yunhe Xu > Email address: xyh@nvn.xyz > PostgreSQL version: 14.4 > Operating system: rhel 7 > Description: > > Logical Replication data may be lost on the subscription under certain > scenarios. > The following is review process. > > * Logical Replication Information > t1=# select * from pg_publication_tables ; > pubname | schemaname | tablename > ------------+------------+---------------- > pub1 | public | t_test1 > t2=# select srrelid::regclass,* from pg_subscription_rel ; > srrelid | srsubid | srrelid | srsubstate | srsublsn > ---------+---------+---------+------------+----------- > t_test1 | 57551 | 41170 | r | 0/696E418 > t1=# select application_name,state from pg_stat_replication where > application_name='test2_sub'; > application_name | state > ------------------+----------- > test2_sub | streaming > > * Verify the status is normal > t1=# insert into t_test1 values (1); > INSERT 0 1 > t1=# select * from t_test1; > id > ---- > 1 > t2=# select * from t_test1; > id > ---- > 1 > > * Then delete this table on subscription : > t2=# alter table t_test1 rename TO t_test2; > ALTER TABLE > > * Now,do DMLs > t1=# insert into t_test1 values (2); > INSERT 0 1 > t1=# delete from t_test1 where id=1; > DELETE 1 > t1=# > * The log gives some errors > 2022-10-28 15:14:07.919 CST,,,2600,,635b813f.a28,2,,2022-10-28 15:14:07 > CST,4/12,0,ERROR,55000,"logical replication target relation > ""public.t_test1"" does not exist",,,,,,,,,"","logical replication > worker",,0 > > * OK,Let me create this table > t2=# create table t_test1 (id int PRIMARY KEY); > CREATE TABLE > > * At this point, the log is no longer showing errors.But the incremental > data is lost. > If you do "Alter Subscription test2_sub Refresh Publication;" then the data should be synced. -- With Regards, Amit Kapila.
On Mon, Oct 31, 2022 at 2:24 PM Dilip Kumar <dilipbalaut@gmail.com> wrote: > > On Sun, Oct 30, 2022 at 7:22 PM Japin Li <japinli@hotmail.com> wrote: > > > > I am just wondering if it is correct behavior to allow renaming the > > > table used by a subscription, or should there be some dependency? > > > > Maybe we can add a dependency to make the user know what they are doing. > > I also want to know when we should add a dependency? > > I haven't thought about it in deep, but I think whenever we add a new > entry to pg_subscription_rel, that time maybe we could add a > dependency on the respective subscription entry IMHO. But I am just > wondering is there any reason why we are not already having such > dependency? > As per our current implementation, users need to be careful with DDL operations on tables involved in subscriptions to ensure smooth replication. In the example shared in this thread, say if the user wants to rename tables on both publisher and subscriber then the case reported would have worked without hassles, and disallowing Alter Table would have unnecessarily made that operation difficult. Similarly, say if the user wants to add a primary key on the table on both publisher and subscriber, then it would be quite easy with the current system but adding dependencies could be additional work for the user and some people can complain about the same. -- With Regards, Amit Kapila.