Обсуждение: Logical replication claims to work, not working - new tables

Поиск
Список
Период
Сортировка

Logical replication claims to work, not working - new tables

От
Chris Angelico
Дата:
PostgreSQL 15 on Debian, both ends of replication.

I'm doing logical replication in a bit of a complex setup. Not sure
how much of this is relevant so I'll give you a lot of detail; sorry
if a lot of this is just noise.

* Bidirectional alternating-master replication. Since I'm still on PG
15, the replication is up in one direction, down in the other, rather
than actually being simultaneous.
* Replication is defined as "for all tables". All relevant tables are
in a single schema, "stillebot".
* Replication was working fine on initial deployment, including a swap
of master/slave.
* One table was created without a primary key, and subsequently
altered to have "replica identity full".
* Replication is not working for this table (stillebot.settings) but
is working for other tables.

I tried restarting Postgres on the slave and monitoring
/var/log/postgresql/postgresql-15-main.log and it all seemed happy.
According to pg_subscription_rel, all tables are in their
"replication" phase:

select srsubstate,srsublsn,relname from pg_subscription_rel join
pg_class on srrelid=oid;
 srsubstate | srsublsn |         relname
------------+----------+--------------------------
 r          |          | user_followed_categories
 r          |          | config
 r          |          | config_exportable
 r          |          | commands
 r          |          | settings
(5 rows)

If I make a change to user_followed_categories, it works fine. If I
make a change to settings, it does not replicate.

The stillebot.config table seems somewhat desynchronized - rows are
missing - but if I INSERT a new row into it, it gets properly
replicated.

How can I dig into this to determine what's going on? Have I broken
things by creating a table without a primary key? (It's a singleton
table, will only ever have one row in it; could add a meaningless PK
if it helps the replication.) Are there other logs to inspect?

Thanks in advance!

ChrisA



Re: Logical replication claims to work, not working - new tables

От
Justin
Дата:


On Sat, Jan 20, 2024, 5:43 PM Chris Angelico <rosuav@gmail.com> wrote:
PostgreSQL 15 on Debian, both ends of replication.

I'm doing logical replication in a bit of a complex setup. Not sure
how much of this is relevant so I'll give you a lot of detail; sorry
if a lot of this is just noise.

* Bidirectional alternating-master replication. Since I'm still on PG
15, the replication is up in one direction, down in the other, rather
than actually being simultaneous.
* Replication is defined as "for all tables". All relevant tables are
in a single schema, "stillebot".
* Replication was working fine on initial deployment, including a swap
of master/slave.
* One table was created without a primary key, and subsequently
altered to have "replica identity full".
* Replication is not working for this table (stillebot.settings) but
is working for other tables.

I tried restarting Postgres on the slave and monitoring
/var/log/postgresql/postgresql-15-main.log and it all seemed happy.
According to pg_subscription_rel, all tables are in their
"replication" phase:

select srsubstate,srsublsn,relname from pg_subscription_rel join
pg_class on srrelid=oid;
 srsubstate | srsublsn |         relname
------------+----------+--------------------------
 r          |          | user_followed_categories
 r          |          | config
 r          |          | config_exportable
 r          |          | commands
 r          |          | settings
(5 rows)

If I make a change to user_followed_categories, it works fine. If I
make a change to settings, it does not replicate.

The stillebot.config table seems somewhat desynchronized - rows are
missing - but if I INSERT a new row into it, it gets properly
replicated.

How can I dig into this to determine what's going on? Have I broken
things by creating a table without a primary key? (It's a singleton
table, will only ever have one row in it; could add a meaningless PK
if it helps the replication.) Are there other logs to inspect?

Thanks in advance!

ChrisA
 
When using replica set to full this kicks off  a  full table scan for each update or delete this is very expensive.  If there are no errors being reported you will find it is working but hung doing full scans.  Inserts are just appended to end of heap.


You can copy the replication slot on the primary to peak at which transactions LR is hung on to confirm.

Adding a primary key will fix this issue.  Note PG 16 can use indexes to find qualifying rows when a table's replica is set to full.

Re: Logical replication claims to work, not working - new tables

От
Chris Angelico
Дата:
On Mon, 22 Jan 2024 at 05:25, Justin <zzzzz.graf@gmail.com> wrote:
>
> When using replica set to full this kicks off  a  full table scan for each update or delete this is very expensive.
Ifthere are no errors being reported you will find it is working but hung doing full scans.  Inserts are just appended
toend of heap.
 
>

That shouldn't be too costly, since this table only has a single row
in it. But it never finishes the replication at all.

>
> You can copy the replication slot on the primary to peak at which transactions LR is hung on to confirm.
>
> Adding a primary key will fix this issue.  Note PG 16 can use indexes to find qualifying rows when a table's replica
isset to full.
 

Ah, sadly PG 16 isn't an option as yet. It would definitely make
dual-master replication somewhat easier.

I'll try dropping the table, creating it again with a PK, and seeing
how it goes. Thanks.

ChrisA



Re: Logical replication claims to work, not working - new tables

От
Chris Angelico
Дата:
On Mon, 22 Jan 2024 at 05:50, Chris Angelico <rosuav@gmail.com> wrote:
>
> On Mon, 22 Jan 2024 at 05:25, Justin <zzzzz.graf@gmail.com> wrote:
> > Adding a primary key will fix this issue.  Note PG 16 can use indexes to find qualifying rows when a table's
replicais set to full.
 
>
> I'll try dropping the table, creating it again with a PK, and seeing
> how it goes. Thanks.
>

Okay, it seems to be working now. Thanks! Hopefully that was the only issue.

I'm a bit surprised by this, the docs do seem to imply that it ought
to work (albeit inefficiently). Maybe I made it worse by initially
creating the table without specifying the replica identity, and had to
alter it in afterwards?

ChrisA