Обсуждение: Logical replication and pg_dump for out of band synchronization

Поиск
Список
Период
Сортировка

Logical replication and pg_dump for out of band synchronization

От
Joseph Hammerman
Дата:
Hi pgsql-admins,

We have been trying to use Logical Replication to synchronize some oversized tables (1Tb+).
We are replicating from 9.6 -> 11.x. The long sync times for the initial snapshots of these large tables  have been causing incidents however, since autovacuum cannot clean up anything older than the xmin horizon.

We are considering initiating logical replication on the Producer with no Subscriber in order to generate a point in time from which logical changes for the table in question will begin to accumulate in pg_xlog on the primary.

We then intend to take a dump, restore the table, and play back the changes by enabling the Subscription. This way the bulk data transfer is out of band from the production application. Our testing shows that this works cleanly, and that new changes replicate correctly to the target relations. Additionally, pg_dump has a —snapshot flag which it appears was added to support this sort of workflow.

I would like to partition these relations in order to reduce the size of the individual objects to something more tractable, but that Data Migration is not feasible in the timeframe in which we have to be able to deliver this workflow split out.

Does anyone know if there is a dire reason from a Consistency standpoint not to use this approach?

Thanks in advance for any assistance anyone can provide!
Joe Hammerman

Re: Logical replication and pg_dump for out of band synchronization

От
Peter Eisentraut
Дата:
On 04.07.22 22:40, Joseph Hammerman wrote:
> We have been trying to use Logical Replication to synchronize some 
> oversized tables (1Tb+).
> We are replicating from 9.6 -> 11.x. The long sync times for the initial 
> snapshots of these large tables  have been causing incidents however, 
> since autovacuum cannot clean up anything older than the xmin horizon.

What replication system are you using with PG 9.6?  If you are using 
pglogical, then it contains a program pglogical_create_subscriber that 
addresses this.

> We then intend to take a dump, restore the table, and play back 
> the changes by enabling the Subscription. This way the bulk data 
> transfer is out of band from the production application. Our testing 
> shows that this works cleanly, and that new changes replicate correctly 
> to the target relations. Additionally, pg_dump has a —snapshot flag 
> which it appears was added to support this sort of workflow.

Yes, this would also be a valid solution.