Discussion: Fastest option to transfer db?


Fastest option to transfer db?

From
Israel Brewster
Date:
I have a database that I want to transfer from one VM to another. Both VMs are running on the same (ProxMox) host. select pg_size_pretty(pg_database_size('dbname')); shows the database size to be 336GB. What is going to be the fastest method to transfer this data?

- The database cluster has other databases on it, so taking the cluster offline and copying files is not an option
- The specific database I want to transfer is live, receiving data on an ongoing basis. Some downtime is acceptable, days of downtime not so much.

My first attempt was to run the following command from the destination machine:

pg_dump -C -h source.machine.address dbname | psql -h 127.0.0.1 postgres

This command pegged a single core on both machines, and produced network activity of around 30M (according to the Proxmox monitor). By my calculations that implies the transfer should take around 25 hours - not really good enough, and it just feels sluggish, given that even a plain gigabit connection should be able to move data at near-gigabit speeds - obviously there is a bottleneck somewhere in the system/command.

For my next attempt, I tried the following command from the SOURCE machine:

pg_dump -Fc -Cc -U israel dbname | pg_restore -U israel -h dest.machine.address -d postgres

This resulted in 100% CPU usage on the source machine by pg_dump, around 50% CPU usage on the source by postmaster, and around 30-50% CPU usage on the destination by postgres (I think the postmaster vs postgres process-name difference is due to CentOS vs Ubuntu? Not sure. Probably doesn't matter.) Network throughput dropped to only 14M under this scenario, but of course that is compressed data, so time calculations are hard.
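Since both attempts bottlenecked on a single pg_dump core, one variant worth trying (a sketch only - job counts, user, hosts, and the staging path are placeholders; requires pg_dump/pg_restore 9.3 or newer and a pre-created target database) is a directory-format dump with parallel workers, restored in parallel:

```shell
# Directory format (-Fd) is the only format pg_dump can write with
# multiple parallel jobs (-j); pg_restore can then also restore with
# multiple jobs. Needs local disk to stage the dump.
pg_dump -Fd -j 4 -U israel -f /tmp/dbname.dump dbname
pg_restore -j 4 -U israel -h dest.machine.address -d dbname /tmp/dbname.dump
```

The trade-off versus a straight pipe is the intermediate copy on disk, but each table is dumped and restored by its own worker, so the single-core ceiling goes away.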

During both attempts I saw no disk activity on the destination host (according to proxmox monitoring), nor did the RAM usage increase, so I’m not sure where the data is actually going, which bothers me.

Is there a better way to do this?

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145

Re: Fastest option to transfer db?

From
Michael Lewis
Date:
What version of Postgres is the source? Can you make use of logical replication?

Re: Fastest option to transfer db?

From
Israel Brewster
Date:
Source DB is 11.12, destination is 13.4. I’ll look into logical replication - It sounds like it could be a good option. Thanks!


Re: Fastest option to transfer db?

From
Israel Brewster
Date:
Ok, I have logical replication up and running (I guess - it seemed too simple to be working. Shouldn't it be complicated, requiring many steps and configuration changes?), maxing out one CPU core on each machine (more or less), and showing network throughput of around 15M. If du changes are to be believed, it's transferring data at about 0.8GB/minute, implying something like an 8-hour transfer time.

Of course, since it is replication, it has the benefit that any data that comes in during that 8 hour window should also be replicated, after which the two systems should remain in sync allowing for zero (or nearly so) downtime cutover. Which is nice.

Any gotchas I need to be aware of during this initial transfer window, such as WAL files building up on the source machine?
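For what it's worth, WAL retention can be watched directly on the source (publisher) side - a hedged sketch, run in psql; slot names will vary:

```sql
-- Each logical replication subscription holds a replication slot on the
-- publisher; WAL newer than the slot's restart_lsn cannot be recycled.
-- A steadily growing number here means WAL is piling up on disk.
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
           AS retained_wal
FROM pg_replication_slots;
```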


Re: Fastest option to transfer db?

From
Tomas Pospisek
Date:
I'm potentially facing the same problem and would be interested in the solution. Is there any particular howto you followed?

Also at some point I'd like to cut off the link between the two DBs, promote the copy to be the master, and delete the original DB. Have you figured out the correct steps for the cut-over to happen?
*t

Re: Fastest option to transfer db?

From
Israel Brewster
Date:
On Sep 14, 2021, at 1:42 AM, Tomas Pospisek <tpo2@sourcepole.ch> wrote:

I'm potentially facing the same problem and would be interested in the solution. Is there any particular howto you followed?

I used https://hevodata.com/learn/postgresql-logical-replication/, but I rather suspect there are better options, as I had a number of issues with that one. Essentially, it seems to boil down to the following steps:

1) Set up a new "slave" database. I used a pg_dump -s command piped through to psql to transfer the schema only to the new DB
2) Configure both master and slave postgresql servers with wal_level=logical in the postgresql.conf file. This will require a server restart, which stinks if you need absolutely 0 downtime, but ideally it should only take a few seconds.
3) On the master server, run the SQL command 

CREATE PUBLICATION publication_name FOR ALL TABLES; 

Presumably this should be issued while connected to the database you wish to replicate, and you can specify specific tables only if desired (though I haven't looked into how). There is a note in that how-to about published tables needing a REPLICA IDENTITY to replicate DELETE and UPDATE operations; I still need to look into that more. For my case, however, I have an insert-only workflow, so it wasn't an issue.

4) Finally, on the slave server, run the command 

CREATE SUBSCRIPTION subscription_name CONNECTION '...<connection string> ...' PUBLICATION my_publication;

where connection string is something like:

'host=my.remote.host port=5432 dbname=my_big_db'

...And that’s it, replication should kick off by syncing up the data from master to slave, and after that changes should be sent in real time.
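On the REPLICA IDENTITY note in step 3: tables with a primary key use it as their replica identity by default, so nothing extra is needed there. For a table without one, a sketch (the table name is a placeholder) would be:

```sql
-- FULL logs the entire old row for every UPDATE/DELETE so the
-- subscriber can find the matching row. It works without any key,
-- but it is the most expensive identity option; a unique index via
-- REPLICA IDENTITY USING INDEX is cheaper where one exists.
ALTER TABLE my_table REPLICA IDENTITY FULL;
```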

One gotcha that I ran into: my table used the postgis extension, which was set up on the slave DB when I ran the pg_dump commands. Creating that extension creates *and populates* a table named “spatial_ref_sys”. Since I had created the publication as “FOR ALL TABLES”, the replication tried to include this table. Which failed due to UNIQUE constraint violations. At which point replication stopped AS DID ALL INSERTS INTO THE MASTER DB!

I deleted all records from said table on the slave DB, replication kicked in and re-populated it, and life went on, but I lost something like 14 hours of data thanks to this mistake. Guess that’s what I get for not keeping an eye on the process :-( Lesson learned: make sure any tables to be replicated are empty in the slave before issuing the CREATE SUBSCRIPTION command.
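An alternative that would presumably avoid the spatial_ref_sys problem entirely is to publish only the tables you actually want, instead of FOR ALL TABLES - a sketch, with placeholder table names:

```sql
-- Extension-populated tables like spatial_ref_sys are simply never
-- published, so the pre-populated copy on the slave cannot conflict.
CREATE PUBLICATION my_publication FOR TABLE measurements, stations;
```

The cost is that newly created tables are not picked up automatically and must be added with ALTER PUBLICATION.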

For cutover, I plan the following, we’ll see how it goes:

1) Move all read-only processes over to the new server. That can be done at any point after the above, with little or no downtime depending on your processes.
2) Reconfigure writing process to point to the slave, and stop them
3) Issue a DROP SUBSCRIPTION subscription_name on the slave, thereby “promoting” it to master
4) Restart writing processes, now pointed to the new master

Steps 2-4 should happen in quick succession, resulting in only seconds of downtime. At least, that’s the theory. We’ll see how it goes (or if anyone else here can offer a better procedure!)
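One check worth doing between steps 2 and 3 (a sketch, run on the publisher) is confirming the subscriber has fully caught up before dropping the subscription:

```sql
-- Each subscription shows up as a walsender on the publisher.
-- Near-zero replay lag after the writers are stopped means every
-- committed change has reached the slave and cutover is safe.
SELECT application_name,
       pg_size_pretty(pg_wal_lsn_diff(sent_lsn, replay_lsn)) AS replay_lag
FROM pg_stat_replication;
```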

---
Israel Brewster
Software Engineer
Alaska Volcano Observatory 
Geophysical Institute - UAF 
2156 Koyukuk Drive 
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145



Also at some point I'd like to cut of the link between the two DBs promote the copy to be the master and delete the original DB. Have you figured out the correct step for the cut-over to happen?
*t

On 13.09.21 23:10, Israel Brewster wrote:
Ok, I have logical replication up-and-running (I guess - seemed to simple to be working. Shouldn’t it be complicated, requiring many steps and configuration changes?), maxing out one CPU core on each machine (more or less), and showing network throughput of around 15M. If DU changes are to be believed, it’s transferring data at about 0.8GB/minute, implying something like a 8 hour transfer time.
Of course, since it is replication, it has the benefit that any data that comes in during that 8 hour window should also be replicated, after which the two systems should remain in sync allowing for zero (or nearly so) downtime cutover. Which is nice.
Any gotchas I need to be aware of during this initial transfer window, such as WAL files building up on the source machine?
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell:  907-328-9145
On Sep 13, 2021, at 10:10 AM, Michael Lewis <mlewis@entrata.com <mailto:mlewis@entrata.com>> wrote:

What version of Postgres is the source? Can you make use of logical replication?