Обсуждение: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

Поиск
Список
Период
Сортировка

2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Patrick B
Дата:
Hi guys,

I'll be performing a migration on my production master database server, which is running PostgreSQL 9.2 atm, from SATA disks to SSD disks.
I've got some questions about it, and it would be nice if u guys could share your experiences/thoughts:

SCENARIO:

I currently have one MASTER and two Streaming Replication Slaves servers...

master01
slave01 (Streaming replication + wal_files)
slave02 (Streaming replication + wal_files)

...Postgres is mounted on: /var/lib/pgsql/... The SSD disks will be installed only on my Master server, because my main problem is Writes and not reads.

The new SSD volume will be mounted on /var/lib/pgsql2/


  • The slave02 server will loose the streaming replication connection to the master, once slave01 becomes the new master a new timeline will be settled? Will slave02 be able to connect to the slave01 server for streaming replication?



MIGRATION OPTIONS:

Migration Option 1: I know this option will work

  1. Mount the new volume /var/lib/pgsql2/ on the master01 server
  2. Turn slave01 into a master server
  3. once I can confirm everything is working fine, I can go to step 4
  4. Stop postgres on the master01, start copying the DB using pg_basebackup from slave02 to master01 (Will have to edit postgres to use /var/lib/pgsql2/ instead /var/lib/pgsql - Is that possible? Or I'd have to create a symbolic link?)
  5. Start postgres on master01 server and check if all goes well as streaming replication server (Will test it for days)
  6. Turn master01 into a master server and I'll have to re-copy the DB into slave01 to make it a streaming replication server again

Migration Option 2: I don't know if this is possible - IS THIS POSSIBLE????
  1. Mount the new volume /var/lib/pgsql2/ on the master01 server
  2. Stop postgres on the server (I won't stop postgres on the slave so the users will be able to use the server as read-only)
  3. Copy the data from /var/lib/pgsql/ to /var/lib/pgsql2/
  4. Configure postgres to start using the new volume(/var/lib/pgsql2/)

What do you guys think? Is option possible? if so it would be much easier :)
Thanks!

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Venkata B Nagothi
Дата:

On Fri, Sep 2, 2016 at 12:48 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,

I'll be performing a migration on my production master database server, which is running PostgreSQL 9.2 atm, from SATA disks to SSD disks.
I've got some questions about it, and it would be nice if u guys could share your experiences/thoughts:

SCENARIO:

I currently have one MASTER and two Streaming Replication Slaves servers...

master01
slave01 (Streaming replication + wal_files)
slave02 (Streaming replication + wal_files)

...Postgres is mounted on: /var/lib/pgsql/... The SSD disks will be installed only on my Master server, because my main problem is Writes and not reads.

The new SSD volume will be mounted on /var/lib/pgsql2/


  • The slave02 server will loose the streaming replication connection to the master, once slave01 becomes the new master a new timeline will be settled? Will slave02 be able to connect to the slave01 server for streaming replication?
Yes, slave01 becomes new master with a new timeline id. Cascading replication is supported in 9.2, but, the dependency on WAL archives is a bit heavy. You need to ensure .history file is copied over to slave02. I think, you have WAL archiving enabled, so, should be fine.

MIGRATION OPTIONS:

Migration Option 1: I know this option will work

  1. Mount the new volume /var/lib/pgsql2/ on the master01 server
  2. Turn slave01 into a master server
  3. once I can confirm everything is working fine, I can go to step 4
  4. Stop postgres on the master01, start copying the DB using pg_basebackup from slave02 to master01 (Will have to edit postgres to use /var/lib/pgsql2/ instead /var/lib/pgsql - Is that possible? Or I'd have to create a symbolic link?)
  5. Start postgres on master01 server and check if all goes well as streaming replication server (Will test it for days)
  6. Turn master01 into a master server and I'll have to re-copy the DB into slave01 to make it a streaming replication server again
@ Step 4, you can consider making master01 slave directly by building a new recovery.conf file and copying over slave02's history file by doing which, you can avoid re-build streaming replication from scratch. 
When you "edit postgres", did you mean changing postgresql.conf ? if yes, changing the parameter in postgresql.conf to use the new location should not be a problem.

@ Step 6, Once you turn master01 (new slave) back to master server, you can consider making slave01 (new master) a slave again by copying over the .history files and required WALs. You do not have to build replication from scratch.

Migration Option 2: I don't know if this is possible - IS THIS POSSIBLE????
  1. Mount the new volume /var/lib/pgsql2/ on the master01 server
  2. Stop postgres on the server (I won't stop postgres on the slave so the users will be able to use the server as read-only)
  3. Copy the data from /var/lib/pgsql/ to /var/lib/pgsql2/
  4. Configure postgres to start using the new volume(/var/lib/pgsql2/)
 This looks pretty straight forward. The only issue would be that, users will not be able to do writes. If you are bringing down master and starting up again, it should not a problem, slaves should be able catch up again.

What do you guys think? Is option possible? if so it would be much easier :)

Well, both the options work based on your expectations, Application requirements on downtime, SLAs etc. 

Regards,
Venkata B N

Fujitsu Australia

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Patrick B
Дата:


2016-09-02 15:36 GMT+12:00 Venkata B Nagothi <nag1010@gmail.com>:

On Fri, Sep 2, 2016 at 12:48 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,

I'll be performing a migration on my production master database server, which is running PostgreSQL 9.2 atm, from SATA disks to SSD disks.
I've got some questions about it, and it would be nice if u guys could share your experiences/thoughts:

SCENARIO:

I currently have one MASTER and two Streaming Replication Slaves servers...

master01
slave01 (Streaming replication + wal_files)
slave02 (Streaming replication + wal_files)

...Postgres is mounted on: /var/lib/pgsql/... The SSD disks will be installed only on my Master server, because my main problem is Writes and not reads.

The new SSD volume will be mounted on /var/lib/pgsql2/


  • The slave02 server will loose the streaming replication connection to the master, once slave01 becomes the new master a new timeline will be settled? Will slave02 be able to connect to the slave01 server for streaming replication?
Yes, slave01 becomes new master with a new timeline id. Cascading replication is supported in 9.2, but, the dependency on WAL archives is a bit heavy. You need to ensure .history file is copied over to slave02. I think, you have WAL archiving enabled, so, should be fine.


hmmm.... is the .history files located into pg_xlog? I can't see none.... are they only generated when a new timeline id is created?
If so, I think it will be fine as they're already being shipped to the slaves

 

MIGRATION OPTIONS:

Migration Option 1: I know this option will work

  1. Mount the new volume /var/lib/pgsql2/ on the master01 server
  2. Turn slave01 into a master server
  3. once I can confirm everything is working fine, I can go to step 4
  4. Stop postgres on the master01, start copying the DB using pg_basebackup from slave02 to master01 (Will have to edit postgres to use /var/lib/pgsql2/ instead /var/lib/pgsql - Is that possible? Or I'd have to create a symbolic link?)
  5. Start postgres on master01 server and check if all goes well as streaming replication server (Will test it for days)
  6. Turn master01 into a master server and I'll have to re-copy the DB into slave01 to make it a streaming replication server again
@ Step 4, you can consider making master01 slave directly by building a new recovery.conf file and copying over slave02's history file by doing which, you can avoid re-build streaming replication from scratch. 
When you "edit postgres", did you mean changing postgresql.conf ? if yes, changing the parameter in postgresql.conf to use the new location should not be a problem.

@ Step 6, Once you turn master01 (new slave) back to master server, you can consider making slave01 (new master) a slave again by copying over the .history files and required WALs. You do not have to build replication from scratch.

Migration Option 2: I don't know if this is possible - IS THIS POSSIBLE????
  1. Mount the new volume /var/lib/pgsql2/ on the master01 server
  2. Stop postgres on the server (I won't stop postgres on the slave so the users will be able to use the server as read-only)
  3. Copy the data from /var/lib/pgsql/ to /var/lib/pgsql2/
  4. Configure postgres to start using the new volume(/var/lib/pgsql2/)
 This looks pretty straight forward. The only issue would be that, users will not be able to do writes. If you are bringing down master and starting up again, it should not a problem, slaves should be able catch up again.

What do you guys think? Is option possible? if so it would be much easier :)

Well, both the options work based on your expectations, Application requirements on downtime, SLAs etc. 


So is that really possible? Just copy the data between folders? if so, i'll probably chose option 2!!!
Even that is 2.5TB I don't think the copy will take longer than 20 minutes... and I'd still be able to perform reads...

I'll do some test to see if option 2 can be done :)

Thanks !!! :D

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
David Gibbons
Дата:

So is that really possible? Just copy the data between folders? if so, i'll probably chose option 2!!!
Even that is 2.5TB I don't think the copy will take longer than 20 minutes... and I'd still be able to perform reads...

I'll do some test to see if option 2 can be done :)

Thanks !!! :D


 
You can actually reduce the time more by pre-syncing to the new location. 
something like:

rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
service postgres stop
rsync -va /var/lib/pgsql/ /var/lib/pgsql2/

The second rsync will only copy the deltas from the first, it still has to go in and determine what needs to be copied/what changed but the bulk of it can be prepared/migrated before the actual downtime window.


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
David Gibbons
Дата:

You can reduce the time much further by pre copying the files. Then during the maintenance window only copy the deltas basically.


On Sep 1, 2016 9:43 PM, "Patrick B" <patrickbakerbr@gmail.com> wrote:


2016-09-02 15:36 GMT+12:00 Venkata B Nagothi <nag1010@gmail.com>:

On Fri, Sep 2, 2016 at 12:48 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,

I'll be performing a migration on my production master database server, which is running PostgreSQL 9.2 atm, from SATA disks to SSD disks.
I've got some questions about it, and it would be nice if u guys could share your experiences/thoughts:

SCENARIO:

I currently have one MASTER and two Streaming Replication Slaves servers...

master01
slave01 (Streaming replication + wal_files)
slave02 (Streaming replication + wal_files)

...Postgres is mounted on: /var/lib/pgsql/... The SSD disks will be installed only on my Master server, because my main problem is Writes and not reads.

The new SSD volume will be mounted on /var/lib/pgsql2/


  • The slave02 server will loose the streaming replication connection to the master, once slave01 becomes the new master a new timeline will be settled? Will slave02 be able to connect to the slave01 server for streaming replication?
Yes, slave01 becomes new master with a new timeline id. Cascading replication is supported in 9.2, but, the dependency on WAL archives is a bit heavy. You need to ensure .history file is copied over to slave02. I think, you have WAL archiving enabled, so, should be fine.


hmmm.... is the .history files located into pg_xlog? I can't see none.... are they only generated when a new timeline id is created?
If so, I think it will be fine as they're already being shipped to the slaves

 

MIGRATION OPTIONS:

Migration Option 1: I know this option will work

  1. Mount the new volume /var/lib/pgsql2/ on the master01 server
  2. Turn slave01 into a master server
  3. once I can confirm everything is working fine, I can go to step 4
  4. Stop postgres on the master01, start copying the DB using pg_basebackup from slave02 to master01 (Will have to edit postgres to use /var/lib/pgsql2/ instead /var/lib/pgsql - Is that possible? Or I'd have to create a symbolic link?)
  5. Start postgres on master01 server and check if all goes well as streaming replication server (Will test it for days)
  6. Turn master01 into a master server and I'll have to re-copy the DB into slave01 to make it a streaming replication server again
@ Step 4, you can consider making master01 slave directly by building a new recovery.conf file and copying over slave02's history file by doing which, you can avoid re-build streaming replication from scratch. 
When you "edit postgres", did you mean changing postgresql.conf ? if yes, changing the parameter in postgresql.conf to use the new location should not be a problem.

@ Step 6, Once you turn master01 (new slave) back to master server, you can consider making slave01 (new master) a slave again by copying over the .history files and required WALs. You do not have to build replication from scratch.

Migration Option 2: I don't know if this is possible - IS THIS POSSIBLE????
  1. Mount the new volume /var/lib/pgsql2/ on the master01 server
  2. Stop postgres on the server (I won't stop postgres on the slave so the users will be able to use the server as read-only)
  3. Copy the data from /var/lib/pgsql/ to /var/lib/pgsql2/
  4. Configure postgres to start using the new volume(/var/lib/pgsql2/)
 This looks pretty straight forward. The only issue would be that, users will not be able to do writes. If you are bringing down master and starting up again, it should not a problem, slaves should be able catch up again.

What do you guys think? Is option possible? if so it would be much easier :)

Well, both the options work based on your expectations, Application requirements on downtime, SLAs etc. 


So is that really possible? Just copy the data between folders? if so, i'll probably chose option 2!!!
Even that is 2.5TB I don't think the copy will take longer than 20 minutes... and I'd still be able to perform reads...

I'll do some test to see if option 2 can be done :)

Thanks !!! :D

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Steve Crawford
Дата:
...
 
You can actually reduce the time more by pre-syncing to the new location. 
something like:

rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
service postgres stop
rsync -va /var/lib/pgsql/ /var/lib/pgsql2/

The second rsync will only copy the deltas from the first, it still has to go in and determine what needs to be copied/what changed but the bulk of it can be prepared/migrated before the actual downtime window.



The benefit of an initial and final rsync will depend on how many files change. Rsync's default when copying between local paths is to use the --whole-file option so at least it won't busy itself reading and comparing the source and destination files which is worse than simply copying the entire thing but you will only save the time associated with those files that have unchanged modification time and size between the first and second rsync. If the initial rsync takes, say, a half hour it is potentially beneficial to run a second or even additional preliminary rsync runs as each additional run should be faster due to less time for files to change during the rsync. You will have to test for your specific case.

If you *really* want to do a fast switch and your configuration disk configuration supports it you could possibly play games with using single-machine DBRD or LVM RAID to live-sync the old and new directories. Of course it's equally possible that the setup involved to do this will involve more initial downtime than just copying the files.

Another possibility is to set up an additional slave instance of PostgreSQL on your master machine then cut over to that instance. I haven't though through the issue of bringing up your actual slave servers after the cutover. I suspect in the worse case you would have your current master instance, your replica instance running on the master server and using the new SSD then migrate the slave servers to cascade off the master server's replica instance. When all is synched up, promote the master server replica instance to a master and kill off the original master instance.

As always in these instance, testing and practice is mandatory.

Cheers,
Steve



Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Scott Marlowe
Дата:
On Thu, Sep 1, 2016 at 8:48 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
> Hi guys,
>
> I'll be performing a migration on my production master database server,
> which is running PostgreSQL 9.2 atm, from SATA disks to SSD disks.
> I've got some questions about it, and it would be nice if u guys could share
> your experiences/thoughts:
>
> SCENARIO:
>
> I currently have one MASTER and two Streaming Replication Slaves servers...
>
>> master01
>> slave01 (Streaming replication + wal_files)
>> slave02 (Streaming replication + wal_files)
>
>
> ...Postgres is mounted on: /var/lib/pgsql/... The SSD disks will be
> installed only on my Master server, because my main problem is Writes and
> not reads.
>
> The new SSD volume will be mounted on /var/lib/pgsql2/
>
>
> The slave02 server will loose the streaming replication connection to the
> master, once slave01 becomes the new master a new timeline will be settled?
> Will slave02 be able to connect to the slave01 server for streaming
> replication?
>
>
>
>
> MIGRATION OPTIONS:
>
> Migration Option 1: I know this option will work
>
> Mount the new volume /var/lib/pgsql2/ on the master01 server
> Turn slave01 into a master server
> once I can confirm everything is working fine, I can go to step 4
> Stop postgres on the master01, start copying the DB using pg_basebackup from
> slave02 to master01 (Will have to edit postgres to use /var/lib/pgsql2/
> instead /var/lib/pgsql - Is that possible? Or I'd have to create a symbolic
> link?)
> Start postgres on master01 server and check if all goes well as streaming
> replication server (Will test it for days)
> Turn master01 into a master server and I'll have to re-copy the DB into
> slave01 to make it a streaming replication server again
>
>
> Migration Option 2: I don't know if this is possible - IS THIS POSSIBLE????
>
> Mount the new volume /var/lib/pgsql2/ on the master01 server
> Stop postgres on the server (I won't stop postgres on the slave so the users
> will be able to use the server as read-only)
> Copy the data from /var/lib/pgsql/ to /var/lib/pgsql2/
> Configure postgres to start using the new volume(/var/lib/pgsql2/)
>
>
> What do you guys think? Is option possible? if so it would be much easier :)
> Thanks!

Why not just subscribe to another cluster on the master, then sub the
slaves to that, then switchover to the new cluster on the master?

--
To understand recursion, one must first understand recursion.


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Venkata B Nagothi
Дата:

On Fri, Sep 2, 2016 at 2:40 PM, Patrick B <patrickbakerbr@gmail.com> wrote:


2016-09-02 15:36 GMT+12:00 Venkata B Nagothi <nag1010@gmail.com>:

On Fri, Sep 2, 2016 at 12:48 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
Hi guys,

I'll be performing a migration on my production master database server, which is running PostgreSQL 9.2 atm, from SATA disks to SSD disks.
I've got some questions about it, and it would be nice if u guys could share your experiences/thoughts:

SCENARIO:

I currently have one MASTER and two Streaming Replication Slaves servers...

master01
slave01 (Streaming replication + wal_files)
slave02 (Streaming replication + wal_files)

...Postgres is mounted on: /var/lib/pgsql/... The SSD disks will be installed only on my Master server, because my main problem is Writes and not reads.

The new SSD volume will be mounted on /var/lib/pgsql2/


  • The slave02 server will loose the streaming replication connection to the master, once slave01 becomes the new master a new timeline will be settled? Will slave02 be able to connect to the slave01 server for streaming replication?
Yes, slave01 becomes new master with a new timeline id. Cascading replication is supported in 9.2, but, the dependency on WAL archives is a bit heavy. You need to ensure .history file is copied over to slave02. I think, you have WAL archiving enabled, so, should be fine.


hmmm.... is the .history files located into pg_xlog? I can't see none.... are they only generated when a new timeline id is created?
If so, I think it will be fine as they're already being shipped to the slaves

No. The issue in 9.2 ( or rather until version 9.2) is, the WAL generated after the master is shutdown (which is supposed to have the last known status of master) is not automatically transferred to slave which is very important when you are swapping over master-slave roles. You can only do that manually and there is no way postgres does it automatically and same would be the issue when attempt to make master a new slave (without building slave from scratch). When you promote slave .history file gets generated with a new timeline id, which you need to make master a new slave. All of this is fixed in 9.3.

I wanted to stress on this to point out that master-slave roles can be reversed without actually needing to build replication from scratch and is trickier in the versions 9.2 and earlier.

Anyways, this may not be that important now as you confirmed that you are going for option 2

Regards,
Venkata B N

Fujitsu Australia

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Patrick B
Дата:
Hi guys,


You can actually reduce the time more by pre-syncing to the new location. 
something like:
rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
service postgres stop
rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
The second rsync will only copy the deltas from the first, it still has to go in and determine what needs to be copied/what changed but the bulk of it can be prepared/migrated before the actual downtime window.


Thanks a lot @David.

I tested the above and it worked... Here is what I did:

1 - Create /var/lib/pgsql2 directory
mkdir /var/lib/pgsql2

2 - Set permissions:
chown -R postgres:postgres /var/lib/pgsql2

3 - Change postgresql.conf:
data_directory = '/var/lib/pgsql2/9.2/data/'

4 - RSYNC:
rsync -va /var/lib/pgsql/ /var/lib/pgsql2/

5 - Stop postgres
/etc/init.d/postgresql-9.2 stop

5 - Re-run RSYNC to incremental copy:
rsync -va /var/lib/pgsql/ /var/lib/pgsql2/

6 - Change /etc/init.d/postgresql-9.2:

OLD:
PGDATA=/var/lib/pgsql/9.2/data
PGLOG=/var/lib/pgsql/9.2/pgstartup.log
PGUPLOG=/var/lib/pgsql/$PGMAJORVERSION/pgupgrade.log

NEW:
PGDATA=/var/lib/pgsql2/9.2/data
PGLOG=/var/lib/pgsql2/9.2/pgstartup.log
PGUPLOG=/var/lib/pgsql2/$PGMAJORVERSION/pgupgrade.log

7 - Start postgres
/etc/init.d/postgresql-9.2 start


and it worked perfectly fine... :)

That's great news! My only concern is about the "RSYNC" - Hope that doesn't take long!!!

This all steps must be performed by me on the next few days/weeks - I'll keep you guys updated... Keen to see the new DB running in a SSD environment :)

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Vick Khera
Дата:
On Sun, Sep 4, 2016 at 4:37 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
> That's great news! My only concern is about the "RSYNC" - Hope that doesn't
> take long!!!
>
> This all steps must be performed by me on the next few days/weeks - I'll
> keep you guys updated... Keen to see the new DB running in a SSD environment
> :)

That will depend on your database change velocity. If the initial copy
takes a long time, Re-run the rsync again just before the shut down.
Try to pick the lowest load time.

For sure rsync can take a long time. I've also done similar moves
using ZFS snapshots, which are crazy fast to sync... but that assumes
you're using ZFS already on the current data directory.


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Jim Nasby
Дата:
On 9/2/16 11:44 AM, David Gibbons wrote:
> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
> service postgres stop
> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
>
> The second rsync will only copy the deltas from the first, it still has
> to go in and determine what needs to be copied/what changed but the bulk
> of it can be prepared/migrated before the actual downtime window.

That is NOT safe. The problem is it allows rsync to use mtime alone to
decide that a file is in sync, and that will fail if Postgres writes to
a file in the same second that the first rsync reads from it (assuming
Postgres writes after rsync reads). You need to add the --checksum flag
to rsync (which means it will still have to read everything that's in
/var/lib/pgsql).
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Jim Nasby
Дата:
Please include the mailing list in replies...

On 9/7/16 6:10 PM, David Gibbons wrote:
>     That is NOT safe. The problem is it allows rsync to use mtime alone
>     to decide that a file is in sync, and that will fail if Postgres
>     writes to a file in the same second that the first rsync reads from
>     it (assuming Postgres writes after rsync reads). You need to add the
>     --checksum flag to rsync (which means it will still have to read
>     everything that's in /var/lib/pgsql).
>
>
> The checksum flag as you mention is not performant,

Definitely not. :/

> If this is a concern, you're much better using the *--modify-window *flag:
> When comparing two timestamps, rsync treats the timestamps as being
> equal if they differ by no more than the modify-window value. This is
> normally 0 (for an exact match), but you may find it useful to set this
> to a larger value in some situations.
>
> Hence, rsync -va --modify-window=1 would remove your concern about a
> same second race condition without forcing the sync to read through all
> the files.

Very interesting and useful!
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Patrick B
Дата:


2016-09-08 11:49 GMT+12:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
Please include the mailing list in replies...

On 9/7/16 6:10 PM, David Gibbons wrote:
    That is NOT safe. The problem is it allows rsync to use mtime alone
    to decide that a file is in sync, and that will fail if Postgres
    writes to a file in the same second that the first rsync reads from
    it (assuming Postgres writes after rsync reads). You need to add the
    --checksum flag to rsync (which means it will still have to read
    everything that's in /var/lib/pgsql).


The checksum flag as you mention is not performant,

Definitely not. :/

If this is a concern, you're much better using the *--modify-window *flag:
When comparing two timestamps, rsync treats the timestamps as being
equal if they differ by no more than the modify-window value. This is
normally 0 (for an exact match), but you may find it useful to set this
to a larger value in some situations.

Hence, rsync -va --modify-window=1 would remove your concern about a
same second race condition without forcing the sync to read through all
the files.

Very interesting and useful!

Cool! I'll use the rsync -va --modify-window=1 instead.

Thanks!
Patrick 

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Scott Marlowe
Дата:
On Wed, Sep 7, 2016 at 5:00 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> On 9/2/16 11:44 AM, David Gibbons wrote:
>>
>> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
>> service postgres stop
>> rsync -va /var/lib/pgsql/ /var/lib/pgsql2/
>>
>> The second rsync will only copy the deltas from the first, it still has
>> to go in and determine what needs to be copied/what changed but the bulk
>> of it can be prepared/migrated before the actual downtime window.
>
>
> That is NOT safe. The problem is it allows rsync to use mtime alone to
> decide that a file is in sync, and that will fail if Postgres writes to a
> file in the same second that the first rsync reads from it (assuming
> Postgres writes after rsync reads). You need to add the --checksum flag to
> rsync (which means it will still have to read everything that's in
> /var/lib/pgsql).
> --

I'm still wondering why my advice to just subscribe a new cluster on
the master machine was just ignored by OP. Postgresql already has a
pretty reliable method for doing what the OP wants using
pg_basebackup. Using rsync etc is like reinventing the wheel imho.

--
To understand recursion, one must first understand recursion.


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Jeff Janes
Дата:
On Wed, Sep 7, 2016 at 4:49 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 9/7/16 6:10 PM, David Gibbons wrote:
    That is NOT safe. The problem is it allows rsync to use mtime alone
    to decide that a file is in sync, and that will fail if Postgres
    writes to a file in the same second that the first rsync reads from
    it (assuming Postgres writes after rsync reads). You need to add the
    --checksum flag to rsync (which means it will still have to read
    everything that's in /var/lib/pgsql).


The checksum flag as you mention is not performant,

Definitely not. :/

If this is a concern, you're much better using the *--modify-window *flag:
When comparing two timestamps, rsync treats the timestamps as being
equal if they differ by no more than the modify-window value. This is
normally 0 (for an exact match), but you may find it useful to set this
to a larger value in some situations.

Hence, rsync -va --modify-window=1 would remove your concern about a
same second race condition without forcing the sync to read through all
the files.

Very interesting and useful!

Isn't this heading in the wrong direction?   We need to be more precise than 0 (since 0 is computed off of rounded/truncated time stamps), not less precise than 0.

Cheers,

Jeff

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
David Gibbons
Дата:

Isn't this heading in the wrong direction?   We need to be more precise than 0 (since 0 is computed off of rounded/truncated time stamps), not less precise than 0.

Cheers,

Jeff


Hmm, You may be right, reading it 4 more times for comprehension it looks like it should be set to -1 not 1. 

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Jim Nasby
Дата:
On 9/8/16 3:29 PM, David Gibbons wrote:
>
>     Isn't this heading in the wrong direction?   We need to be more
>     precise than 0 (since 0 is computed off of rounded/truncated time
>     stamps), not less precise than 0.
>
>     Cheers,
>
>     Jeff
>
>
>
> Hmm, You may be right, reading it 4 more times for comprehension it
> looks like it should be set to -1 not 1.

Not according to my man page:

        --modify-window
               When comparing two timestamps, rsync treats the
timestamps as being equal if they differ by no more than the
modify-window value.  This is normally 0 (for an exact match), but you
               may find it useful to set this to a larger value in some
situations.  In particular, when transferring to or from an MS Windows
FAT  filesystem  (which  represents  times  with  a
               2-second resolution), --modify-window=1 is useful
(allowing times to differ by up to 1 second).

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Patrick B
Дата:


2016-09-11 14:09 GMT+12:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
On 9/8/16 3:29 PM, David Gibbons wrote:

    Isn't this heading in the wrong direction?   We need to be more
    precise than 0 (since 0 is computed off of rounded/truncated time
    stamps), not less precise than 0.

    Cheers,

    Jeff



Hmm, You may be right, reading it 4 more times for comprehension it
looks like it should be set to -1 not 1.

Not according to my man page:

       --modify-window
              When comparing two timestamps, rsync treats the timestamps as being equal if they differ by no more than the modify-window value.  This is normally 0 (for an exact match), but you
              may find it useful to set this to a larger value in some situations.  In particular, when transferring to or from an MS Windows FAT  filesystem  (which  represents  times  with  a
              2-second resolution), --modify-window=1 is useful (allowing times to differ by up to 1 second).


--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




So... what do u guys recommend? which options should I use?

Patrick

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Scott Marlowe
Дата:
On Sun, Sep 11, 2016 at 3:26 AM, Patrick B <patrickbakerbr@gmail.com> wrote:
>
>
> 2016-09-11 14:09 GMT+12:00 Jim Nasby <Jim.Nasby@bluetreble.com>:
>>
>> On 9/8/16 3:29 PM, David Gibbons wrote:
>>>
>>>
>>>     Isn't this heading in the wrong direction?   We need to be more
>>>     precise than 0 (since 0 is computed off of rounded/truncated time
>>>     stamps), not less precise than 0.
>>>
>>>     Cheers,
>>>
>>>     Jeff
>>>
>>>
>>>
>>> Hmm, You may be right, reading it 4 more times for comprehension it
>>> looks like it should be set to -1 not 1.
>>
>>
>> Not according to my man page:
>>
>>        --modify-window
>>               When comparing two timestamps, rsync treats the timestamps
>> as being equal if they differ by no more than the modify-window value.  This
>> is normally 0 (for an exact match), but you
>>               may find it useful to set this to a larger value in some
>> situations.  In particular, when transferring to or from an MS Windows FAT
>> filesystem  (which  represents  times  with  a
>>               2-second resolution), --modify-window=1 is useful (allowing
>> times to differ by up to 1 second).
>>
>>
>> --
>> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
>> Experts in Analytics, Data Architecture and PostgreSQL
>> Data in Trouble? Get it in Treble! http://BlueTreble.com
>> 855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
>>
>>
>> --
>> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-general
>
>
>
>
>
> So... what do u guys recommend? which options should I use?
>
> Patrick

Why not subscribe a new cluster on the same box with pg_basebackup?


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Jeff Janes
Дата:

On Sep 12, 2016 1:12 AM, "Scott Marlowe" <scott.marlowe@gmail.com> wrote:
>
>
>
> Why not subscribe a new cluster on the same box with pg_basebackup?

+1.

Maybe he is afraid of (or doesn't know how to) configuring things to run on a non standard port, for testing?

Cheers,

Jeff

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Patrick B
Дата:


2016-09-13 0:06 GMT+12:00 Jeff Janes <jeff.janes@gmail.com>:

On Sep 12, 2016 1:12 AM, "Scott Marlowe" <scott.marlowe@gmail.com> wrote:
>
>
>
> Why not subscribe a new cluster on the same box with pg_basebackup?

+1.

Maybe he is afraid of (or doesn't know how to) configuring things to run on a non standard port, for testing?


Yes... I do know how to do it... This thread was going in a different way and nobody included me has talked about it.. 

and I wouldn't like to "waste" time in that, that's why I've asked if you could help with rsync

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Jeff Janes
Дата:
On Sat, Sep 10, 2016 at 7:09 PM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 9/8/16 3:29 PM, David Gibbons wrote:

    Isn't this heading in the wrong direction?   We need to be more
    precise than 0 (since 0 is computed off of rounded/truncated time
    stamps), not less precise than 0.

    Cheers,

    Jeff



Hmm, You may be right, reading it 4 more times for comprehension it
looks like it should be set to -1 not 1.

Not according to my man page:

       --modify-window
              When comparing two timestamps, rsync treats the timestamps as being equal if they differ by no more than the modify-window value.  This is normally 0 (for an exact match), but you
              may find it useful to set this to a larger value in some situations.  In particular, when transferring to or from an MS Windows FAT  filesystem  (which  represents  times  with  a
              2-second resolution), --modify-window=1 is useful (allowing times to differ by up to 1 second).


Sorry, I can't tell what you are disputing here.

The man page you quote seems clear to me that setting it to 1, rather than leaving it at 0, makes the opportunity for corruption wider, not narrower.

I thought that David's "-1" suggestions was tongue in cheek.  But it turns out that that actually does work.  Of course, it works by forcing every file to be copied, which removes the point of using this over pg_basebackup, but nonetheless it would preserve the integrity of the data.

Cheers,

Jeff

Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Jim Nasby
Дата:
On 9/12/16 3:24 PM, Jeff Janes wrote:
> The man page you quote seems clear to me that setting it to 1, rather
> than leaving it at 0, makes the opportunity for corruption wider, not
> narrower.

Yeah, I actually read it backwards. :/ I don't see how --modify-window
is helpful at all here; you need to use --ignore-times.

> I thought that David's "-1" suggestions was tongue in cheek.  But it
> turns out that that actually does work.  Of course, it works by forcing
> every file to be copied, which removes the point of using this over
> pg_basebackup, but nonetheless it would preserve the integrity of the data.

AFAIK pg_basebackup blindly copies all data files, while rsync will
transfer only the parts of the files that have actually changed (see
--block-size). If the source and destination are on different servers,
that can mean less data transferred over the network.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


Re: 2.5TB Migration from SATA to SSD disks - PostgreSQL 9.2

От
Leonardo M. Ramé
Дата:
El 02/09/16 a las 18:28, Scott Marlowe escribió:
> On Thu, Sep 1, 2016 at 8:48 PM, Patrick B <patrickbakerbr@gmail.com> wrote:
> Why not just subscribe to another cluster on the master, then sub the
> slaves to that, then switchover to the new cluster on the master?
>

Maybe he  doesn't know how to do that, would you care to explain?

Regards,

--
Leonardo M. Ramé
Medical IT - Griensu S.A.
Av. Colón 636 - Piso 8 Of. A
X5000EPT -- Córdoba
Tel.: +54(351)4246924 +54(351)4247788 +54(351)4247979 int. 19
Cel.: +54 9 (011) 40871877