Обсуждение: pg_upgrade questions
Hello, I would like to ask a few questions about pg_upgrade when using --link option to avoid copying data files. 1. After running pg_upgrade but before starting new cluster, is it 100% safe to abort and go back to using the old cluster? For example: /usr/PostgreSQL/15/bin/pg_upgrade --link \ --old-datadir "/data/pgdb" \ --new-datadir "/data/pgdb.new" \ --old-bindir "/usr/PostgreSQL/9.6/bin" \ --new-bindir "/usr/PostgreSQL/15/bin" Upgrade succeeds but before I start the new cluster, I change my mind and want to revert to using the old cluster. Can I simply revert to using PostgreSQL 9.6 with these commands: rm -rf /data/pgdb.new mv /data/pgdb/global/pg_control.old /data/pgdb/global/pg_control Am I correct in thinking that apart from pg_control file, "pg_upgrade --link" does not modify the old cluster files in any incompatible ways? 2. Older pg_upgrade binary did not support --jobs argument. When upgrading from such older clusters, running the latest "pg_upgrade --jobs=N" fails at the very start. Is this expected? 3. I'm a bit confused about the correct procedure when it comes to upgrading primary and standby clusters. Looking at https://www.postgresql.org/docs/current/pgupgrade.html it mentions the following: "If you are upgrading standby servers using methods outlined in section Step 11, verify that the old standby servers are caught up by running pg_controldata against the old primary and standby clusters. Verify that the “Latest checkpoint location” values match in all clusters. (There will be a mismatch if old standby servers were shut down before the old primary or if the old standby servers are still running.) Also, make sure wal_level is not set to minimal in the postgresql.conf file on the new primary cluster." Does this mean pg_upgrade will not work on standby if it is out of sync with the primary? The page also talks about using rsync to copy data from primary to standby server. I would like to avoid doing large data copy. Are these upgrade steps likely to work: a) Disable replication temporary. b) Stop standby server and run pg_upgrade. c) Stop primary server and run pg_upgrade. d) Enable replication and start primary and standby and allow them to synchronize their data. Thanks.
On Wed, Nov 9, 2022 at 7:25 AM Sad Clouds <cryintothebluesky@gmail.com> wrote:
Hello, I would like to ask a few questions about pg_upgrade when using
--link option to avoid copying data files.
1. After running pg_upgrade but before starting new cluster, is it 100%
safe to abort and go back to using the old cluster? For example:
/usr/PostgreSQL/15/bin/pg_upgrade --link \
--old-datadir "/data/pgdb" \
--new-datadir "/data/pgdb.new" \
--old-bindir "/usr/PostgreSQL/9.6/bin" \
--new-bindir "/usr/PostgreSQL/15/bin"
Upgrade succeeds but before I start the new cluster, I change my mind
and want to revert to using the old cluster. Can I simply revert to
using PostgreSQL 9.6 with these commands:
rm -rf /data/pgdb.new
mv /data/pgdb/global/pg_control.old /data/pgdb/global/pg_control
Am I correct in thinking that apart from pg_control file, "pg_upgrade
--link" does not modify the old cluster files in any incompatible ways?
2. Older pg_upgrade binary did not support --jobs argument. When
upgrading from such older clusters, running the latest "pg_upgrade
--jobs=N" fails at the very start. Is this expected?
No, it is not expected. Whether parallelism is used or not should never affect the correctness of an upgrade or the state of the resultant cluster.
Does this mean pg_upgrade will not work on standby if it is out of sync
with the primary?
It will not, because once the new cluster launches there is no WAL that can fix any inconsistencies.
The page also talks about using rsync to copy data
from primary to standby server. I would like to avoid doing large data
copy. Are these upgrade steps likely to work:
a) Disable replication temporary.
b) Stop standby server and run pg_upgrade.
c) Stop primary server and run pg_upgrade.
d) Enable replication and start primary and standby and allow them to
synchronize their data.
I cannot tell if you are talking about physical or logical replication...but only the former matters for this purpose. I don't know why you'd want to disable replication, what you need to do is ensure that there is zero activity on the primary and that the standby has caught up applying all WAL that has already been generated (and no WAL is pending). Hence the requirement to check pg_controldata.
Step 11 covers your options. I don't suggest you invent your own.
rsync is the means by which you avoid "large data transfers".
David J.
On Wed, 9 Nov 2022 09:21:57 -0700 "David G. Johnston" <david.g.johnston@gmail.com> wrote: > I cannot tell if you are talking about physical or logical > replication...but only the former matters for this purpose. I don't know > why you'd want to disable replication, what you need to do is ensure that > there is zero activity on the primary and that the standby has caught up > applying all WAL that has already been generated (and no WAL is pending). > Hence the requirement to check pg_controldata. > > Step 11 covers your options. I don't suggest you invent your own. > > rsync is the means by which you avoid "large data transfers". > > David J. Hi David, thanks for your reply. Yes I was referring to physical replication. Step 11 in the pg_upgrade guide implies that the primary server has to be upgraded first, then rsync copies upgraded files and links to the standby server. For various reasons, it would be desirable for me to upgrade the standby server first, while the primary database is fully up and running. I don't have much control over the applications using the primary data base and cannot make them wait unspecified amount of time until no WAL is pending on the standby. The outage during database upgrade needs to be absolutely minimal, hence the use of --link option. I'm not an expert on PostgreSQL replication, so was hoping there was a way to upgrade standby, while primary is modifying data, then later upgrade primary and tell it to synchronize any new data to standby. Maybe there is a way to pause replication and leave both servers in a consistent state? Can any outstanding WAL on primary be automatically replayed to standby? But, if I understand you correctly, then this is not going to work at all. I may need to discard all data on the standby and restart replication from empty.